Update spec.md

2026-02-01 16:11:47 -05:00
parent e8ab5e452e
commit cebd7dea57

42
spec.md

@@ -4,25 +4,54 @@
and the content of Wikimedia pages. This library must be built on top of the 2 existing Rust libraries
mediawiki_rest_api and parse_wiki_text to implement those functionalities
**User Story 1**: As a user, I want to get the raw text content of a wikimedia article by specifying the site, language and article name.
**User Story 1**: As a user, I want to get the raw text content of a Wikimedia article by specifying the site, language and article name.
**User Story 2**: As a user, I want to get the list of all the links of an article
**User Story 3**: As a user, I want to get the list of all the categories of an article
**User Story 4**: As a user, I want to get the list of all the references of an article
**User Story 5**: As a user, I want to get the list of all the templates of an article
**User Story 6**: As a user, I want to get the list of all the parameters of a template
**User Story 7**: As a user, I want to get the list of all the sections of an article
**Functional Requirements**:
1. Use mediawiki_rest_api to get access to the wikimedia article
1. Use mediawiki_rest_api to get access to the Wikimedia article
2. Use mediawiki_rest_api to get the initial content of the article
3. Use parse_wiki_text to parse the wikimedia article
4. Use parse_wiki_text to interact with any internal content of the article
5. Use parse_wiki_text nodes
6. Always write unit tests for every functions
3. Use parse_wiki_text to parse the Wikimedia article
4. Use parse_wiki_text to interact with any internal content of the article
5. Use parse_wiki_text nodes as the main way to manage data access
6. The central data structure is a WikiPage with its associated trait of the same name. This structure has the following fields:
- language -> Language of the page
- title -> Title of the page
- content -> raw text content of the article, output of the MediaWikiPage get function
- parsed -> node based structure, output of the Configuration parse function
7. Organize the functions to be implemented as a trait
8. Only language and article name are known at initialization
9. All other functions including fetching and parsing the page are performed after initial creation of the structure
10. Some specific low-level functions must be created to help build higher-level functionalities. These functions are:
- get_links
- get_categories
- get_references
- get_templates
- get_template_parameters
- get_sections
11. When interacting with a page the steps are:
- Create a new WikiPage with the language and title
- Load the content and store the results in the content field
- Parse the content and store the results in the parsed field
12. After those initialization steps, all functions should interact with the fields directly and must not reload the values.
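
The lifecycle in requirements 6 to 12 could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the library's implementation: `WikiPageOps`, `ParsedPage` and `WikiError` are hypothetical names, the `fetch` closure stands in for the mediawiki_rest_api call, and the line-based parser is a placeholder for the parse_wiki_text parse step. The trait is named `WikiPageOps` here because a struct and a trait with the same name cannot be defined in the same Rust module.

```rust
/// Hypothetical stand-in for the node-based output of the parse step.
#[derive(Debug, Clone, PartialEq)]
pub struct ParsedPage {
    pub nodes: Vec<String>,
}

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
pub enum WikiError {
    /// `parse` was called before `load`.
    NotLoaded,
    /// The fetch step failed; carries the underlying message.
    Fetch(String),
}

/// Central structure: only language and title are known at creation.
pub struct WikiPage {
    pub language: String,
    pub title: String,
    pub content: Option<String>,
    pub parsed: Option<ParsedPage>,
}

/// Hypothetical trait grouping the page operations.
pub trait WikiPageOps: Sized {
    fn new(language: &str, title: &str) -> Self;
    /// `fetch` is an assumption: it replaces the real REST call so the
    /// sketch stays self-contained.
    fn load<F>(&mut self, fetch: F) -> Result<(), WikiError>
    where
        F: FnOnce(&str, &str) -> Result<String, String>;
    fn parse(&mut self) -> Result<(), WikiError>;
}

impl WikiPageOps for WikiPage {
    fn new(language: &str, title: &str) -> Self {
        WikiPage {
            language: language.to_string(),
            title: title.to_string(),
            content: None,
            parsed: None,
        }
    }

    fn load<F>(&mut self, fetch: F) -> Result<(), WikiError>
    where
        F: FnOnce(&str, &str) -> Result<String, String>,
    {
        // Store the raw text once; later calls read the field instead of refetching.
        let text = fetch(&self.language, &self.title).map_err(WikiError::Fetch)?;
        self.content = Some(text);
        Ok(())
    }

    fn parse(&mut self) -> Result<(), WikiError> {
        let content = self.content.as_ref().ok_or(WikiError::NotLoaded)?;
        // Placeholder parser: one "node" per non-empty line.
        let nodes = content
            .lines()
            .filter(|line| !line.trim().is_empty())
            .map(str::to_string)
            .collect();
        self.parsed = Some(ParsedPage { nodes });
        Ok(())
    }
}
```

Keeping `content` and `parsed` as `Option` fields makes the "create, load, parse" order explicit: calling `parse` before `load` returns an error instead of silently refetching.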
**Critical Rules**:
**Testing Requirements**:
1. Always write unit tests for every function
2. Write all the tests in the dedicated /tests folder. Do not include any tests in the main code folders or files
3.
**Technical Constraints**:
- Rust language
- mediawiki_rest_api
- parse_wiki_text
- use the GEMINI.md file for specific code rules
-
**Test Cases**:
- **Scenario A**: Valid access -> Status 200 + content of the article.
@@ -31,4 +60,5 @@ parse_wiki_text and parse_wiki_text to implement those functionalities
- **Scenario D**: Valid access -> Status 200 + references of the article.
- **Scenario E**: Valid access -> Status 200 + templates of the article.
- **Scenario F**: Valid access -> Status 200 + parameters of a template.
- **Scenario G**: Valid access -> Status 200 + sections of the article.
- **Scenario Z**: Incorrect article -> Status 401 + error message.
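
The low-level helpers named in the functional requirements and exercised by the scenarios above could be sketched as simple filters over the parsed nodes. This is only an illustration: the `Node` enum below is a simplified, hypothetical stand-in for the parser's node type, its variant and field names are assumptions, and it is kept flat, whereas a real node tree nests and would need a recursive walk. The function signatures are likewise assumptions.

```rust
/// Simplified, hypothetical node type for this sketch.
#[derive(Debug, Clone, PartialEq)]
pub enum Node {
    Text(String),
    Link { target: String },
    Category { target: String },
    Heading { level: u8, title: String },
    Template { name: String, parameters: Vec<(String, String)> },
}

/// Targets of all link nodes, in document order.
pub fn get_links(nodes: &[Node]) -> Vec<&str> {
    nodes
        .iter()
        .filter_map(|node| match node {
            Node::Link { target } => Some(target.as_str()),
            _ => None,
        })
        .collect()
}

/// Targets of all category nodes, in document order.
pub fn get_categories(nodes: &[Node]) -> Vec<&str> {
    nodes
        .iter()
        .filter_map(|node| match node {
            Node::Category { target } => Some(target.as_str()),
            _ => None,
        })
        .collect()
}

/// Titles of all headings, used here as the list of sections.
pub fn get_sections(nodes: &[Node]) -> Vec<&str> {
    nodes
        .iter()
        .filter_map(|node| match node {
            Node::Heading { title, .. } => Some(title.as_str()),
            _ => None,
        })
        .collect()
}

/// Name/value pairs of every template whose name equals `template`.
pub fn get_template_parameters<'a>(
    nodes: &'a [Node],
    template: &str,
) -> Vec<(&'a str, &'a str)> {
    nodes
        .iter()
        .filter_map(|node| match node {
            Node::Template { name, parameters } if name.as_str() == template => {
                Some(parameters)
            }
            _ => None,
        })
        .flat_map(|params| params.iter().map(|(k, v)| (k.as_str(), v.as_str())))
        .collect()
}
```

Because each helper borrows from the already parsed nodes, none of them needs to reload or reparse the page.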