Brig/spec.md
2026-02-01 16:11:47 -05:00

Spec: Wikimedia article manipulation library

Goal: Create a Rust library providing high-level and low-level functions for manipulating the properties and content of Wikimedia pages. This library must be built on top of the two existing Rust libraries mediawiki_rest_api and parse_wiki_text to implement these functionalities.

  • User Story 1: As a user, I want to get the raw text content of a Wikimedia article by specifying the site, language and article name.
  • User Story 2: As a user, I want to get the list of all the links of an article.
  • User Story 3: As a user, I want to get the list of all the categories of an article.
  • User Story 4: As a user, I want to get the list of all the references of an article.
  • User Story 5: As a user, I want to get the list of all the templates of an article.
  • User Story 6: As a user, I want to get the list of all the parameters of a template.
  • User Story 7: As a user, I want to get the list of all the sections of an article.
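User Stories 2 to 7 each amount to walking a parsed node tree and collecting one kind of node. The following is a minimal sketch of that idea using only the standard library. The `Node` enum here is a simplified stand-in, not the parse_wiki_text node type, and its variant and field names, as well as the `walk` helper, are assumptions made for illustration.

```rust
// Simplified stand-in for a parsed node tree. This is NOT the
// parse_wiki_text node type; variants and fields are assumptions.
#[derive(Debug, Clone, PartialEq)]
pub enum Node {
    Text(String),
    Link { target: String },
    Category { target: String },
    Template { name: String, parameters: Vec<(String, String)> },
    Heading { level: u8, title: String },
    // A node that holds child nodes (for example a list item or tag body).
    Container(Vec<Node>),
}

// Depth-first visit of every node, including nested children.
fn walk<'a, F: FnMut(&'a Node)>(nodes: &'a [Node], visit: &mut F) {
    for node in nodes {
        visit(node);
        if let Node::Container(children) = node {
            walk(children, visit);
        }
    }
}

// Collect the target of every link, in document order.
pub fn get_links(nodes: &[Node]) -> Vec<String> {
    let mut links = Vec::new();
    walk(nodes, &mut |node| {
        if let Node::Link { target } = node {
            links.push(target.clone());
        }
    });
    links
}

// Collect the target of every category, in document order.
pub fn get_categories(nodes: &[Node]) -> Vec<String> {
    let mut categories = Vec::new();
    walk(nodes, &mut |node| {
        if let Node::Category { target } = node {
            categories.push(target.clone());
        }
    });
    categories
}

// Collect the parameters of every template whose name matches `template`.
pub fn get_template_parameters(nodes: &[Node], template: &str) -> Vec<(String, String)> {
    let mut found = Vec::new();
    walk(nodes, &mut |node| {
        if let Node::Template { name, parameters } = node {
            if name.as_str() == template {
                found.extend(parameters.iter().cloned());
            }
        }
    });
    found
}

// Collect (level, title) for every heading, in document order.
pub fn get_sections(nodes: &[Node]) -> Vec<(u8, String)> {
    let mut sections = Vec::new();
    walk(nodes, &mut |node| {
        if let Node::Heading { level, title } = node {
            sections.push((*level, title.clone()));
        }
    });
    sections
}
```

Each getter is the same traversal with a different pattern match, which suggests that a single shared walker could keep the low-level functions small. A real implementation would match on the node variants that parse_wiki_text actually produces.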

Functional Requirements:

  1. Use mediawiki_rest_api to get access to the Wikimedia article
  2. Use mediawiki_rest_api to get the initial content of the article
  3. Use parse_wiki_text to parse the Wikimedia article
  4. Use parse_wiki_text to interact with any internal content of the article
  5. Use parse_wiki_text nodes as a main way to manage data access
  6. The central data structure is a WikiPage with its associated trait of the same name. This structure has the following fields:
    • language -> Language of the page
    • title -> Title of the page
    • content -> raw text content of the article, output of the MediaWikiPage get function
    • parsed -> node based structure, output of the Configuration parse function
  7. Organize the functions to be implemented as a trait
  8. Only language and article name are known at initialization
  9. All other functions including fetching and parsing the page are performed after initial creation of the structure
  10. Some specific low-level functions must be created to help build higher-level functionalities. These functions are:
    • get_links
    • get_categories
    • get_references
    • get_templates
    • get_template_parameters
    • get_sections
  11. When interacting with a page the steps are:
    • Create a new WikiPage with the language and title
    • Load the content and store the results in the content field
    • Parse the content and store the results in the parsed field
  12. After these initialization steps, all functions should interact with the fields directly and must not reload the values.
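The create, load, parse lifecycle described in the requirements above could be sketched as follows. This is a standard-library-only illustration under several assumptions: the fetch step is injected as a function instead of calling mediawiki_rest_api, the `Parsed` struct and its naive `[[...]]` scan stand in for the parse_wiki_text node output, and `PageError` is a hypothetical error type. Because a struct and a trait share Rust's type namespace and cannot carry the same name in one module, the trait is named `WikiPageOps` here purely for illustration.

```rust
// Hypothetical error type; the spec does not define one.
#[derive(Debug, PartialEq)]
pub enum PageError {
    NotLoaded,     // parse() was called before load()
    NotParsed,     // a getter was called before parse()
    Fetch(String), // the injected fetch function failed
}

// Stand-in for the parsed structure. A real implementation would hold
// parse_wiki_text nodes here instead of pre-extracted link targets.
#[derive(Debug, Clone, PartialEq)]
pub struct Parsed {
    pub link_targets: Vec<String>,
}

pub struct WikiPage {
    pub language: String,
    pub title: String,
    pub content: Option<String>, // filled by load()
    pub parsed: Option<Parsed>,  // filled by parse()
}

// Hypothetical trait name, see the note above about the name clash.
pub trait WikiPageOps {
    fn new(language: &str, title: &str) -> Self;
    fn load(&mut self, fetch: &dyn Fn(&str, &str) -> Result<String, String>)
        -> Result<(), PageError>;
    fn parse(&mut self) -> Result<(), PageError>;
    fn get_links(&self) -> Result<Vec<String>, PageError>;
}

impl WikiPageOps for WikiPage {
    // Only language and title are known at creation time.
    fn new(language: &str, title: &str) -> Self {
        WikiPage {
            language: language.to_string(),
            title: title.to_string(),
            content: None,
            parsed: None,
        }
    }

    // Fetch the raw text once and store it in the content field.
    fn load(&mut self, fetch: &dyn Fn(&str, &str) -> Result<String, String>)
        -> Result<(), PageError> {
        let text = fetch(&self.language, &self.title).map_err(PageError::Fetch)?;
        self.content = Some(text);
        Ok(())
    }

    // Parse the stored content once and store the result in the parsed field.
    fn parse(&mut self) -> Result<(), PageError> {
        let content = self.content.as_ref().ok_or(PageError::NotLoaded)?;
        let mut link_targets = Vec::new();
        let mut rest = content.as_str();
        while let Some(start) = rest.find("[[") {
            let after = &rest[start + 2..];
            match after.find("]]") {
                Some(end) => {
                    // Keep only the target, dropping any "|label" part.
                    let target = after[..end].split('|').next().unwrap_or("");
                    link_targets.push(target.trim().to_string());
                    rest = &after[end + 2..];
                }
                None => break,
            }
        }
        self.parsed = Some(Parsed { link_targets });
        Ok(())
    }

    // Read from the parsed field only; nothing is fetched or re-parsed.
    fn get_links(&self) -> Result<Vec<String>, PageError> {
        let parsed = self.parsed.as_ref().ok_or(PageError::NotParsed)?;
        Ok(parsed.link_targets.clone())
    }
}
```

Modelling `content` and `parsed` as `Option` is one way to make the "not yet loaded" and "not yet parsed" states explicit, so that getters can report an error instead of silently reloading the page.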

Critical Rules:

Testing Requirements:

  1. Always write unit tests for every function
  2. Write all the tests in the dedicated /tests folder. Do not include any tests in the main code folders and files

Technical Constraints:

  • Rust language
  • mediawiki_rest_api
  • parse_wiki_text
  • use the GEMINI.md file for specific code rules

Test Cases:

  • Scenario A: Valid access -> Status 200 + content of the article.
  • Scenario B: Valid access -> Status 200 + categories of the article.
  • Scenario C: Valid access -> Status 200 + links of the article.
  • Scenario D: Valid access -> Status 200 + references of the article.
  • Scenario E: Valid access -> Status 200 + templates of the article.
  • Scenario F: Valid access -> Status 200 + parameters of a template.
  • Scenario G: Valid access -> Status 200 + sections of the article.
  • Scenario Z: Incorrect article -> Status 404 + error message.
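The scenarios above pair a status code with either content or an error message. One way to exercise them without network access is to model the response explicitly and test against an in-memory fake, as in the sketch below. `Response`, `FetchError`, `FakeWiki` and `content_from_response` are hypothetical names, not part of mediawiki_rest_api, and the status returned for an unknown article is a parameter of the fake, not a fixed value.

```rust
use std::collections::HashMap;

// Hypothetical response shape: HTTP status plus body text.
#[derive(Debug, Clone, PartialEq)]
pub struct Response {
    pub status: u16,
    pub body: String,
}

// Hypothetical error: any non-200 status, with the message from the body.
#[derive(Debug, PartialEq)]
pub enum FetchError {
    Status { code: u16, message: String },
}

// Turn a response into the article content, or an error for non-200 statuses.
pub fn content_from_response(response: Response) -> Result<String, FetchError> {
    if response.status == 200 {
        Ok(response.body)
    } else {
        Err(FetchError::Status {
            code: response.status,
            message: response.body,
        })
    }
}

// In-memory fake used only by tests: known titles map to wikitext.
pub struct FakeWiki {
    pages: HashMap<String, String>,
    missing_status: u16, // status to report for unknown titles
}

impl FakeWiki {
    pub fn new(missing_status: u16) -> Self {
        FakeWiki {
            pages: HashMap::new(),
            missing_status,
        }
    }

    // Register an article and return the fake for chaining.
    pub fn with_page(mut self, title: &str, text: &str) -> Self {
        self.pages.insert(title.to_string(), text.to_string());
        self
    }

    // Scenario A for known titles, Scenario Z for unknown ones.
    pub fn get(&self, title: &str) -> Response {
        match self.pages.get(title) {
            Some(text) => Response {
                status: 200,
                body: text.clone(),
            },
            None => Response {
                status: self.missing_status,
                body: format!("article not found: {}", title),
            },
        }
    }
}
```

A test for Scenario A would register a page on the fake and check that `content_from_response` returns its text. A test for Scenario Z would request an unregistered title and check for the error variant carrying the configured status.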