Path of Exile Wiki:Semantic Mediawiki rework
- 1 Overview
- 2 Methodology
- 2.1 Data sources
- 2.2 Data output
- 3 Pages that need reworking
- 4 Known Issues with SMW
Overview
With the addition of Semantic Mediawiki, the Path of Exile wiki gained the ability to dynamically populate pages and link text from pages into a database.
This gives us two major advantages:
- removal of content duplication
- information only needs to be updated once per page
The purpose of this page is to coordinate the effort; to discuss specific changes, use the talk page.
Methodology
Data sources
For the purposes of the rework, a data source is a single page where properties are set. A page can be both a data source and an output page.
Finding suitable (groups of) pages
Suitable pages are pages that can potentially expose information that would be useful to query. Entire groups of pages that present similar information about the page itself are optimal; item pages and skill pages are examples of this.
Identifying the properties to use
Once the page (or group of pages) has been identified, the next step is to identify suitable properties on the page. Referencing the information in the game files (many of the groups have a respective binary .dat file) may help to identify such properties, but keep in mind that not all information is needed in the wiki.
Finally, it is important to agree on properties beforehand: if a template is changed to rename a property, it might take several hours for the change to go through.
Also make sure to declare properties, i.e. to create a page with the name of the property in the Property namespace (e.g. Property:Has name). When declaring a property, make sure to use a proper datatype and briefly describe what the property is for.
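As a rough illustration of what a declaration page contains, the sketch below renders the wikitext for one. It assumes the usual Semantic MediaWiki convention of setting the datatype with the special property "Has type"; the function name, the description text and the (deliberately incomplete) list of datatypes are made up for this example.

```python
# Sketch only: renders the wikitext body of a property declaration page.
# ALLOWED_DATATYPES is an illustrative subset, not the full list of SMW datatypes.
ALLOWED_DATATYPES = {"Text", "Number", "Page", "Boolean", "Date", "URL"}


def property_page_text(datatype: str, description: str) -> str:
    """Return the wikitext for a page in the Property namespace."""
    if datatype not in ALLOWED_DATATYPES:
        raise ValueError(f"unknown datatype: {datatype}")
    if not description.strip():
        raise ValueError("a short description of the property is required")
    # The description comes first, followed by the datatype declaration.
    return f"{description.strip()}\n\n[[Has type::{datatype}]]"


# Hypothetical content for the page Property:Has name.
print(property_page_text("Text", "The display name of an item or skill."))
```

The returned text would be saved as the content of the property page, for example Property:Has name.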
Naming of properties
For reference, the guidelines from the Manual of Style:
Creating or reworking Lua modules & templates
Once the properties have been identified, the next step is to create appropriate Lua modules and templates. Most of the handling should be done in Lua, since it is much more flexible and readable than MediaWiki templates.
The Lua module should be included from a template, which can add extra information or fill in some of the variables (for example, a "Skill" template might fill in "type=skill" automatically). An additional reason for the inclusion through a template is that the underlying module (and/or function) can be changed at any time, which allows for more flexibility.
Inside the Lua module, the variables should be verified before being added as properties, and the module should raise an error if they are used inappropriately.
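The verify-then-set flow can be sketched as follows. The wiki's modules would be written in Lua; Python is used here only to illustrate the logic. The argument names, the list of allowed types and the property "Has entry type" are hypothetical; only "Has name" and the "type=skill" default come from the text above.

```python
# Sketch only: validate template arguments before turning them into properties.
REQUIRED_ARGS = ("name", "type")            # hypothetical argument names
ALLOWED_TYPES = {"skill", "item"}           # hypothetical set of valid types
PROPERTY_NAMES = {"name": "Has name", "type": "Has entry type"}


def validate_args(args: dict) -> dict:
    """Return a property-name -> value mapping, or raise on bad input."""
    cleaned = {}
    for key in REQUIRED_ARGS:
        value = str(args.get(key, "")).strip()
        if not value:
            raise ValueError(f"missing required argument: {key}")
        cleaned[key] = value
    if cleaned["type"] not in ALLOWED_TYPES:
        raise ValueError(f"invalid type: {cleaned['type']}")
    return {PROPERTY_NAMES[key]: value for key, value in cleaned.items()}


def skill_template(args: dict) -> dict:
    """Stand-in for a "Skill" wrapper template that fills in type=skill."""
    return validate_args({**args, "type": "skill"})


print(skill_template({"name": " Fireball "}))
```

Raising early means a misused template shows an error on the source page instead of silently storing a wrong value that later surfaces in queries.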
Updating all the pages with the source information
In the last step, if the changes also require the source pages to be changed, each of these should be updated to use the new template (and missing information should be added accordingly).
It may be helpful to add a management category (i.e. something like [[Category:Non-semantic skill pages]]) to the template used for old pages, so they can be found and identified more easily.
Data output
For the purposes of the rework, a page contains data output if the data is accessed through semantic search. A page can be both a data source and an output page.
Generally it is advisable to update the source pages first, or to leave out semantic queries until the source pages are done. You cannot query what does not exist, which makes it hard to tell whether a problem lies in the data or in the queries themselves. This is a particular issue while the queries are still being written.
Finding pages that list source pages or otherwise use them
The first step is to identify pages that use the information.
Pages to look for (easiest to hardest):
- item lists (usually named List of xxx)
- nav boxes (usually reside in the Template namespace and are named Template:Navbox xxx)
- pages that list parts of other pages, but do not have it in their name (e.g. league-specific items, overviews of the classes, etc.)
- pages that only use a single property (e.g. "the game has 412 uniques", or on the page "Scion", "Scion starts with a base health of x")
Query source pages
Once the pages have been found, they should be updated with the queries.
Some things to consider and watch out for:
- the default limit for query results is 50 (increase it as necessary or divide the query into subqueries).
- the order may be random or not what you expect. In general it is sensible to order by name (in particular if the pages have a name property)
- sometimes you may need subqueries to filter out nested information (in particular for subobjects) which will require templates
- table and list output formats are useful for most pages
- for highly customized formatting, use the template format (and see the section below)
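The points above about limit, ordering and output format can be made concrete with a small helper that assembles an inline #ask query. This is only a sketch: the helper and its parameters are invented for illustration, and the category "Skills" is a placeholder; "limit", "sort", "order" and "format" are standard #ask parameters.

```python
# Sketch only: build the wikitext of an inline #ask query.
def build_ask_query(conditions, printouts, limit=50, sort=None, output_format="table"):
    """Return an #ask query with an explicit limit, sort order and format."""
    if limit < 1:
        raise ValueError("limit must be a positive number")
    parts = ["{{#ask: " + conditions]
    # Each printout is a property to display, prefixed with "?".
    parts += ["?" + name for name in printouts]
    parts.append(f"limit={limit}")
    if sort:
        # Sort explicitly so the result order does not depend on defaults.
        parts.append(f"sort={sort}")
        parts.append("order=asc")
    parts.append(f"format={output_format}")
    return "\n|".join(parts) + "\n}}"


# "Category:Skills" is a placeholder condition; "Has name" is the example
# property mentioned earlier on this page.
print(build_ask_query("[[Category:Skills]]", ["Has name"], limit=500, sort="Has name"))
```

For output that needs custom formatting, the same helper could be called with output_format="template", in which case the query additionally needs a "template" parameter naming the output template.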
Create output templates
As mentioned above, some pages will require custom templates to output the information properly.
Create those templates with the prefix SMW and use
Pages that need reworking
We should try to prioritize the pages, i.e. what is accessed the most and what could benefit the most. List pages or groups here.
priority: very high
completion:
- templates: 100%
- data exporter: 100%
skills are a subset of items
Overview of progress
Estimated overview of progress.
|Semantic Templates||PyPoE Exporter||Data source pages||Output / formatting templates||Lists/interlinked pages|