Analysis of design patterns in MediaWiki

About MediaWiki
MediaWiki is the PHP framework behind Wikipedia, a top-ten website. When founder Jimmy Wales created Wikipedia as an academic experiment, the site was not yet well known and was based on UseModWiki, a general-use wiki framework written in Perl. Once Wikipedia found popularity via high-profile bloggers and news sites, the system began failing under the load, prompting volunteer community members to implement ad hoc fixes for scalability and improved features for collaborative editing.

Over the following few years, the improved code base evolved into “MediaWiki.” MediaWiki is still an evolving application, but it has since been abstracted and streamlined enough for developers and non-developers alike to build their own sites on it. In fact, web hosting providers often include installers for MediaWiki, among others, further popularizing it as a framework for collaborative projects.

Overview of design patterns
As the software behind a top-ten website, MediaWiki is organized as a performance-optimized, multi-tiered LAMP architecture (see image) involving many internet protocols and systems. Given its history of rapid, sometimes hacked-together implementation, much of MediaWiki is legacy code. On top of that, its developers often work in “fire-fighting mode,” prioritizing bugs, then performance and scalability, then features, and only then framework upgrades. As a result, some MediaWiki developers suggest that the overall ground-up design is not as efficient as it could be.

Aside from a ground-up rewrite of the engine itself, which may be out of scope for this analysis, MediaWiki would benefit most from rebuilding the wikitext-to-HTML parser, which relies heavily on regular expressions and other helper functions. The weakness shows both in displaying basic wiki markup and in using templates. Its configuration also violates SOLID principles, including the open/closed principle through its use of global PHP variables and the interface segregation principle by limiting potential abstractions for configuration.

Analysis: Parser functions
To make collaborative editing more accessible, MediaWiki’s special markup language, “wikitext,” allows users to more easily edit pages, templates, and other content. The collection of parsing functions must stay as stable as possible, but the fire-fighting nature of code maintenance has left the logic as a pile of regular expressions and special cases rather than an efficient grammar, which complicates updates. Although the parsing does work, major issues arise when mixing wikitext and when stacking wikitext in templates.

Mixing wikitext, such as “[[:Category:Time | {{CURRENTTIME}}]]” to link to the page of the category called “Time” with the current time as the displayed text, requires many steps through the parser and its regular expressions to render the HTML output. Though the functionality and reliability of this function collection are fairly sound, the example shows how performance can suffer from the complexity and uncontrolled nature of the logic. Furthermore, the obfuscated code is hard to work with for developers who need to support and extend its capabilities. The markup language and its parser both require a complete rewrite in order to achieve a well-structured grammar. Projects such as a wikitext WYSIWYG editor rely entirely on such a rewrite.
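To make the "many steps" concrete, here is a minimal Python sketch of a pass-based renderer for the example above. It is an illustration only, not MediaWiki's actual parser (which is written in PHP). The function names, the regular expressions, and the "/wiki/" URL prefix are all assumptions made for this sketch.

```python
import re
from datetime import datetime, timezone

# Hypothetical pattern for [[Target]] or [[Target | label]]; a leading
# colon (as in [[:Category:Time]]) is accepted and dropped.
LINK_RE = re.compile(r"\[\[:?([^\[\]|]+?)\s*(?:\|\s*([^\[\]]*?)\s*)?\]\]")


def expand_magic_words(text: str, now: datetime) -> str:
    """Pass 1: scan the whole text and replace {{CURRENTTIME}} with HH:MM."""
    return re.sub(r"\{\{CURRENTTIME\}\}", lambda m: now.strftime("%H:%M"), text)


def render_links(text: str) -> str:
    """Pass 2: scan the whole text again and turn links into HTML anchors."""
    def to_anchor(match: re.Match) -> str:
        target = match.group(1).strip()
        label = match.group(2) if match.group(2) is not None else target
        href = "/wiki/" + target.replace(" ", "_")  # assumed URL scheme
        return f'<a href="{href}">{label}</a>'

    return LINK_RE.sub(to_anchor, text)


def render(text: str, now: datetime) -> str:
    """Run every pass in sequence; each construct costs one more full scan."""
    return render_links(expand_magic_words(text, now))


fixed = datetime(2020, 1, 1, 12, 34, tzinfo=timezone.utc)
html = render("[[:Category:Time | {{CURRENTTIME}}]]", fixed)
# html == '<a href="/wiki/Category:Time">12:34</a>'
```

Even in this toy version, each additional markup construct means another regular expression and another full scan of the text, which is the kind of growth in complexity that a single well-structured grammar is meant to avoid.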

The parser also requires better support for templates. Templates are pages in a special “Template” namespace, invoked via double curly braces as in {{TemplateName}}, that allow content to be formatted and have some logic applied to modify its presentation. For example, on the Wikipedia page for MediaWiki, the gray box to the right is the result of passing text as parameters to the template called “Infobox software.” The trouble with templates is that many rely on conditions and formatting built from a very limited set of characters and delimiters, causing the parser to misread the intent of long expressions. To counter this, pseudo-escape templates, such as “{{!}}” to stand in for the vertical bar character, are required to preserve the intended logic. As a developer and wiki user, I consider wikitext to be a feature-rich and usable esoteric markup language; however, its overuse of particular characters ruins its reliability. Though the parsing functions for conditional logic are separated out as the ParserFunctions extension, the parsing is still handled without as much interface segregation as it ought to have in accordance with SOLID principles.
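The delimiter collision can be shown with a short Python sketch. It assumes a deliberately naive splitter that treats every vertical bar as a parameter separator; the function names and the “Box” template are hypothetical, and the real parser is considerably more involved.

```python
from typing import List, Tuple


def split_template_call(call: str) -> Tuple[str, List[str]]:
    """Naively split '{{Name|arg1|arg2}}' on every vertical bar."""
    inner = call.strip()[2:-2]  # drop the outer {{ and }}
    name, *args = inner.split("|")
    return name.strip(), [arg.strip() for arg in args]


def expand_pipe_escape(arg: str) -> str:
    """Replace the {{!}} pseudo-template with a literal vertical bar."""
    return arg.replace("{{!}}", "|")


# A literal bar inside the value is mistaken for a separator:
name, args = split_template_call("{{Box|a | b}}")
# name == "Box", args == ["a", "b"]  (two parameters instead of one)

# Writing the bar as {{!}} keeps the value together until after splitting:
name, args = split_template_call("{{Box|a {{!}} b}}")
escaped = [expand_pipe_escape(arg) for arg in args]
# escaped == ["a | b"]
```

The escape works only because the splitting step never sees a bare vertical bar; the burden of avoiding the ambiguity falls on the template author rather than on the grammar.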

Analysis: SOLID principles
The most serious violations of SOLID principles are in configuration via PHP’s global variables, which defeats the open/closed principle. These global variables are set in one file that is then included by all other high-level files, so many files depend on it directly, thereby violating the dependency inversion principle. While a global variables file does allow quick access to frequently modified functionality via a single settings file, it is poor design. Referencing, copying, and temporarily changing global variables for a single function reduces the security and capabilities of a system that seems more object-oriented on the user side.
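The risk of temporarily changing a global for a single function can be sketched in a few lines of Python. The setting name and both functions below are hypothetical stand-ins, not actual MediaWiki configuration or code.

```python
# Hypothetical module-level setting standing in for a global config value.
SETTINGS = {"allow_raw_html": False}


def render_comment(text: str) -> str:
    """Escape angle brackets unless raw HTML is globally allowed."""
    if SETTINGS["allow_raw_html"]:
        return text
    return text.replace("<", "&lt;")


def render_trusted_notice(text: str) -> str:
    """Flip the global 'just for this call' and forget to restore it."""
    SETTINGS["allow_raw_html"] = True
    return render_comment(text)
```

Before render_trusted_notice is called, render_comment escapes user input. After a single call, every later render_comment call in the same process returns raw HTML, because the change was made to shared state rather than to a value passed into the function.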

Since modifying globals is by extension modifying core files rather than extending them, users may unwittingly alter the behavior of other modules that inherit from abstract classes or implement interfaces, which further complicates dependency inversion. Although separate MediaWiki extensions help isolate classes that perform certain tasks, such as the conditional logic in the ParserFunctions extension, extensions that mishandle or fail to recognize the hundreds of global variables limit the reliability of the entire system and further complicate supportability.
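One way to picture the alternative is a small Python sketch in which an extension depends on a narrow configuration abstraction instead of reading globals. Every class name and the setting key here are assumptions made for illustration; this is not a description of MediaWiki's own configuration API.

```python
from typing import Any, Mapping, Protocol


class Config(Protocol):
    """The only thing an extension is allowed to know about configuration."""

    def get(self, name: str) -> Any: ...


class MappingConfig:
    """A read-only Config backed by a private copy of a mapping."""

    def __init__(self, values: Mapping[str, Any]) -> None:
        self._values = dict(values)

    def get(self, name: str) -> Any:
        if name not in self._values:
            # An unrecognized setting fails loudly instead of being ignored.
            raise KeyError(f"unknown setting: {name}")
        return self._values[name]


class ConditionalLogicExtension:
    """Hypothetical extension that receives its configuration explicitly."""

    def __init__(self, config: Config) -> None:
        self._max_depth = config.get("max_expression_depth")

    def is_allowed(self, depth: int) -> bool:
        return depth <= self._max_depth


ext = ConditionalLogicExtension(MappingConfig({"max_expression_depth": 3}))
# ext.is_allowed(3) is True; ext.is_allowed(4) is False
```

Because the extension holds only the values it asked for at construction time, a later change elsewhere cannot silently alter its behavior, and a misspelled or missing setting surfaces as an error.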

Conclusion
MediaWiki, though an inseparable element of Wikipedia, has proven itself to be a leading framework for collaborative editing. However, in order to be more attractive to developers and robust enough to handle its users’ intentions accurately, it must implement more prudent design patterns if it is to find success outside of its progenitor, Wikipedia. MediaWiki’s developers, though, are as scarce and as collaborative as Wikipedia’s content editors, and its deference to Wikipedia requires every update to be as scalable and robust as possible, which hinders any particularly effective push for sweeping change.