Editing and formatting text on the web is an interesting sub-field of web development. The web allows for different ways to turn text into HTML markup. First I’ll outline some of the background of interactive text formatting on the web from my perspective, and then go into some configuration details of MediaWiki’s VisualEditor that I found interesting.
Libraries like MathJax allow for LaTeX-style math syntax to be meticulously, accurately rendered with SVG. User-facing WYSIWYG editors like medium-editor and markdown-toolbar give users a small toolbar with a few formatting options. markdown-toolbar adheres to the CommonMark spec, allowing for the HTML transformation to happen either server-side or client-side, yielding identical results in both cases.
Wikipedia started in 2001, and its content survives today in its original form of custom “wiki” markup, complete with a stable transformation to HTML. Its underlying software, which turned into a project called MediaWiki, handles the wikitext-to-HTML transformation with a script known as the PHP parser. Until a few years ago, anyone editing Wikipedia was expected to just deal with the wikitext syntax. It’s similar to Markdown, and not that bad for making small edits. But it did discourage people who didn’t want to learn the syntax from writing longer articles from scratch. As the 2010s arrived, it became obvious that there were improvements that could be made, so the VisualEditor project started.
The MediaWiki developers didn’t have the luxury of a well-defined markup language like CommonMark. We are able to leverage that in the PMT, and I imagine that if MediaWiki took some cues here, they might be able to simplify the VisualEditor process by removing the dependency on their Node.js service, called Parsoid, which translates HTML into wiki markup. I haven’t dug deeper, though, and it’s possible that wikitext is just too complicated, or that the scope of the problem is bigger than what I’m seeing.
I don’t really know the specifics of Parsoid, but you can read about its design here. From my understanding, it uses MediaWiki’s API to fetch pages and give them to the VisualEditor in the client’s browser via XHR. I was hesitant to even set it up, because that’s a pretty big change in how edits happen in the wiki.
I wouldn’t have been able to get things working without Parsoid’s Troubleshooting page. Through my test curl requests, I was able to debug about three hurdles I encountered. First, our wiki is private, so Parsoid needed to be authenticated somehow to get read access to the pages. Fortunately, using Parsoid on private wikis is documented, and I was able to give Parsoid access with the NetworkAuth extension, by whitelisting a specific user’s access when the request comes from localhost. This solution seemed simpler and less sketchy to me than the first documented solution, which is forwarding the user’s cookies to Parsoid. I also needed to change Parsoid’s default configuration for the path to MediaWiki’s api.php, and to have it request the wiki’s actual name instead of localhost, so that nginx resolves the name correctly. Note that in this particular setup, this all happens on localhost, which is inside a private network and behind a proxy server, so there’s no reason for Parsoid’s requests to be encrypted through HTTPS. That is pretty fortunate, because allowing Parsoid to make HTTPS requests requires another service called stunnel, and some additional setup with certs.
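To make the hostname detail a bit more concrete, here is a minimal sketch, in Python rather than curl, of a test request that connects to localhost but asks for the wiki by its real name. The hostname `wiki.example.org` and the `/w/api.php` path are placeholders, not the values from this setup, and `build_api_request` is a hypothetical helper written just for this illustration. The `action=query&meta=siteinfo` call is a standard MediaWiki API query that makes a convenient smoke test.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_api_request(wiki_host, api_path="/w/api.php", backend="http://localhost"):
    """Build a MediaWiki API request that connects to `backend` but sets the
    Host header to `wiki_host`, so a name-based nginx server block matches.

    All default values here are illustrative assumptions.
    """
    query = urlencode({"action": "query", "meta": "siteinfo", "format": "json"})
    url = f"{backend}{api_path}?{query}"
    # An explicit Host header overrides the one urllib would derive from the URL.
    return Request(url, headers={"Host": wiki_host})


req = build_api_request("wiki.example.org")
print(req.full_url)
print(req.get_header("Host"))
# To actually send it: urllib.request.urlopen(req).read()
```

If a request like this returns the wiki’s site info as JSON, then the api.php path and the name resolution are probably right, and what remains is the authentication side.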
The end result is that the VisualEditor looks really nice. Personally, I still expect to use the “Edit source” button a lot for quick edits, just because I’m so used to it and it’s faster. The VisualEditor takes a few seconds to load. It will be interesting to see if this actually makes things easier for users, and if there is much of an added burden on maintenance as I make periodic updates to MediaWiki and Parsoid.