Sunday 9 August 2009

The Status of 0.2

Last Wednesday I pushed out the first release candidate of 0.2. The main motivation behind the release was to get the new caching code (needed by matplotlib) out into the wild as soon as possible.

However, soon afterwards my mentor, Michael Droettboom, sent me a patch adding support for the MathML torture tests to mathtex. These tests resulted in several issues being filed against mathtex. Further, Michael was kind enough to provide patches for many of them, which I hastily added to mathtex.

The result? The mathtex of today has much better rendering than the mathtex of a week ago. However, it does also mean that the differences between 0.2 RC 1 and 0.2 will be substantial — enough to warrant a version-bump. It is for this reason that I am planning to skip the 0.2 release series and go straight to 0.3 RC 1.

I expect to tag the release in the next day or so, but there is at least one issue which I would like to squish before doing so.

Saturday 1 August 2009

Rendering Improvements

Over the past couple of days I have been working to improve the rendering quality of mathtex. Rather than describe the changes with several paragraphs of prose I thought it more apt to have a before-and-after picture instead.

Before:

After:

As can be seen, the result is much closer to that produced by LaTeX.

Wednesday 29 July 2009

YAPP (Yet Another Progress Report)

It only just hit me that it has been well over a week since my last blog post. This is unfortunate, as I like to blog at least once a week. Without further ado, here is what has been happening in mathtex this week.

Firstly, the matplotlib integration was tidied up. Mathtex is currently an optional component of matplotlib and has completely replaced the internal TeX rendering engine. I am hopeful that with a few tweaks it can be merged in the near future into the mainline matplotlib repository.

Secondly, the first public version of mathtex was released: 0.1 RC 1. However, due to an (embarrassing) slip-up on my part it was quickly superseded by 0.1 RC 2. For those who are interested it can be downloaded from http://code.google.com/p/mathtex/downloads/list. Assuming all is well, I hope to have a final release out by the end of the week.

Finally, the mailing list has been more active than ever! Over the past few days there have been several interesting discussions on equation breaking, build errors and how to use mathtex from a C/C++ application. Although not broad enough to constitute a blog post, the topics make interesting reads nevertheless.

Saturday 18 July 2009

Branching Off

As the more astute (or stalker) readers may have noticed, there have not been any commits to the mathtex subversion repository over the past couple of days. The reason is that during this time (and probably for a few more days) I have been working on re-integrating mathtex back into matplotlib.

The work — for those that are interested — is being done in the mathtex branch of the matplotlib subversion repository. This reintegration is especially important as although the success rate for summer of code projects is high (around 80% I believe) a surprising number of the projects never 'go gold' and make it into actual releases. Instead they live their days out in branches and separate repositories.

To ensure that mathtex does not succumb to this fate I have started integrating it back into matplotlib as soon as possible. This has several advantages: firstly it gives the API a good work-out to ensure that there are no glaring issues — reducing the chances that the API will change in the future; secondly it allows for mathtex to have a much wider userbase than it would have otherwise.

While it should not take long to port all of the existing backends over to mathtex the precise manner in which mathtex will be bundled with matplotlib is still up for debate. (But, it is almost certain that it will be bundled, resulting in no extra dependencies for matplotlib.) Once this is done — and tested — mathtex should be on much firmer ground.

Tuesday 14 July 2009

Enhancements to the Command Line Utility

Today I added a challenge-response style persistence mode to the command line utility. Usually, the command line utility (which ships as part of mathtex) exits as soon as it has finished rendering the equation.

However, in persistence mode, which is invoked by providing no output file, it works as follows:
./mathtext
$x+y=2$
/tmp/tmpg0gkiX.mathtex.png
$xy=4$
/tmp/tmpCqTVBH.mathtex.png

Upon entering persistence mode mathtext waits for an equation on stdin; after an equation has been typed in it is rendered and the filename — generated automatically — is written to stdout. This not only makes batch processing of equations easier but also makes it a lot more convenient to use mathtex in another application.

Rather than having to spawn a new instance of ./mathtext every time you need to render an equation you instead can just launch a single instance and pipe equations to it. This is significantly faster than spawning a new instance.

The resulting equations remain in the system's temporary directory until the caller deletes them. Alternative file formats are available by using the -j flag.
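As a rough illustration of driving persistence mode from another program, here is a minimal Python sketch. It assumes only what is shown above: one equation per line goes in on stdin and one filename per line comes back on stdout. The class name PersistentRenderer and the idea of passing the command as a list are my own for this example, and it also assumes the utility flushes stdout after each filename.

```python
import subprocess


class PersistentRenderer:
    """Keep one renderer process alive and feed it equations over a pipe."""

    def __init__(self, command):
        # `command` is an assumption, e.g. ["./mathtext"] launched with no
        # output file so that it enters persistence mode.
        self.proc = subprocess.Popen(
            command,
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            text=True,
            bufsize=1,  # line-buffered on our side of the pipes
        )

    def render(self, equation):
        # One equation per line in, one generated filename per line out.
        self.proc.stdin.write(equation + "\n")
        self.proc.stdin.flush()
        return self.proc.stdout.readline().strip()

    def close(self):
        # Closing stdin signals end-of-input; then wait for the process.
        self.proc.stdin.close()
        self.proc.wait()
```

A caller would create one PersistentRenderer, call render() once per equation, and remember to delete the returned files when it is finished with them.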

Monday 13 July 2009

Testing, Testing...

Sadly this week I do not have any pretty pictures to show you.

However, for those of you still reading I would like to tell you about the unit/regression tests that are being added in Mathtex. Anyone who has been watching the commit log will know just how easy it is to break something in Mathtex. I have done this several times without even touching the rendering code.

Therefore last week I started work on a regression testing suite to ensure that layout and rendering errors can be caught before committing. There are currently around forty tests which cover most of the functionality in Mathtex. The suite runs each test several times — at different resolutions with different fonts.

Currently passes and failures are determined by hashing the generated bit-maps and seeing if they match the reference hashes (kept in subversion). However, this is currently dependent on the version of FreeType used. (Different versions produce slightly different bit-maps for the same input.) Hence I am currently looking into fuzzy comparisons in order to account for these slight variations.
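To make the difference between the two approaches concrete, here is a small Python sketch. It is not the code in the test suite: the function names, the choice of SHA-1 and the tolerance value are all assumptions made for illustration, and bitmaps are treated as flat sequences of 8-bit grey levels.

```python
import hashlib


def bitmap_hash(pixels):
    """Exact comparison: any single changed byte changes the digest."""
    return hashlib.sha1(pixels).hexdigest()


def fuzzy_match(pixels, reference, tolerance=2.0):
    """Fuzzy comparison: accept a small mean absolute per-pixel difference.

    `tolerance` is in grey levels (0-255); 2.0 is an illustrative value,
    not one tuned against real FreeType output.
    """
    if len(pixels) != len(reference):
        return False  # different sizes can never match
    if not pixels:
        return True
    total = sum(abs(a - b) for a, b in zip(pixels, reference))
    return total / len(pixels) <= tolerance
```

With a hash, a bitmap that differs by one grey level in one pixel fails outright; with the fuzzy check it passes as long as the average difference stays under the tolerance.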

Tuesday 7 July 2009

Rendering: Now Bug-for-Bug Compatible

Over the last week or so I have been doing a couple of things. Firstly, integrating FT2Font and the font-manager from matplotlib into mathtex to remove the dependency on matplotlib and secondly fixing various rendering issues.

As of r21 mathtex is now on a par with matplotlib. What follows are some sample renderings, showcasing what is currently possible and also highlighting areas where work is needed.


This showcases a simple square root. Although the left- and top-padding is a bit excessive it is perfectly acceptable.


This is Euler's formula. Notice the un-even spacing around the = sign and the less-than-usual spacing around operators.


Here we have an integral. Although the fraction looks nice there is far too little spacing between the slanted integral symbol and the fraction.


And finally we have the quadratic formula. Instead of Computer Modern (Bakoma fonts) which are usually used with TeX this equation is typeset using the STIX fonts. Apart from a minor spacing issue (the -4ac) it renders quite nicely.

While my current focus is on getting a clean library API ready, once that is done I should have time to correct some of these (minor) rendering issues.

Monday 29 June 2009

x Marks the Spot

There it is — the first 'equation' ever rendered by Mathtex! Although it may look like nothing more than a 99-DPI 12pt x in italicized Computer Modern it is really something quite special — a vision of progress.

Below is the parse-tree and glyph stream generated by the program:
freddie@fluorine ~/Programming/mathtex $ python main.py
[Hlist <9.42> [Hlist <0.00> ] [Hlist <9.42> `x` k1.17] [Hlist <0.00> ]]
[(-0.5, 7.0, Bunch(symbol_name=x, metrics=Bunch(advance=9.41821289062, iceberg=7.0, ymax=7.0, height=7.0, width=8.25, slanted=True, xmax=8.6875, xmin=0.5, ymin=0.0), num=120, fontsize=12, offset=0.0, postscript_name=Cmmi10, font=, glyph=))]


Over the last couple of days I have been working on the code that I committed last week (here for those that are interested) and as promised now have something that does work.

However, there are several unpleasant aspects to it: firstly, it depends on mathtex.ft2font, the FreeType wrapper used by matplotlib; secondly, there is currently only a Cairo backend; thirdly, the only font series supported is Computer Modern, by way of the Bakoma fonts; fourthly, the font paths are currently hard-coded.

I plan to fix all of these issues over the next couple of days — starting with using font metrics files as opposed to FT2Font and then writing a C-based renderer and wrapping it using Cython. I expect that this will be done by Friday.

Saturday 27 June 2009

This Week in GSoC Mathtex

Officially, according to my schedule this week should've been spent on producing a set of unit tests. Now, usually when things don't go to schedule it is because something bad or unexpected occurred.

However, while looking through the Mathtex code last week something good — but unexpected — occurred. It seems as if splitting Mathtex from Matplotlib is significantly easier than I first anticipated. Therefore, this week has been spent splitting the behemoth mathtex.py file in Matplotlib into several smaller files, ready for externalisation.

I expect that by Sunday or Monday the SVN repository will have a version of mathtex that is able to render equations using a Cairo backend and the FT2Font library from Matplotlib. Once this is working it shouldn't be too difficult to a) add a bitmap backend using FreeType/Cython/libpng; b) use font metrics files as opposed to FT2Font for metrics information.

On a personal note yesterday was also my last day in university accommodation. As of 22:30 BST I am now home again as opposed to being in central London. Yay for packing and unpacking!

Thursday 25 June 2009

We Have a Mailing List

Following on from the Mathtex project announcement last week we now also have a mailing list, mathtex-dev (http://groups.google.com/group/mathtex-dev?lnk=), which is open to all. Although it is a development list, anyone with an interest in the project should make their voice heard. This will almost certainly become more important in the next week or so when the floor is opened to feature/enhancement requests (backends and syntax support).

Friday 19 June 2009

We Have a Project!

Today I got around to creating a Google Code project for Mathtex in preparation for the pending source code commits. The project can be found here: http://code.google.com/p/mathtex/

Don't expect the front page to stay like that for long, however!

Thursday 18 June 2009

Rendering and Backends

As well as familiarising myself with the existing Mathtex code I have also been considering how to actually go about rendering parsed expressions. Matplotlib currently has a very rich collection of backends which all provide significantly more functionality than required by Mathtex — which only needs glyph setting and line drawing.

One of the outstanding issues, however, is that of FreeType. Although there are bindings for virtually every other programming language there are currently none for Python. (There are, however, two failed attempts.) As a result of this Matplotlib includes its own wrapper, FT2Font, which is C++ based.

In the current implementation of Mathtex all backends use this FreeType wrapper to get glyph metrics (width, height, advance, etc). Some backends then go on to use FreeType for the rendering, while others (such as PDF and SVG) do not. As glyph metrics are, for the most part, invariant I have been considering putting them in a table as opposed to reading them from the font file each time. This is similar to how TeX operates.

The immediate consequence of this would be that FreeType would not be a hard dependency — if one wishes to only produce PDF/SVG files (or use backends which use FreeType indirectly). However, it might lead to reduced rendering quality for bitmapped backends. (I am still investigating this; the reason being that FreeType provides hinted metrics, while a table would not.)

Assuming that there is no perceivable difference then it is likely that for most of the default fonts a look-up table will be produced. This will allow for separation of the parsing/rendering stages: one piece of code can parse the expression and produce a list of glyphs (at specific sizes, styles and locations) and another independent piece of code can then go on to render it. Furthermore this would make it easy for people to make use of Mathtex in their own applications, by just asking for a stream of glyphs/drawing ops and then rendering it themselves.

The bitmapped backend (arguably the most common) is likely to be written in C, abstracting away all of the FreeType/compositing operations (which are more natural in C than Python), with Cython then used to create a Python wrapper for it.
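A toy Python sketch of this two-stage idea follows. Everything in it is hypothetical: the Glyph tuple, the ADVANCES table and the numbers in it are placeholders invented for illustration rather than real font metrics or actual Mathtex structures, and the layout is a plain left-to-right run with none of TeX's box-and-glue logic.

```python
from collections import namedtuple

# A positioned glyph: where to draw which character, in which font and size.
Glyph = namedtuple("Glyph", "x y font char size")

# Illustrative advance widths as a fraction of the font size. These are
# placeholder numbers, not values measured from any real font.
ADVANCES = {
    ("cmmi10", "x"): 0.5,
    ("cmmi10", "y"): 0.5,
    ("cmr10", "+"): 0.75,
}


def layout(chars, size):
    """Stage one: turn (font, char) pairs into a stream of placed glyphs."""
    glyphs, x = [], 0.0
    for font, char in chars:
        glyphs.append(Glyph(x, 0.0, font, char, size))
        x += ADVANCES[(font, char)] * size  # table look-up, no FreeType call
    return glyphs, x


def render_as_text(glyphs):
    """Stage two: a stand-in backend that only describes the drawing ops."""
    return ["%s@%.1f" % (g.char, g.x) for g in glyphs]
```

The point of the split is that layout() never touches a font file, so a PDF or SVG backend (or a third-party application) could consume the glyph stream without FreeType being installed.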

If both of these prove viable then they will serve the purpose of sidestepping the FreeType + Python issue entirely. Of course, I think most agree that a well maintained FreeType wrapper for Python is the way to go (and is something I will consider if I have time at the end of my project).

Otherwise my current plan is to use FT2Font from Matplotlib as a wrapper around FreeType. Expect an answer in the next couple of days :)

Friday 12 June 2009

Project Update

Sorry for the lack of updates over the last couple of weeks — however I have had my first year undergraduate exams this week and was revising the week prior. With my exams now out of the way I am able to start working full-throttle on my GSoC project.

According to my schedule (which I will post verbatim in a later post) the plan is to spend the first week getting up to speed on exactly how the TeX layout algorithm works and how it is currently implemented in Matplotlib. If all goes to plan you can expect a diagram-filled blog post on it near the end of next week.

My mentors, John Hunter and Michael Droettboom, have given me some good reading material on it (along with the current implementation, of course) so this should not be too difficult.

Finally, sometime in the near future you can also expect a discussion on backends (that is, the code responsible for turning a list of characters and coordinates into an image/document).

Sunday 31 May 2009

Font Hinting Comparison

Although it has taken me much longer than I expected here is the font hinting comparison I promised some weeks ago. The above image is a direct comparison between the four (auto) hinting options provided by FreeType.

The most striking thing is how poor the "Slight" and "Full" hinting options look. (Although they look similar closer inspection reveals them to have some subtle differences.) "None" and "Medium" both produce acceptable results, however. The most notable difference between the two is in the infinity symbol: notice how it is much more consistent with "Medium" hinting as opposed to "None."

So why does all of this matter? Well, rendering quality is extremely important, especially when bitmapped images are being produced -- a potentially common use case for the resulting Mathtex library. Therefore it is important to know exactly how the various rendering options affect the output in order to choose sensible defaults.

About the samples: these were all produced using the svn HEAD version of matplotlib using the Cairo backend (which when running under Linux uses FreeType + fontconfig, making it easy to change the hinting method).

Saturday 16 May 2009

Is there anybody out there?

Sorry for the lack of updates over the past week or so. With exams on the horizon I am finding myself with ever less time to work on my summer of code project. Thankfully, however, I scheduled this into my initial time-line and so it should not be a problem.

I did, however, manage to find the time to work on a quick comparison between the hinting options provided by FreeType's auto-hinter and the current Mathtex implementation. Although not strictly related to my project it is interesting nevertheless. If all goes well I should have a post on it ready by Wednesday.

Monday 4 May 2009

Who are you?

Tell me, I really want to know, who are you? As I am sure many of you know from my last blog post my name is Freddie Witherden and I am a first year physics student at Imperial College London. While this is my first time participating in GSoC I have been active in the open source community for a couple of years now. In this post I'll outline things of interest I've done heretofore, hopefully keeping the self-aggrandisement to a minimum.

My first real experience as a contributor to an open source project came in 2007 when I started submitting patches to the open source real-time strategy game Warzone 2100. Later that year I rewrote the game's network code and subsequently became a fully fledged developer. Since then I have gone on to maintain the Mac version of the project and am slowly but surely modernising the user interface.

In around December of last year I wrote a web utility to perform error propagation calculations. Given a function f of n linearly independent variables, it would compute the error in f as a function of the errors in each of the input variables. The utility (which can be found here) was written in C++ using GiNaC for symbolic computation and jsMath for LaTeX math output. It was while designing this tool that I discovered the need for an easy way to render LaTeX math expressions without needing a full-blown LaTeX install.

Around this time, while looking for alternatives to GiNaC that were more suited for web applications, I discovered Sympy. Since then I have contributed several patches to the LaTeX printing code in order to improve the output quality. But how to render the resulting LaTeX code was still an open problem.

After Christmas my first year computing lab started which — among other things — consisted of a project. The project I chose was the double pendulum, a very simple example of a chaotic system. Already knowing C++ I went on ahead to write a graphical simulator using the Qt libraries. Released under the GPL it can be found here.

Although this post has not done a particularly good job at describing who I am I hope that some of the projects which I have had the privilege of being a part of are of some interest.

Saturday 2 May 2009

Please allow me to introduce myself

…I'm a man of wealth and taste. My name is Freddie Witherden and I am a first year Physics student at Imperial College London. This summer I have the privilege of participating in Google's Summer of Code — working on externalising the Mathtex rendering engine which exists in Matplotlib into its own library.

Mathtex is a LaTeX rendering engine written purely in Python which is able to parse and render (to various formats) most mathematical expressions. While it is currently part of Matplotlib there are several projects and applications which could make use of and benefit from it. Therefore I am working under the supervision of the Python Software Foundation to move the code into its own library.

Over the next couple of days — as time permits — I'll post some more information about the project, myself and what's involved.