GSoC Mathtex: June 2009

Monday, 29 June 2009

x Marks the Spot

There it is — the first 'equation' ever rendered by Mathtex! Although it may look like nothing more than a 99-DPI 12pt x in italicized Computer Modern it is really something quite special — a vision of progress.

Below is the parse-tree and glyph stream generated by the program:
freddie@fluorine ~/Programming/mathtex $ python main.py
[Hlist <9.42> [Hlist <0.00> ] [Hlist <9.42> `x` k1.17] [Hlist <0.00> ]]
[(-0.5, 7.0, Bunch(symbol_name=x, metrics=Bunch(advance=9.41821289062, iceberg=7.0, ymax=7.0, height=7.0, width=8.25, slanted=True, xmax=8.6875, xmin=0.5, ymin=0.0), num=120, fontsize=12, offset=0.0, postscript_name=Cmmi10, font=, glyph=))]

Over the last couple of days I have been working on the code that I committed last week (here for those that are interested) and as promised now have something that does work.

However, there are several unpleasantness associated with it: firstly it depends on mathtex.ft2font — the FreeType wrapper used by matplotlib; secondly there is currently only a Cairo backend; thirdly the only font series supported Computer Modern, by way of the Bakoma fonts; fourthly the font paths are currently hard-coded.

I plan to fix all of these issues over the next couple of days — starting with using font metrics files as opposed to FT2Font and then writing a C-based renderer and wrapping it using Cython. I expect that this will be done by Friday.

Saturday, 27 June 2009

This Week in GSoC Mathtex

Officially, according to my schedule this week should've been spent on producing a set of unit tests. Now, usually when things don't go to schedule it is because something bad or unexpected occurred.

However, while looking through the Mathtex code last week something good — but unexpected — occurred. It seems as if splitting Mathtex from Matplotlib is significantly easier than I first anticipated. Therefore, this week has been spent splitting the behemoth mathtex.py file in Matplotlib into several smaller files, ready for externalisation.

I expect that by Sunday or Monday the SVN repository will have a version of mathtex that is able to render equations using a Cairo backend and the FT2Font library from Matplotlib. Once this is working it shouldn't be to difficult to a) add a bitmap backend using FreeType/Cython/libpng; b) use font metrics files as opposed to FT2Font for metrics information.

On a personal note yesterday was also my last day in university accommodation. As of 22:30 BST I am now home again as opposed to being in central London. Yay for packing and unpacking!

Thursday, 25 June 2009

We Have a Mailing List

Following on from the Mathtex project announcement last week we now also have a mailing list. mathtex-dev; http://groups.google.com/group/mathtex-dev?lnk= which is open to all. Although a development list anyone with an interest in the project, should make their voices heard. This will almost certainly become more important in the next week or so when the floor is opened to feature/enhancement requests (backends and syntax support).

Friday, 19 June 2009

We Have a Project!

Today I got around to creating a Google Code project for Mathtex in preparation for the pending source code commits. The project can be found here: http://code.google.com/p/mathtex/

Don't expect the front page to say like that for long, however!

Thursday, 18 June 2009

Rendering and Backends

As well as familiarising myself with the existing Mathtex code I have also been considering how to actually go about rendering parsed expressions. Matplotlib currently has a very rich collection of backends which all provide significantly more functionality than required by Mathtex — which only needs glyph setting and line drawing.

One of the outstanding issues, however, is that of FreeType. Although there are bindings for virtually every other programming language there are currently none for Python. (There are, however, two failed attempts.) As a result of this Matplotlib includes its own wrapper FT2Font which is C++ based.

In the current implementation of Mathtex all backends use this FreeType wrapper to get glyph metrics (width, height, advance, etc). Some backends then go on to use FreeType for the rendering, while others (such as PDF and SVG) do not. As glyph metrics are, for the most part, invariant I have been considering putting them in a table as opposed to reading them from the font file each time. This is similar to how TeX operates.

The immediate consequence of this would be that FreeType would not be a hard dependency — if one wishes to only produce PDF/SVG files (or use backends which use FreeType indirectly). However, it might lead to reduced rendering quality for bitmapped backends. (I am still investigating this; the reason being that FreeType provides hinted metrics, while a table would not.)

Assuming that there is no perceivable difference then it is likely that for most of the default fonts a look-up table will be produced. This will allow for separation of the parsing/rendering stages: one piece of code can parse the expression and produce a list of glyphs (at specific sizes, styles and locations) and another independent piece of code can then go onto render it. Furthermore this would make it easy for people to make use of Mathtex in their own applications, but just asking for a stream of glyphs/drawing ops and then rendering it themselves.

For the bitmapped backend (arguably the most common) it is likely to be written in C and abstracting away all of the FreeType/compositing operations — which are more natural in C than Python — and then using Cython to create a Python wrapper for it.

If both of these prove viable then they will serve the purpose of sidestepping the FreeType + Python issue entirely. Of course, I think most agree that a well maintained FreeType wrapper for Python is the way to go (and is something I will consider if I have time at the end of my project).

Otherwise my current plan is to use FT2Font from Matplotlib as a wrapper around FreeType. Expect an answer in the next couple of days :)

Friday, 12 June 2009

Project Update

Sorry for the lack of updates over the last couple of weeks — however I have had my first year undergraduate exams this week and was revising the week prior. With my exams now out of the way I am able to start working full-throttle on my GSoC project.

According to my schedule (which I will post verbatim in a later post) the plan it to spend the first week getting up to speed on exactly how the TeX layout algorithm works and how it is currently implemented in Matplotlib. If all goes to plan you can expect a diagram-filled blog post on it near the end of next week.

My mentors, John Hunter and Michael Droettboom have given me some good reading material on it (along with the current implementation, of course) so this should not be too difficult.

Finally, sometime in the near future you can also expect a discussion on backends (so the code responsible for turning a list of characters and coordinates into an image/document).

GSoC Mathtex