Jul 31, 2012
This week my code finally has reached a state where everybody interested can try it. After having implemented line breaking and fixing a few bugs last week I implemented the placement finder algorithm for point placements this week.
The placement finder is the component in Mapnik’s rendering stack which takes a list of glyphs and their relative positions and calculates the positions these glyphs should go to. For point placements this isn’t terribly hard. You just place them next to each other and make sure you get character spacing and rotation right. However the old code to do this job was more than 150 lines long. The new code is only about 60 lines of code and it should be much faster.
The old code calculated the bounding box for each char and checked for collision with other text. Now only the bounding box of all text together is used. For point placements the difference between the sum of all individual bounding boxes and the complete bounding box is usually very small.
Let’s have a look at some numbers: Assuming that there are already 20 text placements with an average of 10 characters each and we are trying to place another one we get this:
So the checking takes only 5% of the time it previously did.
One other advantage of the new code is that it doesn’t explicitly handle RTL text. This is done in earlier stages of the processing. The old code assumed text was either completely LTR or RTL. Rendering text with both directions could work, but didn’t always work.
I also worked on line breaking again, fixing a few bugs - some of which were interesting. For example, I noticed that when the first glyph of a string is longer than the line length my code crashed. The problem here is that it asked ICU’s break iterator for a break position before the first char and it returned if there is none. As it turns out, the return value for a non-existent break is INT_MAX-1 leading to an out-of-memory error.
I also implemented forced line breaks (via “\n” chars).
Most visual tests return similar results to the reference images, however automatic testing fails because the images aren’t identical. HarfBuzz chooses different (better) character positions than our simple approach in the old code did. The rendering quality for small font sizes has drastically improved (see also bug #1290):
If you look closely you will also notice that the character spacing is distributed much better now.
I also discovered a bug in Mapnik master: RTL text is placed a the wrong position:
Distance should be at least 16 pixels
Left: old (wrong), right: new (correct)
Currently I’m working on adding a new symbolizer type which simply draws all the collision detection bounding boxes. This should help verifying that the calculations are actually correct and no annoying bugs turn up later.
After that I will work on adding line placements to placement finder. This is not trivial but I will try to split the job into smaller parts so the code is easy to understand and bugs are less likely.