This week my code finally has reached a state where everybody interested can
try it. After having implemented line breaking and fixing a few bugs
last week I implemented
the placement finder algorithm for point placements this week.
The placement finder is the component in Mapnik’s rendering stack which takes
a list of glyphs and their relative positions and calculates the positions these
glyphs should go to.
For point placements this isn’t terribly hard. You just place them next to each
other and make sure you get character spacing and rotation right. However the
old code to do this job was more than 150 lines long. The new code is only about
60 lines of code and it should be much faster.
The old code calculated the bounding box for each char and checked for collision
with other text. Now only the bounding box of all text together is used. For point
placements the difference between the sum of all individual bounding boxes
and the complete bounding box is usually very small.
Let’s have a look at some numbers: Assuming that there are already 20 text placements with
an average of 10 characters each and we are trying to place another one
we get this:
- Number of items in collision detector: 200
- Time (in arbitrary units) for a collision detector check: ln2(200) = 7.6
- Number of checks per placement: 10
- Total time: 76
- Number of items in collision detector: 20
- Time (in arbitrary units) for a collision detector check: ln2(20) = 4.3
- Number of checks per placement: 1
- Total time: 4.3
So the checking takes only 5% of the time it previously did.
One other advantage of the new code is that it doesn’t explicitly handle
RTL text. This is done in earlier stages of the processing. The old code
assumed text was either completely LTR or RTL. Rendering text
with both directions could work, but didn’t always work.
I also worked on line breaking again, fixing a few bugs - some of which were
interesting. For example, I noticed that when the first glyph of a string
is longer than the line length my code crashed. The problem here is that it asked
ICU’s break iterator for a break position before the first char and it returned
if there is none. As it turns out, the return value for a non-existent break is
INT_MAX-1 leading to an out-of-memory error.
I also implemented forced line breaks (via “\n” chars).
Most visual tests return similar results to the reference images, however
automatic testing fails because the images aren’t identical. HarfBuzz chooses
different (better) character positions than our simple approach in the old code
did. The rendering quality for small font sizes has drastically improved (see
also bug #1290):
If you look closely you will also notice that the character spacing is
distributed much better now.
I also discovered a bug in Mapnik master: RTL text is placed a the wrong position:
Distance should be at least 16 pixels
Left: old (wrong), right: new (correct)
Currently I’m working on adding a new symbolizer type which simply draws all
the collision detection bounding boxes. This should help verifying that the
calculations are actually correct and no annoying bugs turn up later.
After that I will work on adding line placements to placement finder. This is
not trivial but I will try to split the job into smaller parts so the code
is easy to understand and bugs are less likely.