Compressing Vector Graphics

Even vector graphics can be compressed. Here I explain the principles behind my vector graphics compressor.

Common image compression techniques are usually found in bitmap files such as JPEG and PNG, and allow a file to be downloaded quickly and then automatically expanded at runtime.

Vector image compression is also possible with the SVGZ format, but it requires the developer to manually handle the expansion of the data. While JPEG and PNG files are commonly handled by development frameworks as “ready to go” pictures, an SVGZ is just a compressed document.

In a recent Silverlight project I had a geographic map with over one megabyte of vector map data. Whilst Silverlight applications end up as compressed XAP files, similar to ZIP files, the vast majority of the file size was purely down to the map data. If only I could reduce the size of this aspect of the application, the whole thing would load faster.

Stacking compression methods is usually ineffective. If you’ve ever tried zipping up an MP3 or JPEG you’ve probably found that it didn’t shrink very much. This is because those files are already compressed, so there is very little repetition left for the zip to remove. Here’s an analogy:

Let’s say we want to write down the following message as succinctly as possible on a post-it note, take it home and type it up in full on our PC:

“HELLO HELLO HELLO GOODBYE GOODBYE GOODBYE”

We could write it down exactly as above and it might just fit. Instead we could write this:

“HELLO” x3 “GOODBYE” x3

Here we’ve written instructions which will reproduce the same message but used less space/characters to do it. If you could remember what those instructions mean then you could reproduce the message when you got home.

What we just did was identify some recurring patterns, and define some instructions which allow us to reconstruct the message using those patterns.
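As a rough sketch of that idea in Python (purely illustrative, and not how a real zip works internally), the hypothetical helpers `compress_words` and `expand_words` below collapse runs of repeated words into word/count pairs and then rebuild the message from them:

```python
from itertools import groupby


def compress_words(message: str) -> list[tuple[str, int]]:
    """Collapse runs of identical consecutive words into (word, count) pairs."""
    return [(word, len(list(run))) for word, run in groupby(message.split())]


def expand_words(pairs: list[tuple[str, int]]) -> str:
    """Rebuild the original message from (word, count) pairs."""
    return " ".join(" ".join([word] * count) for word, count in pairs)


message = "HELLO HELLO HELLO GOODBYE GOODBYE GOODBYE"
pairs = compress_words(message)
print(pairs)  # [('HELLO', 3), ('GOODBYE', 3)]

# The instructions reproduce the message exactly, so nothing was lost.
assert expand_words(pairs) == message
```

The pairs play the role of the post-it note, and `expand_words` is the typing-up we do when we get home.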

Now, what happens if we take our process of compression, and repeat it on the compressed message? Will it get any smaller? Unfortunately in this case there are no more repeating patterns, except maybe for the “LL” or the “OO”. There would be no benefit in extracting these, as the number of instructions we’d end up writing would make the message longer rather than shorter:

(“HE” x1 “L” x2 “O” x1) x3 (“G” x1 “O” x2 “DBYE” x1) x3
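You can see the same effect with a real compressor. The sketch below uses Python's standard `zlib` module as a stand-in for zip-style compression (an assumption for illustration, since a XAP or ZIP file adds its own packaging around the data) and compresses the message twice:

```python
import zlib

message = b"HELLO HELLO HELLO GOODBYE GOODBYE GOODBYE"

once = zlib.compress(message)  # first pass exploits the repeated words
twice = zlib.compress(once)    # second pass has no patterns left to find

# Print the size in bytes of the original, one pass, and two passes.
print(len(message), len(once), len(twice))

# Both passes are lossless, so undoing them recovers the message.
assert zlib.decompress(zlib.decompress(twice)) == message
```

The first pass makes the message smaller, while the second pass makes it slightly larger, because it only adds its own overhead.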

So given this principle, how is it possible to make the final XAP file any smaller given that it holds all this data and is already compressed? The quick answer is to lose some data.

How can we do that? What if, instead of the message

“HELLO HELLO HELLO GOODBYE GOODBYE GOODBYE”

it were acceptable to write this?

“HI HI HI BYE BYE BYE”

For argument’s sake we’ll say this is an acceptable rendition of the message: it is different, but fine for the purpose. Here we’ve “compressed” the data by simplifying it. Let’s now run it through our other approach to compression. It would compress into this, which is shorter again:

“HI” x3 “BYE” x3

We have benefited from stacked compression by using two different approaches: one is classic pattern matching, and the other is simplifying, or shortening, the data itself.

So how does this work in the real world? The key to making this work with the map is recognising the way that vector image data is stored. An individual shape, or “Path”, is represented by a string of text made of letters and numbers. The letters are instructions and the numbers are values a bit like co-ordinates.
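To make that concrete, here is a minimal sketch that splits a path string into its instructions and values. The sample path `"M 10.5,20 L 30,40.25 Z"` and the helper name `parse_path` are made up for illustration, and the regular expression only covers the simple letter-and-number form described above rather than every detail of the XAML or SVG path syntax:

```python
import re

# A token is either a single letter (an instruction) or a number (a value).
TOKEN = re.compile(r"[A-Za-z]|[-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?")


def parse_path(data: str) -> list[tuple[str, list[float]]]:
    """Split path data into (instruction letter, numeric values) pairs."""
    commands: list[tuple[str, list[float]]] = []
    for token in TOKEN.findall(data):
        if token.isalpha():
            commands.append((token, []))
        elif commands:
            # Numbers belong to the most recent instruction;
            # numbers appearing before any instruction are ignored.
            commands[-1][1].append(float(token))
    return commands


print(parse_path("M 10.5,20 L 30,40.25 Z"))
# [('M', [10.5, 20.0]), ('L', [30.0, 40.25]), ('Z', [])]
```

The numbers are where almost all of the characters go, which is why they are the part worth shrinking.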

In our case, the map had a very high level of precision and we were only using it as a rough geographic guide. Simply by processing the string and rounding all the numbers to a lower precision, we shrank the length of the string. Take this example:

“10.34573, 10.2983873546, 9.6842, 3.9583726, 4.2398, 3.7645321”

When rounded, this becomes:

“10, 10, 10, 4, 4, 4,”

When compressed again, it becomes:

“10,” x3 “4,” x3
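The rounding step can be sketched in a few lines of Python. This is an illustration of the principle rather than the code from my tool: the helper name `round_numbers` and its `digits` parameter are assumptions, and it only looks for plain decimal numbers:

```python
import re

# Matches plain decimal numbers such as 10.34573 or -3.5.
NUMBER = re.compile(r"-?\d+\.\d+")


def round_numbers(data: str, digits: int = 0) -> str:
    """Round every decimal number in the string to `digits` decimal places."""

    def shorten(match: re.Match) -> str:
        value = round(float(match.group()), digits)
        # With digits=0, write "10" rather than "10.0" to save characters.
        return str(int(value)) if digits == 0 else f"{value:.{digits}f}"

    return NUMBER.sub(shorten, data)


data = "10.34573, 10.2983873546, 9.6842, 3.9583726, 4.2398, 3.7645321"
print(round_numbers(data))  # 10, 10, 10, 4, 4, 4
```

Everything that is not a number, such as the instruction letters and separators in a path, passes through untouched. Choosing `digits` is the trade-off between file size and how closely the shape follows the original.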

In the context we were using it, the loss of detail was perfectly acceptable, and it allowed a reduction in size of around 50-60% BEFORE the data was compressed into the zip. As the zip used a different form of compression altogether, the two stacked to a certain extent. Whilst the effects aren’t purely additive, there was still a measurable gain, which allowed us to reduce the size of the whole application.
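If you want to check the stacking effect for yourself, the sketch below measures it on made-up data. The coordinates are random numbers standing in for map data, and `zlib` stands in for the zip, so the sizes it prints are only indicative and are not the figures from my project:

```python
import random
import re
import zlib

random.seed(0)  # fixed seed so the synthetic data is repeatable

# Synthetic stand-in for map data: 5,000 high-precision coordinate pairs.
raw = " ".join(
    f"{random.uniform(0, 1000):.8f},{random.uniform(0, 1000):.8f}"
    for _ in range(5000)
)

# Lossy step: keep one decimal place on every number.
rounded = re.sub(r"\d+\.\d+", lambda m: f"{float(m.group()):.1f}", raw)

# Record (plain size, compressed size) in bytes for each version.
sizes = {}
for label, text in (("raw", raw), ("rounded", rounded)):
    payload = text.encode("ascii")
    sizes[label] = (len(payload), len(zlib.compress(payload)))

for label, (plain, packed) in sizes.items():
    print(f"{label:8} plain={plain:7} compressed={packed:7}")
```

The rounded text is smaller both before and after the lossless pass, which is the stacking described above.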

The tool was developed in my own time, so I have released it as Open Source on CodePlex. It compresses XAML and SVG documents, so it can help you reduce the size of your Silverlight, WPF and HTML5 projects:

http://pathcompressor.codeplex.com/

An example of the visual approximation, compressed on the right (note they’re acceptably similar despite compression):

[Image: Comparison]

A visual demonstration of the reduction in data, compressed on the right (note how much smaller it is):

[Image: Code reduction]