Embedding graphviz-generated scalable graphics in an OpenOffice Document

You can insert graphics into an OpenOffice Writer document – or any of a number of other document editing programs. It’s generally a simple “Insert picture/from file…” menu option.

Getting it not to look like garbage is the hard part. Most images are bitmaps and scaling bitmaps usually results in images that are either “chopped” or have lines that “stairstep”. And, of course, the text can be really ugly.

There is a class of image file that scales, and this is the structured image class of files. It includes SVG, Windows Metafile, and PostScript drawing commands as well as the (Windows-only) Visio embedded object.

Unfortunately, the geniuses determined the supported set of image files that the graphviz utility would generate and the geniuses who defined the images that OpenOffice Writer would import did not have lunch in the same building. About the only commonality is EPS, and in the Fedora standard RPM, EPS is not one of the options presently (mid-2008) included for graphviz.

Failing that, there’s a fairly easy workaround. Have graphviz output a PDF. PDF’s contain PostScript. There’s a utility named “pdftops” that can extract that PostScript and wrap it up in an EPS wrapper. Note that this isn’t the same program as “pdf2ps”!

Example:

dot -oBackingBeans.pdf -v -Tpdf BackingBeans.dot
pdf2ps -eps BackingBeans.pdf BackingBeans.eps

It’s probably pipeable, but this will do.

One other thing remains. The default font selection on my system wasn’t all that beautiful, so I overrode it:

graph[fontpath="/usr/share/fonts/liberation", fontsize=12, fontname="LiberationSans-Regular"];

Here, too, I’ve done the “brute force” thing and supplied the fontpath internally and manually. I should actually set up the environment, but I had other priorities that day and custom fonts are reputedly a pain in graphviz anyway. The “-v” option on graphviz lets you see what font is actually getting pulled, by the way.

ORM and Imprecise Data Types

JPA is great in a lot of ways. But there’s one problem a lot of people have with JPA and ORM in general – imprecise keys.

Technically, this is as much a JDBC problem as an ORM problem. It just seems to cause more problems in ORM. Perhaps because ORM lets you concentrate less on the raw data details.

Anyone who has taken a basic programming course has (hopefully) had it pointed out to them that it’s dangerous to compare for exact equality on floating-point numbers. Most binary representations of floating point are imprecise on fractions, and even a simple 0.1 has no simple binary value.

Less obvious, however, is that times and dates are also imprecise formats. Technically, not dates, but in the real world, dates and times are often intermingled even when you don’t expect them to be.

There are 3 primary time/date representations commonly found in Java:

java.sql.Date

java.sql.Date

DBMS-specific dates.

I won’t address high-precision time data types. They’re less likely to surprise people.

java.sql.Date is, in theory granular to one day. In practice, it’s a subclass of java.util.date, so that’s not literally true.

java.util.Date is granular to one millisecond. Since java.sql.Date doesn’t enforce granularity, you can get in trouble with java.sql.Dates which have time-of-date day in them. Especially when dealing with Calendar timezone conversions.

Things get even more interesting when these objects are used in ORM and persisted out to a database. Oracle Dates have a granularity of 1 second – there’s no standard Java time class that reflects this. So if you persist out a java Data object, it may – generally will – get silently truncated. Thus your in-memory date and your database dates will not compare equal!

As bad as this is, it’s worse if you try and use that date as a primary or foreign key. You’ll get invalid results, since there’s no such actual value in the database. More insidiously, in an ORM environment, you can make queries without realizing it. That is, if you retrieve an object that’s liked to another object and the actual linkage was a date, the linkage may fail for non-obvious reasons.

There’s no easy fix for this. You can write your own Date class that enforces the granularity of your choice (truncates/rounds to seconds in the case of Oracle), but then you have to configure the data type mapping in your ORM configuration. You can write accessor functions to do the same thing, but if you forget to use one, the program will fail for non-obvious reasons. It’s best to avoid using dates as keys, but this isn’t always an option, and even non-key dates have their perils.

Oracle isn’t the only – or perhaps even the worst – offender. It turns out that PostgreSQL has an even more insidious date problem. By default, the Oracle time data types are floating-point. Which means that they’re imprecise. And if dates are bad as keys, floating-point numbers are a thousand times worse!

There are 2 possible ways to handle that. One is to build a custom copy of the PostgreSQL server using the option of internally storing fractional seconds in interger form. Probably not going to happen, since not only does this mean you have to have permission to run a non-standard server, but also the internal table data won’t be freely interchangable with standard-build tables. This would be an even worse problem than it is, except that PostgreSQL is notorious for changing internal structure even between minor releases.

The other alternative is to define the time value with fractional seconds truncated – for example, as TIMESTAMP(0). It is, unfortunately, not possible to accurately represent millisecond values in a timedate value on a stock PostgreSQL server, so the next best thing is to simply hack them off if you intend to retrieve by time or date.

Detached objects and JSF

JSF and JPA have proven to be more problematic than expected. The upside of the JSF framework is that the datamodel objects can be presented more or less right up to the page view layer without recourse to Data Transer Objects (DTOs). The downside, is that what’s presented isn’t always clear, leading to the dreaded OptimisticLockingException.

Officially, you get an Optimistic Locking Exception when your in-memory model is out of sync with the database (the database is more up-to-date than the model). In actuality, all that really needs to happen is for the ORM manager to think the model is out of sync with the database.

This perception seems to be distressingly easy cause. I’d blame it on a bug in OpenJPA, but I had similar issues with the detach/attach model of JDO. I’ve not seem any good writeups on how to prevent the problem, but I’ve come up with a means of handling it (right or wrong).

In Struts, the issue was more obvious. It may be the same mechanism in JSF, just more obscure.

Here’s what happens:

1. You fetch in a record, display it, get the user’s input back.

2. You merge the update with the database. OB so far.

By default, JSF will redisplay the same form. If you then do more changes and attempt to merge them, you get an OptimisticLockingException.

In theory, this shouldn’t be happening, but in the current and development releases of OpenJPA (up to 1.2.0), I can do only one merge.

As it turns out, the simple solution is to do the update, then fetch a brand-new copy of the object if further edits are required/anticipated. That makes it almost a return to the old-time concept of EJB where a handle could be used to re-activate a copy of the EJB. Only instead of a handle, use the object’s primary key. This should be low-overhead assuming the object’s still in cache, but it is irritating.

As to what’s actually causing the problem, it seems that the “dirty” flags for the object don’t get cleared when the merge is done and the EntityManager merge() method returns the object. So on the next merge, the database is updated, but the original pre-merge criteria were applied.