Writing files into a WAR – Another BAD IDEA

I’ve always recommended against this. For one thing, a WAR is a ZIP file, and Java has no built-in support for updating ZIP files in place.

A lot of people abuse the fact that many JEE servers unpack (explode) WARs into a directory such as Tomcat’s webapps directory. They then proceed to use the ServletContext getRealPath() method to translate a subdirectory in the WAR into an absolute filename path.

There are 4 problems with that idea.

  1. If the server doesn’t explode the WAR, there won’t be a real path. So the pathname returned will be null and the code will probably throw an exception. This can be a problem when transporting the application to a different vendor’s server or when the configuration of the current server is changed.
  2. It’s generally good practice to keep executable code, forms, and other potentially-hackable constructs in a write-protected location. Plop down a writable directory in the middle of the WAR and you’ve opened up a potential exploit.
  3. If you write “permanent” files to the WAR directory, a redeployment may nuke the entire WAR substructure, losing the files forever. I’ve always preferred to explicitly erase a WAR before updating it anyway, since otherwise old stale stuff hangs around and pollutes the application. Sometimes with unfortunate results.
  4. If you hardcode the write directory relative to the WAR, what do you do when the disk fills up? Unix and Linux provide a special directory tree (/var) to hold things that may grow, but there’s no fixed relationship between the WAR directory and the /var directory. Coding an absolute path can work, but it’s not very flexible, either.

What to do? I normally get my writable directory location via JNDI lookup. For example: “java:comp/env/wardata”. The advantage of this is that I can relocate the directory any time I want. I put the default location in the web.xml resource definitions, but in Tomcat I can override this. Which is convenient when testing.
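In code, the lookup is a one-liner. Here’s a minimal sketch – the env-entry name matches the JNDI path above; the class name and the decision to fail hard when the entry is missing are just illustrative:

    import java.io.File;
    import javax.naming.InitialContext;
    import javax.naming.NamingException;

    public class DataDirLocator {
        /** Resolve the writable data directory configured under java:comp/env/wardata. */
        public static File dataDir() {
            try {
                String path = (String) new InitialContext().lookup("java:comp/env/wardata");
                return new File(path);
            } catch (NamingException e) {
                // No entry defined -- refuse to fall back to writing inside the WAR.
                throw new IllegalStateException("wardata location is not configured in JNDI", e);
            }
        }
    }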

JSF/Facelets/RichFaces – and Maven

JSF itself is fairly straightforward. Getting it functional in an appserver is another matter. Originally, I used MyFaces and Tomahawk. More recently, I’ve replaced MyFaces with the Sun JSF Reference Implementation (RI). Tomahawk, although a MyFaces library, works just fine with the RI.

The JSF-impl jar is part of the server for JEE-compliant servers, such as recent versions of JBoss. For Tomcat, it has to be explicitly linked into the WAR (?). At any rate, it has to be put into the application’s classpath, and everyone seems to be putting it into the WAR and not the server lib directory. Since there are possible threading implications, I’m doing likewise.

Tomcat5 also needs the EL-ri JAR placed in its classpath. Tomcat6 includes the required classes as part of the base distribution.

For the whole set of dependencies, see the Maven POM for my sandbox project: here
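The JSF-related portion boils down to something like this – a sketch only; the coordinates are the usual ones for the Sun RI and Tomahawk, and the version numbers are illustrative:

    <dependency>
        <groupId>javax.faces</groupId>
        <artifactId>jsf-api</artifactId>
        <version>1.2_07</version>
    </dependency>
    <dependency>
        <groupId>javax.faces</groupId>
        <artifactId>jsf-impl</artifactId>
        <version>1.2_07</version>
    </dependency>
    <dependency>
        <groupId>org.apache.myfaces.tomahawk</groupId>
        <artifactId>tomahawk</artifactId>
        <version>1.1.6</version>
    </dependency>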

OpenJPA/Spring/Tomcat6

Oh, what a tangled web we weave…

In theory, using JPA and Spring is supposed to make magical things happen that will make me more productive and allow me to accomplish wonderful things.

Someday. At the moment, I gain tons of productivity only to waste it when deployment time comes and I have to fight the variations in servers.

JPA allows coding apps using POJOs for data objects. You can then designate their persistence via external XML files or Java annotations. The Spring Framework handles a lot of the “grunt” work in terms of abstracting the connection to the data source, error handling, and so forth.

But that, alas, is just the beginning.

First and foremost, I had to build and run using Java 1.5. OpenJPA 1.2 doesn’t support Java 6.

Tomcat is not a full J2EE stack, so serving up JPA in Tomcat requires supplying a JPA implementation – I used Hibernate-entitymanager.

JPA requires a little help. Specifically, I used Spring’s InstrumentationLoadTimeWeaver to handle the load-time enhancement of the annotated entity classes.
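In the Spring configuration, that amounts to wiring the weaver into the entity manager factory bean. A rough sketch – the bean id and persistence unit name are illustrative:

    <bean id="entityManagerFactory"
          class="org.springframework.orm.jpa.LocalContainerEntityManagerFactoryBean">
        <property name="persistenceUnitName" value="sandbox"/>
        <property name="loadTimeWeaver">
            <bean class="org.springframework.instrument.classloading.InstrumentationLoadTimeWeaver"/>
        </property>
    </bean>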

The weaver itself requires help. And to enable the weaver in Tomcat, I needed the spring-agent.

To the Tomcat6 lib directory I added:

  • spring-tomcat-weaver jar
  • spring-agent jar

But that’s not enough! The agent won’t turn itself on automatically. So I need to add a “-javaagent” to Tomcat’s startup. The easiest way to do that was to create a CATALINA_BASE/bin/setenv.sh file:

#!/bin/sh
CATALINA_BASE=/usr/local/apache-tomcat-6.0.18
JAVA_OPTS="-javaagent:$CATALINA_BASE/lib/spring-agent-2.5.4.jar"

Tomcat 5

I think this all works more or less the same in Tomcat5, except that there are 3 library directories instead of the single lib directory that Tomcat6 uses, so the location for the Spring support jars is different. common/lib seems to work, although I’m not sure it’s the best choice.

That’s half the battle. Next up: JSF/RichFaces – and Maven

Why Do-it-Yourself Java Security is a BAD THING

One of my greatest peeves in working with Java in the Enterprise is the fact that everyone+dog seems to think they can do a better job on application security than the Java architects.

OK, so there’s some really awful stuff that’s part of the Java standards, but nevertheless, rolling your own security is still a BAD THING.

Here’s why:

  1. I’m so clever. You’re not as clever as you think you are. Even I’m not as clever as I think I am, and I, of course, am much cleverer than anyone else. Most DIY security systems I’ve encountered (or, alas, developed) have proven to contain at least one easily-findable hole that you could channel the Mississippi River through.

  2. Infrastructure lock-in. Your DIY system is almost certainly tied to your current infrastructure. If the infrastructure changes, you’ll probably have to recode (see below). And, of course, if you ever had any dreams of selling your work to other shops, they probably aren’t set up the same way.

  3. Documentation and procedures. You can’t go down to the local bookstore and buy a book on how to properly use a custom security system the way you can with the standard security system. In fact, any documentation you have is likely to be insufficient and out of date.

  4. Professional Design. The standard security frameworks were designed by security professionals. Yes, I know they’re not as clever as you. But they were trained to work on security first and foremost, they argued it out with other people looking for bulletproof general-purpose solutions, and then the results were exposed to legions of evil people for stress testing. If your primary job were security, and you had extensive training in mathematical cryptanalysis, your cleverness might outweigh the fact that these people are but pale shadows of your genius. But your primary job was to develop an application, and most likely the orders weren’t to make it all perfect, just to “Git-’R-Dun!”.

  5. Declarative security. The standard Java security systems are mostly declarative, and declarative configuration is less likely to contain unexpected bugs than hand-written code. You can write anything in code, but when all you have are fill-in-the-blank declarative options, your opportunities to make mistakes are far fewer.

  6. Minimal coding. The standard Java security systems are generally minimally invasive. You don’t have to turn the application code upside down every time the security infrastructure changes. You can test most application code with security switched off or using a local security alternative like the tomcat-users.xml file without having to establish some sort of heavyweight alternative to the production security system (or petition the security administrator for test security accounts and privileges).

  7. Maintenance costs. When you tie security code intimately into application code, anyone coming along later to do maintenance will probably be ignorant of the nuances (remember the local bookstore?) and will end up punching a hole in the security. Security is like multi-threading and interrupt handling. It only takes ONE bug to bring everything down.

  8. Development costs. When you tie security code intimately into application code, you have to do twice the work, since you have to code both the business logic and the security logic. And, again, forget to do it in just one place and the whole thing turns into tissue paper. Additionally, in-application security code is one more spanner in the works when setting up testing and debugging frameworks: you have to debug both the security code and the application code.

  9. Framework support. Standard frameworks like Struts and JSF have built-in support for the J2EE standard security system. They don’t have built-in support for one-off security. You’re paying for it regardless of whether you use it or not. You might as well reap the benefits.

  10. Mutability. If you want to rework a webapp into a portlet or web service, the standard J2EE security system will mostly port transparently, since it’s minimally invasive. Portlets virtually demand a Single Signon solution – forcing someone to enter login credentials into each and every portlet pane will not win you friends.

Ideology.

Sounds a lot like “idiot”. And with good reason. There is no such thing as a one-size-fits-all Silver Bullet solution. Sometimes the standard security framework isn’t a good fit. About 9 times out of 10, it is, however. For the 10th case, I generally prefer to augment the standard security framework. For example, when I need fine-grained security, I use role-based access control to fence off the major sections of the app, then use the identity provided by the standard framework to retrieve the fine-grained options. Where that doesn’t work or is insufficient, I try to use minimally-invasive approaches, such as Filters and AOP crosscuts. I’ll do anything it takes, in fact, but the more I can work within the supported system, the happier I am.
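“Fencing off the major sections” with the standard framework costs almost nothing in code – it’s declarative markup in web.xml along these lines (the paths and role names are illustrative):

    <security-constraint>
        <web-resource-collection>
            <web-resource-name>Admin area</web-resource-name>
            <url-pattern>/admin/*</url-pattern>
        </web-resource-collection>
        <auth-constraint>
            <role-name>admin</role-name>
        </auth-constraint>
    </security-constraint>

    <login-config>
        <auth-method>FORM</auth-method>
        <form-login-config>
            <form-login-page>/login.jsp</form-login-page>
            <form-error-page>/loginError.jsp</form-error-page>
        </form-login-config>
    </login-config>

    <security-role>
        <role-name>admin</role-name>
    </security-role>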

Why Software Projects Should Not Depend on an IDE

disclaimer: I’m about to show my age.

My first “IDE” was an IBM 029 keypunch machine. I keyed programs in FORTRAN, COBOL, Assembler and PL/1. When I messed up, I threw away the defective cards and punched new ones. After waiting in line for a free keypunch machine.

Over the years, things got better. I graduated to online terminals, then added debuggers, code assists and refactoring.

But I also learned something. IDEs are more fluid than programming languages. As projects got more complex, they took on more and more of the responsibilities of maintaining not only program code, but the entire build process. Including tracking the locations of the build components and running the associated build utilities such as resource compilers.

Eventually it reached a point where the IDE was less an aid than an addiction. A certain IDE which Shall Remain Nameless relegated the ability to build from the command line using constructs like batch scripts and makefiles to virtual uselessness. It was too much trouble to learn how to create a build from the command prompt.

Then, the New, Improved version of the IDE came along. But guess what? The old projects had to be modified to build under the new IDE. Then came the late-night call for an emergency code fix. The project in question hadn’t been touched in 2 years. It wouldn’t compile without the old IDE, even though the actual fix was a 1-line code change. Installing the old IDE required installing old support programs. Taken to extremes, it would have ended up with taking an old junk computer out of the closet (at 2 a.m.), installing an old OS, installing the old IDE and supporting cast, all to resolve a minor problem.

But wait – there’s more!

Years passed, and I moved on to other platforms. But now I worked in a shop where there were 2 different groups of developers. The other group had developed an entire ecology of their own – all tied to their IDEs and desktop configurations. We were supposed to be sharing standards. But their ecology had been designed specifically for their needs, not ours. They could hand us code, but it wouldn’t build, because we hadn’t invested our desktops in all the configuration that was scattered around inside people’s IDEs.

Contrast this with a batch-based process such as Maven or Ant.

Maven, of course, tends to force a consistent organization, for better or worse. So while the choice of goals may be unclear, at least the build is consistent.

Ant is less standardized, but it’s no big deal to make an Ant script self-descriptive.

And in both cases, you can place project/platform-specific configuration in files that can be passed to build processes and stored in the project source code archives. Instead of being embedded on people’s desktops.
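For Ant, that can be as simple as loading a properties file that lives in the source archive, with an optional local override for anything machine-specific. A sketch (the file and property names are illustrative):

    <project name="example" default="compile">
        <!-- machine-specific overrides, kept out of version control -->
        <property file="local.build.properties"/>
        <!-- project defaults, checked into the source archive -->
        <property file="build.properties"/>
        <!-- fallbacks if neither file defines them -->
        <property name="src.dir" value="src"/>
        <property name="build.dir" value="build"/>

        <target name="compile" description="Compile the main source tree">
            <mkdir dir="${build.dir}"/>
            <javac srcdir="${src.dir}" destdir="${build.dir}"/>
        </target>
    </project>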

When an IDE Just Won’t Serve

There are cases where an IDE simply cannot do the trick. It’s not uncommon these days – especially in Agile development shops – to publish a Nightly Build. The Nightly Build is typically a collation of the daily commits, done after hours in batch. “In Batch” and “IDE” don’t go together. An IDE is an interactive environment. Furthermore, the batch build machine may be a server, not a desktop machine. Not all servers have GUIs installed – or even windowing systems. Ant and Maven won’t have a problem with that, but IDEs will.

JBoss and OpenJPA

I developed my Technology Sandbox app using OpenJPA under Tomcat6. Since JBoss 4.2 has its own persistence mechanisms (EJB3 and Hibernate), it looked like some changes might be required to port it.

It turns out that very little needed to be done.

The most important thing is to get a copy of the Spring agent (in my case spring-agent-2.5.4.jar) and copy it to the JBoss lib directory, so that it’s available when JBoss launches its embedded Tomcat.

To get the embedded Tomcat to start using the Spring agent, a

JAVA_OPTS="-javaagent:/absolute/path/to/lib/spring-agent-2.5.4.jar"

needs to be defined in the JBoss/bin/run.conf file. If one doesn’t exist, create it. If it does and there’s already a JAVA_OPTS defined (such as -Xmx options), just add the -javaagent option to the existing set of JAVA_OPTS.

The only other thing that was required was to define a JDBC data source in the JBoss deploy directory to make the connection between the database connection parameters and the JNDI name of the data source. I just copied and modified one of the example *-ds.xml files.
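For reference, a minimal *-ds.xml looks something like this (the JNDI name, URL, and credentials are illustrative):

    <datasources>
        <local-tx-datasource>
            <jndi-name>SandboxDS</jndi-name>
            <connection-url>jdbc:postgresql://localhost:5432/sandbox</connection-url>
            <driver-class>org.postgresql.Driver</driver-class>
            <user-name>sandbox</user-name>
            <password>secret</password>
        </local-tx-datasource>
    </datasources>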

JBoss provides the JSF core implementation itself and that collides with MyFaces. The Technology Sandbox app started out using MyFaces. Since it’s actually getting the JSF stuff for the WAR build via Maven, a simple change to the POM should fix that issue, but a quick fix is a web.xml option that turns off the container JSF implementation:

    <context-param>
        <param-name>org.jboss.jbossfaces.WAR_BUNDLES_JSF_IMPL</param-name>
        <param-value>true</param-value>
    </context-param>

Embedding graphviz-generated scalable graphics in an OpenOffice Document

You can insert graphics into an OpenOffice Writer document – or any of a number of other document editing programs. It’s generally a simple “Insert picture/from file…” menu option.

Getting it not to look like garbage is the hard part. Most images are bitmaps and scaling bitmaps usually results in images that are either “chopped” or have lines that “stairstep”. And, of course, the text can be really ugly.

There is a class of image files that does scale well: structured (vector) graphics. It includes SVG, Windows Metafile, and PostScript drawing commands, as well as the (Windows-only) embedded Visio object.

Unfortunately, the geniuses who determined the set of image formats that the graphviz utility would generate and the geniuses who decided which formats OpenOffice Writer would import did not have lunch in the same building. About the only commonality is EPS, and in the Fedora standard RPM, EPS is not one of the options presently (mid-2008) included in graphviz.

Failing that, there’s a fairly easy workaround. Have graphviz output a PDF. PDFs contain PostScript. There’s a utility named “pdftops” that can extract that PostScript and wrap it up in an EPS wrapper. Note that this isn’t the same program as “pdf2ps”!

Example:

dot -oBackingBeans.pdf -v -Tpdf BackingBeans.dot
pdftops -eps BackingBeans.pdf BackingBeans.eps

It’s probably pipeable, but this will do.

One other thing remains. The default font selection on my system wasn’t all that beautiful, so I overrode it:

graph[fontpath="/usr/share/fonts/liberation", fontsize=12, fontname="LiberationSans-Regular"];

Here, too, I’ve done the “brute force” thing and supplied the fontpath internally and manually. I should actually set up the environment, but I had other priorities that day and custom fonts are reputedly a pain in graphviz anyway. The “-v” option on graphviz lets you see what font is actually getting pulled, by the way.

ORM and Imprecise Data Types

JPA is great in a lot of ways. But there’s one problem a lot of people have with JPA and ORM in general – imprecise keys.

Technically, this is as much a JDBC problem as an ORM problem. It just seems to cause more problems in ORM. Perhaps because ORM lets you concentrate less on the raw data details.

Anyone who has taken a basic programming course has (hopefully) had it pointed out to them that it’s dangerous to compare floating-point numbers for exact equality. Most binary floating-point representations are imprecise for fractions; even a simple 0.1 has no exact binary representation.

Less obvious, however, is that times and dates are also imprecise formats. Technically, not dates, but in the real world, dates and times are often intermingled even when you don’t expect them to be.

There are 3 primary time/date representations commonly found in Java:

  • java.util.Date
  • java.sql.Date
  • DBMS-specific date types

I won’t address high-precision time data types. They’re less likely to surprise people.

java.sql.Date is, in theory, granular to one day. In practice, it’s a subclass of java.util.Date, so that’s not literally true.

java.util.Date is granular to one millisecond. Since java.sql.Date doesn’t enforce its granularity, you can get in trouble with java.sql.Date values that have time-of-day data in them – especially when dealing with Calendar timezone conversions.

Things get even more interesting when these objects are used in ORM and persisted out to a database. Oracle DATE values have a granularity of 1 second – there’s no standard Java time class that reflects this. So if you persist out a Java Date object, it may – generally will – get silently truncated. Thus your in-memory date and your database dates will not compare equal!

As bad as this is, it’s worse if you try and use that date as a primary or foreign key. You’ll get invalid results, since there’s no such actual value in the database. More insidiously, in an ORM environment, you can trigger queries without realizing it. That is, if you retrieve an object that’s linked to another object and the actual linkage was a date, the linkage may fail for non-obvious reasons.

There’s no easy fix for this. You can write your own Date class that enforces the granularity of your choice (truncates/rounds to seconds in the case of Oracle), but then you have to configure the data type mapping in your ORM configuration. You can write accessor functions to do the same thing, but if you forget to use one, the program will fail for non-obvious reasons. It’s best to avoid using dates as keys, but this isn’t always an option, and even non-key dates have their perils.
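For what it’s worth, the truncation itself is trivial – the hard part is remembering to apply it everywhere. A sketch of the accessor-style workaround (the class and method names are illustrative):

    import java.util.Date;

    public final class DateKeys {
        private DateKeys() {}

        /** Return a copy of the date with the milliseconds dropped, matching Oracle's one-second DATE granularity. */
        public static Date truncateToSeconds(Date date) {
            if (date == null) {
                return null;
            }
            long millis = date.getTime();
            return new Date(millis - (millis % 1000L));
        }
    }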

Oracle isn’t the only – or perhaps even the worst – offender. It turns out that PostgreSQL has an even more insidious date problem. By default, the PostgreSQL time and timestamp types are stored as floating-point values. Which means that they’re imprecise. And if dates are bad as keys, floating-point numbers are a thousand times worse!

There are 2 possible ways to handle that. One is to build a custom copy of the PostgreSQL server using the option to store fractional seconds internally in integer form. Probably not going to happen, since not only does this mean you have to have permission to run a non-standard server, but the internal table data also won’t be freely interchangeable with standard-build tables. This would be an even worse problem than it is, except that PostgreSQL is notorious for changing internal structure even between minor releases.

The other alternative is to define the time value with fractional seconds truncated – for example, as TIMESTAMP(0). It is, unfortunately, not possible to accurately represent millisecond values in a date/time value on a stock PostgreSQL server, so the next best thing is to simply hack them off if you intend to retrieve by time or date.

Detached objects and JSF

JSF and JPA have proven to be more problematic than expected. The upside of the JSF framework is that the datamodel objects can be presented more or less right up to the page view layer without recourse to Data Transfer Objects (DTOs). The downside is that what’s presented isn’t always clear, leading to the dreaded OptimisticLockingException.

Officially, you get an Optimistic Locking Exception when your in-memory model is out of sync with the database (the database is more up-to-date than the model). In actuality, all that really needs to happen is for the ORM manager to think the model is out of sync with the database.

This perception seems to be distressingly easy to cause. I’d blame it on a bug in OpenJPA, but I had similar issues with the detach/attach model of JDO. I’ve not seen any good writeups on how to prevent the problem, but I’ve come up with a means of handling it (right or wrong).

In Struts, the issue was more obvious. It may be the same mechanism in JSF, just more obscure.

Here’s what happens:

1. You fetch in a record, display it, get the user’s input back.

2. You merge the update with the database. OK so far.

By default, JSF will redisplay the same form. If you then do more changes and attempt to merge them, you get an OptimisticLockingException.

In theory, this shouldn’t be happening, but in the current and development releases of OpenJPA (up to 1.2.0), I can do only one merge.

As it turns out, the simple solution is to do the update, then fetch a brand-new copy of the object if further edits are required/anticipated. That makes it almost a return to the old-time concept of EJB where a handle could be used to re-activate a copy of the EJB. Only instead of a handle, use the object’s primary key. This should be low-overhead assuming the object’s still in cache, but it is irritating.
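A sketch of the pattern, assuming a request-scoped EntityManager and an illustrative Customer entity:

    import javax.persistence.EntityManager;

    public class CustomerEditHelper {

        /** Merge the user's edits and hand back the primary key for the follow-up fetch. */
        public Long save(EntityManager em, Customer edited) {
            Customer managed = em.merge(edited);
            return managed.getId();
        }

        /**
         * Fetch a brand-new copy for the next round of edits. With a per-request
         * EntityManager this runs against the next request's persistence context,
         * so it really is a fresh instance rather than the one left over from the merge.
         */
        public Customer reload(EntityManager em, Long id) {
            return em.find(Customer.class, id);
        }
    }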

As to what’s actually causing the problem, it seems that the “dirty” flags for the object don’t get cleared when the merge is done and the EntityManager merge() method returns the object. So on the next merge, the database is updated, but the original pre-merge criteria were applied.