Land Mines – Spring Neo4j

One of the primary purposes of this blog is to record what I’ve learned by tedious trial and error and/or spending time down in source code I shouldn’t have had to look at.

This particular topic has more than its share of discoveries.

Spring Neo4j claims that it’s intended to imitate, where possible, existing persistence systems approaches. Unfortunately, it has a long way to go on that.

First, let me mention that after descending through Maven’s equivalent of “DLL Hell”, I have been working with the following version sets:

  • Spring Framework version 4.0.6
  • Spring Neo4j 3.3.0.Release
  • Neo4j version 2.2.0
  • Logging courtesy of slf4j version 1.7.6 and log4j2 version 2.2, which has renamed the log4j config files since log4j v1 and added a few new config options (just for info).

As usual, just finding what was compatible with what was an adventure. I have a nasty, if unprovable suspicion that some of the pain could have been reduced if certain functions and classes had been marked “deprecated” instead of being removed or relocated.

Some of what I’ve learned will break shortly. Spring Neo4j 4 will actually lose a number of functions and annotations that exist in Spring Neo4j 3 (they were probably broken anyway). And some of the fixes I’ve come up with may actually be exploiting bugs rather than clean fixes. It’s the best I could do.

My use case seemed simple. I have 2 entity types: A and B, and a relationship, we’ll call “wants” which has a property named “priority”. I want to set up a network of many-to-many A-wants-B relationships with associated priorities and I want to be able to select a given B and get back an ordered list of A-wants in priority order with the priorities visible.

So using Spring Data Neo4j (SDN, for short), that gives me 2 NodeEntity classes (A and B) and a RelationshipEntity (“Wants”). I’m using GraphRepository extensions to manage persistence for A and B, which gives rise to the first question:

Q: What’s the best way to handle persistence of RelationshipEntities?

There’s a RelationshipGraphRepository, but unlike GraphRepository, it’s a class, not an interface and there is absolutely zero documentation I can find on how to use it or even if I should be explicitly using it. I therefore used ordinary GraphNode for the relationshipEntity. It seems to work. More or less.

And the first observation:

When they say “@Fetch”, they mean “@Fetch”

In JPA, a recommended way of persisting an object is in the form “a = repository.save(a);” This is because JPA may construct new new “a” using the original “a” as a basis, but carrying (visibly or not) extra data and/or meta-data provided by the “save” operation.

This is also the (explicitly) recommended practice for SDN. But there’s a big “gotcha”.

When you persist with JPA, the returned object will contain AT LEAST as much data as the original copy did (it may even be the original copy). When you persist with SDN, the returned object may not. What SDN does is save the data, then do a basic lazy fetch resulting in a new object instance. Which means that unless you annotate your complex properties with “@Fetch”, you’ll end up with a nasty surprise. Unlike JPA, which returns either the original value or an unresolved proxy object, SDN returns null for unresolved lazy fetches. Which is probably a recipe for lost data and is definitely a recipe for confusion (remember, this is things I learned the hard way!)

Ouch!

You can code @Query on both Repository and EntityNode classes

The manual doesn’t mention this. It’s a useful thing to know, although since in EntityNode classes, you often want to base your query relative to the current EntityNode instance, I also spent a lot of time coding things like “{self}” and “{this}” without success. I finally found out that using the clause START node=({self}) does the trick, although since “START” is supposed to be deprecated, there’s likely something that can do the same thing with a simple MATCH. That’s something to tackle another day.

What you get back isn’t what you think it is

Cypher isn’t as intuitively obvious to me as some people seem to think it should be. It’s a form of complex mathematical notation and while the docs on it are fairly illustrative, they are perhaps less complete or explanatory as they might be. So I spent a lot of time trying weird expressions. Here’s what worked as a member query on my “B” node:

@RelatedToVia
List wantsList = new ArrayList(5);

@Query("START b=node({self}) MATCH (b)-[r:WANTS]-(a:A) return r ORDER BY r.priority")
public List getWants() {
    return wantsList;
}

A lot of the earlier attempts returned false “Wants” objects whose nodeIds were actually the IDs of the “A” objects on the other end of the relationship.

Incidentally, Neo4j normally wants relationships to be unordered and will complain if you attempt to use an ordered collection (such as List) to hold them. However it is smart enough to realize that when you do an “ordered by” query it’s OK to define the collection using an ordering collection object.

But that’s not all!

The “Wants” list that this particular query returned is not directly usable. The nodeIds are valid, but all the other property values were blank. Not merely the lazy-fetch values, but even the primitive property values. So to be actually usable I have to pull the nodeId from the returns pseudo-Wants and use the repository to lookup the actual Wants node.

So I’ve solved what should have been a relatively simple problem – even though it’s an ugly solution – and only lost about 2 days doing so.

OpenJPA/Spring/Tomcat6

Oh, what a tangled web we weave…

In theory, using JPA and Spring is supposed to make magical things happen that will make me more productive and allow me to accomplish wonderful things.

Someday. At the moment, I gain tons of productivity only to waste it when deployment time comes and I have to fight the variations in servers.

JPA allows coding apps using POJOs for data objects. You can then designate their persistence via external XML files or using Java Annotations. The Spring Framework handles a lot of the “grunt” work in terms of abstract connection to the data source, error handling and so forth.

But that, alas, is just the beginning.

First and foremost, I had to build and run using Java 1.5. OpenJPA 1.2 doesn’t support Java 6.

Tomcat is not a full J2EE stack. To serve up JPA in Tomcat requires a JPA service – I used Hibernate-entitymanager.

JPA requires a little help. Specifically, I used the InstrumentationLoadTimeWeaver to provide the services needed to process the annotations.

The weaver itself requires help. And to enable the weaver in Tomcat, I needed the spring-agent.

To the Tomcat6 lib directory I added:

  • spring-tomcat-weaver jar
  • spring-agent jar

But that’s not enough! The agent won’t turn itself on automatically. So I need to add a “-javaagent” to Tomcat’s startup. The easiest way to do that was to create a CATALINA_BASE/bin/setenv.sh file:

#!/bin/sh
CATALINA_BASE=/usr/local/apache-tomcat-6.0.18
JAVA_OPTS=”-javaagent:$CATALINA_BASE/lib/spring-agent-2.5.4.jar”

Tomcat 5

I think this all works more or less the same in Tomcat5, except that there are 3 library directories instead of the one library that Tomcat6 uses so the location for the spring suport jars is different. common/lib seems to work, although I’m not sure it’s the best choice.

That’s half the battle. Next up: JSF/RichFaces – and Maven

OpenJPA, Tomcat6, Spring and Hibernate

Getting all of the above to play together can be tricky. I knocked myself offline for nearly 2 days after the latest upgrade.

Broken Tomcat/Eclipse

The sysdeo Tomcat plugin no longer works when my testapp is installed – the log says the WebAppClassLoader (todo: check name) cannot be found. Something’s messed up the core Tomcat classpath. Fortunately the stand-alone remote debug option still works.

Annotations and Weaving

The test app is using Java 5 annotations on the persistent classes. Persistent classes are “magic”. Before they can actually be used, they have to be re-written to add the additional properties that aid the persistency manager. In the Old Days, that would have meant using some sort of utility that added additional source text to the persistent classes before they were compiled. In our new, more transparent era, this means that after the java source is compiled into classes, the extra embellishments are added to the binary class definition.

There’s 2 ways to do that:

  1. Permanently modify the class binary file
  2. Piggyback the embellishments onto the class as it’s being loaded into the JVM.

When I used JDO – and in early OpenJPA, option 1 was used courtesy of a utility known as the enhancer.

Since about October 2007 (give or take), OpenJPA has had the ability to do on-the-fly enhancement, which is option 2 or some facsimile thereof. But there’s a catch.

In order to