
PropJoe


I have long had this irritating problem with .properties files. It goes something like this:

  • I like to use .properties files for configuration unless there is a compelling reason to use something more complicated.
  • I'm pretty uptight about using constants in my code. This includes using constants to refer to the names of properties in a .properties file.
  • Over time, the constants and the .properties file can drift apart - you have to keep them both in sync.
  • Similarly, properly documenting them both becomes painful.

So, I proceeded to finally do something about it. I spent about 15 minutes writing the code and 45 minutes writing the documentation, go figure why. Maybe it's because The Wire is over and I think I just needed to do something to say goodbye. At any rate, I have found it has made my life just a little bit easier.

With that, I give you, PropJoe.

Download propjoe-0.1.0.zip


Package Description for PropJoe:

Provides a javadoc doclet for generating a .properties file from annotated Java constants. This simplifies the task of maintaining consistency and documentation between the .properties file and Java code which references values in it.

Motivation

Properties files are ubiquitous. They're easy to use and everyone understands them. They are the way to go for relatively simple application configuration or to externalize program constants that might occasionally have to be changed in the field.

If your app has a .properties file with more than a couple of properties in it, you are hopefully using constants in your code to refer to the names of the properties. So, for example, you end up with a class that holds a bunch of constants that you pass to getProperty() when you need to get a property. Something like this:

public interface PropertyNames {

  public String LISTEN_PORT = "listen.port";

  public String BIND_ADDRESS = "bind.address";

  public String FACTORY_CLASS = "factory.class";

  //...
}

And then of course you end up with a .properties file like this that you need to ship with your product.


#
# acme product.properties
#
listen.port  = 7001
bind.address = 127.0.0.1
factory.class = com.acme.AcmeFactoryImpl 
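
For completeness, here is a minimal sketch of the code that ties the two together. The ConfigDemo class and its load() helper are made up for illustration, and the file contents are inlined as a string only so that the sketch runs on its own; a real app would load the shipped file from disk or the classpath.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

interface PropertyNames {
    String LISTEN_PORT = "listen.port";
    String BIND_ADDRESS = "bind.address";
    String FACTORY_CLASS = "factory.class";
}

final class ConfigDemo {
    // Stand-in for the shipped .properties file (assumption: inlined for the demo).
    static final String FILE_CONTENTS =
            "listen.port  = 7001\n"
          + "bind.address = 127.0.0.1\n"
          + "factory.class = com.acme.AcmeFactoryImpl\n";

    /** Parses .properties syntax from a string. */
    static Properties load(String contents) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(contents));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        Properties props = load(FILE_CONTENTS);
        // Look values up through the constants, never through string literals.
        int port = Integer.parseInt(props.getProperty(PropertyNames.LISTEN_PORT));
        String address = props.getProperty(PropertyNames.BIND_ADDRESS);
        System.out.println(address + ":" + port); // prints 127.0.0.1:7001
    }
}
```

One thing to watch for: Properties.load() keeps trailing whitespace in values, so a stray space after a class name in the file ends up in the string you get back.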

This is all well and good, and hopefully we can just assume that all of this is accepted good practice. However, there are a few problems here:

Keeping the interface and the .properties file in sync as the number of properties grows is problematic.

Also, documentation is similarly problematic. We can document the Java interface, which our developers will like. We can document the properties file, which our users will like. Or we can try to document them both and slowly go insane as the number of properties increases.
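
To show what drift looks like in practice, here is a small sketch that reports constants with no matching key in the file. It has nothing to do with PropJoe itself; DriftCheck and missingKeys() are hypothetical names, and the renamed key in main() is an invented scenario.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.Properties;
import java.util.Set;
import java.util.TreeSet;

interface PropertyNames {
    String LISTEN_PORT = "listen.port";
    String BIND_ADDRESS = "bind.address";
    String FACTORY_CLASS = "factory.class";
}

final class DriftCheck {
    /** Returns the property names declared as static String constants on holder but absent from props. */
    static Set<String> missingKeys(Class<?> holder, Properties props) {
        Set<String> missing = new TreeSet<>();
        for (Field field : holder.getFields()) {
            if (field.getType() != String.class || !Modifier.isStatic(field.getModifiers())) {
                continue;
            }
            try {
                String key = (String) field.get(null); // static field, so no instance is needed
                if (props.getProperty(key) == null) {
                    missing.add(key);
                }
            } catch (IllegalAccessException e) {
                throw new IllegalStateException(e);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("listen.port", "7001");
        props.setProperty("bind.address", "127.0.0.1");
        // Invented scenario: someone renamed the key in the file but not in the code.
        props.setProperty("factory.classname", "com.acme.AcmeFactoryImpl");
        System.out.println(missingKeys(PropertyNames.class, props)); // prints [factory.class]
    }
}
```

A check like this catches the mismatch after the fact, but it does nothing for the documentation problem, and you still have two files to maintain.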

PropJoe

This is a basic single-sourcing problem, and it is this problem that PropJoe solves in two easy steps:

  • Apply PropJoe's @Property annotations to the constants on the Java interface
  • Run the PropJoe javadoc doclet to generate a .properties file

To modify our example above, we'd have something like this on the Java side:

public interface PropertyNames {

  /**
   * The port that the server will listen on
   */
  @Divider("Acme Product Properties")
  @Property("7001")
  public String LISTEN_PORT = "listen.port";

  /**
   * The address the server will bind to
   */
  @Property("127.0.0.1")
  public String BIND_ADDRESS = "bind.address";

  /**
   * Fully-qualified class name of the special acme factory to instantiate.
   */
  @Property("com.acme.AcmeFactoryImpl")
  public String FACTORY_CLASS = "factory.class";

  //...
}

Which we could run through the PropJoeDoclet to get an output .properties file like this:

##############################################################
# Acme Product Properties

# The port that the server will listen on
listen.port  = 7001

# The address the server will bind to
bind.address = 127.0.0.1

# Fully-qualified class name of the special acme factory to instantiate
factory.class = com.acme.AcmeFactoryImpl

Now, there is no more synchronization problem between the two files, and the .properties file is nicely documented to boot. All we have to do is worry about a single Java file.
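
To make the single-sourcing idea concrete, here is a minimal sketch of the generation step. This is not PropJoe's implementation: PropJoe runs as a javadoc doclet and picks up the javadoc comments, whereas this sketch uses plain reflection with a hypothetical runtime @Property annotation whose doc() element stands in for the javadoc text. PropertiesSketch, doc() and render() are all names invented for illustration.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.Map;
import java.util.TreeMap;

final class PropertiesSketch {
    /** Hypothetical stand-in for PropJoe's @Property; value() is the default value. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    @interface Property {
        String value();
        String doc() default ""; // assumption: replaces the javadoc comment a doclet would read
    }

    interface PropertyNames {
        @Property(value = "7001", doc = "The port that the server will listen on")
        String LISTEN_PORT = "listen.port";

        @Property(value = "127.0.0.1", doc = "The address the server will bind to")
        String BIND_ADDRESS = "bind.address";
    }

    /** Renders one commented "key = value" entry per annotated constant, sorted by key. */
    static String render(Class<?> holder) {
        Map<String, String> entries = new TreeMap<>();
        for (Field field : holder.getFields()) {
            Property property = field.getAnnotation(Property.class);
            if (property == null || field.getType() != String.class
                    || !Modifier.isStatic(field.getModifiers())) {
                continue;
            }
            try {
                String key = (String) field.get(null);
                String comment = property.doc().isEmpty() ? "" : "# " + property.doc() + "\n";
                entries.put(key, comment + key + " = " + property.value() + "\n");
            } catch (IllegalAccessException e) {
                throw new IllegalStateException(e);
            }
        }
        return String.join("\n", entries.values());
    }

    public static void main(String[] args) {
        System.out.print(render(PropertyNames.class));
    }
}
```

The sketch sorts entries by key because reflection does not promise to return fields in declaration order. A doclet works from the source, so it does not need this workaround.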

Running the Doclet

Run PropJoe like you would any other doclet. The only thing special about it is that it requires one doclet parameter, -f, to specify the path of the output properties file. Note that if the file already exists, the output will be appended to the file.

Also note that you can scan as many input Java files as desired for Property annotations - if you have those constants sprinkled all over your codebase, that's fine - just feed all of the source to javadoc and PropJoe will find them.

For example, from the command line you would simply run

  javadoc -doclet net.pcal.propjoe.PropJoeDoclet ... -f my/output/dir/product.properties

or in Ant using the javadoc ant task:

    <javadoc packagenames='com.acme.product.*'
             docletpathref='path.to.propjoe.jar'
             failonerror='true'>
      <fileset dir='bootstrap/src'>
        <include name='**/PropertyNames.java'/>
      </fileset>
      <doclet name='net.pcal.propjoe.PropJoeDoclet'>
        <param name='-f' value='my/output/dir/product.properties'/>
      </doclet>
    </javadoc>

I've been using Ant as a scripting language for assembling and deploying the new CFOsoft.com website. A few embarrassing typos later, I realized it would be nice to also be able to spellcheck the site in the same process.

A quick Google search turned up Rob Mayhew's AntSpell task. It works as advertised, no muss, no fuss. 10 minutes later, I had it integrated into my build with a custom dictionary. Thanks Rob!

Scriptella ETL


My current project involves a lot of transforming and migrating database data, so some kind of ETL framework is called for. I spent some time researching the various Java ETL offerings and surprisingly found there is not a whole lot out there.

Fortunately, I did find one which is meeting my needs in a very elegant way: Scriptella

Their website says

"Our primary focus is simplicity. You don't have to study yet another complex XML-based language"

I second that emotion; the last thing I need to do is deal with another misguided attempt to stuff procedural logic into XML.

At first glance, though, Scriptella looks like nothing if not "yet another complex XML-based language":

<etl>
    <connection driver="$driver" url="$url" user="$user" 
                               password="$password"/>
    <script>
        <include href="PATH_TO_YOUR_SCRIPT.sql"/>
        -- And/or directly insert SQL statements here
    </script>
</etl>

Do not be fooled. That is just a shell, and it's about all of the XML you will have to write.

The Rules

For the impatient, here is a sketch of a spec for how you write a Scriptella script:

  • You use XML to build a simple skeleton for nesting chunks of declarative languages (typically SQL but not always)
  • Outer chunks are responsible for generating a result set (essentially a list of hashmaps)
  • Inner chunks are processed once for each row in the result set generated by their enclosing chunk
  • Inner chunks do something useful with each row (such as perform an INSERT). They can refer to members of the current result set row by name.
  • Chunks can be nested to any depth (i.e., inner chunks can also be outer chunks)
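
If it helps, the rules above can be modeled in a few lines of plain Java. This is only a mental model, not how Scriptella is implemented: ChunkModel, forEachRow() and row() are hypothetical names, the data is invented, and a map lookup stands in for an inner query that would normally be parameterized by the outer row.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

final class ChunkModel {
    /** Runs the inner chunk once per row of the result set produced by the outer chunk. */
    static void forEachRow(List<Map<String, Object>> resultSet,
                           Consumer<Map<String, Object>> innerChunk) {
        for (Map<String, Object> row : resultSet) {
            innerChunk.accept(row);
        }
    }

    /** Builds a one-column row; a real result set row would have many columns. */
    static Map<String, Object> row(String column, Object value) {
        return Collections.singletonMap(column, value);
    }

    static List<String> demo() {
        // Outermost chunk: a result set of departments.
        List<Map<String, Object>> departments =
                Arrays.asList(row("DEPT", "Sales"), row("DEPT", "Ops"));

        // Stand-in for a nested query that selects employees for the current department.
        Map<String, List<Map<String, Object>>> employeesByDept = new HashMap<>();
        employeesByDept.put("Sales", Arrays.asList(row("FIRST", "Ann"), row("FIRST", "Bob")));
        employeesByDept.put("Ops", Arrays.asList(row("FIRST", "Cy")));

        List<String> output = new ArrayList<>();
        // The middle chunk is both an inner chunk (of departments) and an outer chunk (of the add).
        forEachRow(departments, dept ->
                forEachRow(employeesByDept.get(dept.get("DEPT")), emp ->
                        output.add(dept.get("DEPT") + ": " + emp.get("FIRST"))));
        return output;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [Sales: Ann, Sales: Bob, Ops: Cy]
    }
}
```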

A Simple Example

That may sound a little confusing, but it's actually quite elegant and powerful. A simple example goes a long way.

Say you have an Employee table in Database A that you want to copy into Database B, row by row. In Scriptella, it would look something like this:

...
<query connection-id='DatabaseA'>
  SELECT ID, FIRST, LAST FROM EMP
  <script connection-id='DatabaseB'>
    INSERT INTO EMPCOPY VALUES (?ID, ?FIRST, ?LAST)
  </script>
</query>

When processed, this Scriptella script would select each row out of EMP in Database A and insert it into EMPCOPY in Database B. The nested 'script' element gets processed once for each row, and the '?' parameters get filled in with values from that row.

It's as if you wrote Java code that opened two JDBC connections, executed a query on one, iterated through the result set, and dumped each row into a prepared INSERT statement opened on the other connection.
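
For comparison, here is roughly what that hand-written Java might look like. It is a sketch under assumptions: the EmployeeCopy class and copy() method are invented names, the caller is expected to supply two open JDBC connections, and getObject()/setObject() are used because the column types are not stated anywhere above.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

final class EmployeeCopy {
    /**
     * Copies every row of EMP on the source connection into EMPCOPY on the
     * target connection and returns the number of rows inserted.
     */
    static int copy(Connection source, Connection target) throws SQLException {
        int copied = 0;
        try (Statement select = source.createStatement();
             ResultSet rows = select.executeQuery("SELECT ID, FIRST, LAST FROM EMP");
             PreparedStatement insert =
                     target.prepareStatement("INSERT INTO EMPCOPY VALUES (?, ?, ?)")) {
            // Equivalent of the nested 'script' element: run the INSERT once per row.
            while (rows.next()) {
                insert.setObject(1, rows.getObject("ID"));
                insert.setObject(2, rows.getObject("FIRST"));
                insert.setObject(3, rows.getObject("LAST"));
                copied += insert.executeUpdate();
            }
        }
        return copied;
    }
}
```

That is around twenty lines of plumbing, before you add connection setup, to do what the six-line Scriptella fragment expresses directly.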

At this point, you can knock yourself out - write arbitrarily complex SELECT statements, create and drop tables as needed, whatever. You can also use this as a way to write queries that generate queries - a more dialect-neutral alternative to the various flavors of dynamic SQL. You have the full power of SQL at your disposal and Scriptella won't get in the way unless you ask it to.

Beyond SQL

Things start to get really interesting when you consider that Scriptella is designed to accommodate any language, not just SQL. For example, something like this can be used to dump data from a table to System.out:

...
<query connection-id='DatabaseA'>
  SELECT ID, FIRST, LAST FROM EMP
  <script connection-id='janino-java-connection'>
    String employee = get("FIRST");
    System.out.print("Found an employee named "+employee);
  </script>
</query>

Any language that you have a JSR-223 driver for can be used. Scripts can generate data for processing by enclosed chunks as well as process data from enclosing chunks.

Wrapping Up

There are a lot more (and better) examples on the Scriptella website. The short of it, though, is that Scriptella gives you a simple, elegant, declarative mechanism for:

  • Moving data between databases
  • Writing dynamic queries
  • Sprinkling in procedural logic when you need it

Most importantly, it does all of this with as little intrusion as possible. Most of the time, I want to do the heavy lifting with SQL, and Scriptella doesn't get in the way of that. But when I need to add some procedural logic or do something trickier, Scriptella is there.