Monday, March 23, 2009

Five tips for successfully deploying Maven

Maven is one of those things that people seem to hate rather intensely, but nevertheless adoption is steadily rising in the Java community. I've worked with Maven almost daily since the 1.0 betas, and here are five things that I think could help your team work more efficiently with it.

  1. Use a repository manager

    A repository manager is basically an on-demand mirroring repository cache that you set up inside your IT infrastructure and use as the primary repository for your builds. It works like this: if you build a project that depends on, for example, commons-lang-2.4.jar, the repository manager will download the artifact from the main Maven repository on the web, cache it locally and return it to the build client that asked for it. All subsequent builds that use the same managed repository will get the commons-lang jar delivered from the cache, not from the web.
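
    As a rough sketch, pointing builds at the repository manager usually comes down to a mirror entry in each developer's ~/.m2/settings.xml. The id and URL below are placeholders made up for illustration, not values from any real installation; substitute whatever your own repository manager exposes.

    ```xml
    <settings>
      <mirrors>
        <mirror>
          <!-- hypothetical id and URL: replace with your own repository manager -->
          <id>internal-repo</id>
          <!-- "*" routes requests for every repository through the mirror -->
          <mirrorOf>*</mirrorOf>
          <url>http://repo.example.com/nexus/content/groups/public</url>
        </mirror>
      </mirrors>
    </settings>
    ```

    With something like this in place, the build clients never talk to the public repositories directly; the repository manager does that on their behalf.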

    This has many advantages. First of all, it's fast. All project members, except the first one, will download any given dependency at LAN speed, which is especially nice when you're setting up a build environment from scratch (new project member, staging a clean build, etc). And of course it saves external bandwidth, which frees it up for other purposes and lowers costs.

    Second, it's safer. It allows you to run centralized and incremental backups on all external dependencies that your projects use, and you reduce your dependency on the availability of public repositories.

    Third, it's convenient. From time to time you will need a library that's not (yet) available in any public repository, so you have to publish it somewhere. A repository manager makes that really easy. And if you're sharing internal libraries or interfaces between projects, it's extremely handy to deploy to the managed repository. You can even set up your continuous integration build to automatically deploy snapshots.
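
    To make the deployment part concrete, here is a minimal sketch of the distributionManagement section that tells Maven where "mvn deploy" should publish. The ids and URLs are hypothetical; the actual paths depend on how your repository manager is laid out.

    ```xml
    <distributionManagement>
      <!-- hypothetical ids and URLs, shown only to illustrate the structure -->
      <repository>
        <id>internal-releases</id>
        <url>http://repo.example.com/nexus/content/repositories/releases</url>
      </repository>
      <snapshotRepository>
        <id>internal-snapshots</id>
        <url>http://repo.example.com/nexus/content/repositories/snapshots</url>
      </snapshotRepository>
    </distributionManagement>
    ```

    Credentials, if the repository requires them, go into server entries in settings.xml whose ids match the ones used here.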

    I've had a pleasant experience working with Nexus, but there are others. A repository manager should be as natural a part of your infrastructure as SCM and CI if you're using Maven.

  2. Specify plugin versions

    By default, Maven will automatically download a new version of any given plugin whenever there is one available. Given that Maven is 99% made up of plugins (there's even a plugin plugin!), this is a potential point of breakage over time and in my opinion a design mistake.

    As of version 2.0.9, the default behaviour is improved by locking down the versions of the core plugins (where "core" is defined by this list). However, you still need to explicitly define versions for all non-core plugins, and that can be done in the top-level pom.xml of a hierarchical project using the pluginManagement section.

    <pluginManagement>
      <plugins>
        <plugin>
          <artifactId>maven-assembly-plugin</artifactId>
          <version>2.2-beta-2</version>
        </plugin>
        <plugin>
          <artifactId>maven-antrun-plugin</artifactId>
          <version>1.2</version>
        </plugin>
      </plugins>
    </pluginManagement>

    Do this for the plugins that you actually use. Note that for plugins with group id org.apache.maven.plugins, you can omit the groupId element.

    This will make your builds more stable and eliminate a fairly rare but very annoying and confusing set of problems.
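
    One detail worth spelling out: pluginManagement only fixes the version (and any default configuration); it doesn't by itself add a plugin to the build. A module that actually wants the plugin still lists it under build/plugins, and can leave the version out. A minimal sketch for a child pom.xml, reusing the assembly plugin from the example above:

    ```xml
    <build>
      <plugins>
        <plugin>
          <!-- no version here: it is inherited from the parent's pluginManagement -->
          <artifactId>maven-assembly-plugin</artifactId>
        </plugin>
      </plugins>
    </build>
    ```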

  3. Learn how to use the dependency plugin

    Maven introduced the concept of transitive dependencies to the Java community, and they have been a source of confusion ever since. The dependency plugin is an invaluable tool for analyzing the results of the dependency algorithm, and for handling dependencies in various ways. Here are a couple of things you can do with it:

    • dependency:tree
      shows (you guessed it) the dependency tree for the project: which dependencies are being pulled in and why. It's a nice overview and can help you tweak the dependency structure by excluding artifacts, overriding versions and so on. Example output:

      [INFO] +- org.apache.activemq:activemq-core:jar:5.2.0:compile
      [INFO] | +- org.apache.camel:camel-core:jar:1.5.0:compile
      [INFO] | +- org.apache.geronimo.specs:geronimo-jms_1.1_spec:jar:1.1.1:compile
      [INFO] | +- org.apache.activemq:activeio-core:jar:3.1.0:compile
      [INFO] | | \- backport-util-concurrent:backport-util-concurrent:jar:2.1:compile
      [INFO] | \- org.apache.geronimo.specs:geronimo-j2ee-management_1.0_spec:jar:1.0:compile

    • dependency:go-offline
      Downloads all project dependencies and plugins, transitively. It's a good command to run if you want to work offline for a while, or if you want to fetch as many of the external dependencies as possible in a single shot, with no manual intervention, while you go grab a cup of coffee and/or read another item in Effective Java ;-)

    • dependency:copy
      dependency:copy-dependencies
      If you ever need to handle artifacts as files, copying all or some of them to a custom location for whatever reason, this is a good approach.

    There are many more things you can do with it, and mastering it will help you get on top of the transitive dependency situation.
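
    As an illustration of the kind of tweaking that dependency:tree makes possible, the fragments below sketch an exclusion and a version override, using artifacts from the example output above. Whether camel-core can really be dropped depends on which parts of ActiveMQ you use, and the overriding version number is only an example, so treat both as assumptions rather than recommendations.

    ```xml
    <!-- in the dependencies section: keep activemq-core, but stop it from pulling in camel-core -->
    <dependency>
      <groupId>org.apache.activemq</groupId>
      <artifactId>activemq-core</artifactId>
      <version>5.2.0</version>
      <exclusions>
        <exclusion>
          <groupId>org.apache.camel</groupId>
          <artifactId>camel-core</artifactId>
        </exclusion>
      </exclusions>
    </dependency>

    <!-- directly under project: pin the version of a transitive dependency -->
    <dependencyManagement>
      <dependencies>
        <dependency>
          <groupId>backport-util-concurrent</groupId>
          <artifactId>backport-util-concurrent</artifactId>
          <!-- illustrative version only -->
          <version>3.1</version>
        </dependency>
      </dependencies>
    </dependencyManagement>
    ```

    Running dependency:tree again after a change like this is a quick way to check that it had the intended effect.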

  4. Use the documentation

    Well, duh. But a weak point of Maven in the eyes of many people is the lack of documentation and the sometimes poorly organized information. There are a few good points of reference though, that you can spread around your team by setting up links on the Wiki for example:

    • The Definitive Guide to Maven: a free book from Sonatype, available both as HTML and PDF. Good for the beginner, and sometimes as a reference. If you don't know where to start, start here.

    • The plugin list: a comprehensive list of the official plugins, with links to each project page and JIRA subsection. Most of the core functionality is actually performed by one of these plugins, and you can learn a lot by studying things like the resources plugin documentation.

    • The POM reference: for the slightly more advanced user. Every element in the POM is explained. Don't forget to specify the XSD information in your POM file to get the most help from your XML editor.
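
    For reference, the XSD information mentioned in the last item goes on the root project element. The sketch below shows the usual declaration; the groupId, artifactId and version are made-up placeholders.

    ```xml
    <project xmlns="http://maven.apache.org/POM/4.0.0"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
      <modelVersion>4.0.0</modelVersion>
      <!-- hypothetical coordinates, for illustration only -->
      <groupId>com.example</groupId>
      <artifactId>my-app</artifactId>
      <version>1.0-SNAPSHOT</version>
    </project>
    ```

    With the schema declared, most XML editors can offer element completion and validation for the POM.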

  5. Understand the conventions

    Maven is a conventions-based tool, relieving you from scripting common tasks like compiling source code, running tests or packaging a web application into a war file. Learning the conventions (directory structure, build phases) and working with them rather than against them will make your life easier a lot of the time.

    There are definitely situations, even in moderately sized projects, where you need to customize the build, and Maven can be quite cumbersome to work with when you need to break the conventions. But if you understand the conventions and keep in mind that there is a good chance that what you're trying to do can be accomplished within them, you might find a different approach than you otherwise would have.

    Perhaps that ugly jar-splitting, file-copying, token-replacing antrun hack that you spent an agonizing week writing could be replaced by extracting part of the project into a separate module and including it as a dependency instead? It's a lot easier to swim downstream than upstream.
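
    To give an idea of what extracting a module might look like, here is a minimal sketch with two hypothetical modules, shared-core and webapp, under a parent project with pom packaging. All names and coordinates are invented for the example.

    ```xml
    <!-- parent pom.xml (packaging "pom"): lists the modules to build -->
    <modules>
      <module>shared-core</module>
      <module>webapp</module>
    </modules>

    <!-- webapp/pom.xml: depends on the extracted module like on any other artifact -->
    <dependency>
      <groupId>com.example</groupId>
      <artifactId>shared-core</artifactId>
      <version>${project.version}</version>
    </dependency>
    ```

    Maven works out the build order from the dependency, so shared-core is built before webapp without any scripting.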

Maven is not perfect by any means, but it has brought standardization and conventions to the world of Java development: project structure, directory structure, public metadata and artifact repository publishing, to name a few. There are lots of plugins available, both central and third-party ones, and most IDEs and continuous integration servers support Maven very well.

A lot of the standardization may even outlive the Maven tool itself, as demonstrated by two newer build systems for Java: Buildr and Gradle. They both use many of the same conventions, and could challenge Maven by perhaps being able to scale down in complexity more easily and by having a lower threshold for newcomers. Progress on Maven slowed down a bit after 2.0, but recently the 2.1.0 version was released with a number of important improvements, for example parallel resolution of dependencies.

6 comments:

Dave said...

Tip 0: Clearly define the targets to use and what they do.

For someone new to maven, since there is no way to ask maven for the list of available targets (and if there were, the list would be unusably long), it is a great help to document things like "how to run integration tests", "how to run one test", "how to compile only", etc. Determining these things is a baffling ordeal.

Once you've determined them, understanding what they actually DO is very difficult; whoever is configuring maven can do themselves and their team a huge favor by documenting what each commonly-needed target actually does.

Peter Backlund said...

While I agree that it would be useful to have a command that lists the targets, I think that a strong point of having a standardized build lifecycle is precisely that you only need to learn these things once. Any project that uses Maven has the same targets/build phases.

Combining points 4 and 5, you would probably find this lifecycle reference useful:

http://www.sonatype.com/books/maven-book/reference/lifecycle.html#tbl-default-lifecycle

There's also this page on the main Maven web page, but it's a bit hard to find:

http://maven.apache.org/guides/introduction/introduction-to-the-lifecycle.html

Anders said...

Regarding your first tip, using a repo manager, I agree on the benefits you list. However, I think that one of the most important things (if not the most important) is that you enable a central point where you can do stuff like:
* check license conditions
* do security checks (pgp for instance)
* block specific artifacts, or (maybe more likely) just allow specific ones

Most of these can be done by the developers/projects themselves. But centralizing things guarantees that it is being done. Also, a centralized point enables future changes/improvements without every project having to update.

Anders said...

Dave: I think that the great thing about Maven is that it standardizes things about building. When having ant scripts, I agree that documenting "how to build", "how to run int test", etc is a must. However, for Maven, I don't think that every project should do this. It's a waste of time. The Maven book by Sonatype that Peter linked to, is great and explains everything. The sweet thing is that once you know it you don't have to re-learn for your next project. :-)

Chris W. Hansen said...

Great, concise list. I wish we had it a year ago when my employer first started the transition to Maven. Our story seems to be a common one: initially Maven was a pain, but we eventually grew to tolerate it. Now, it mostly stays out of the way and, among other benefits, our deployment process is vastly improved because of it.

@Anders
Good points about repo managers.

Marco said...

Thanks for tip #1 - the repository manager. Didn't even know that such a kind of manager exists, but missed it all the time. :)