Reader meet author: 2009

Friday, October 30, 2009

Putting the domain at the center

I saw a segment on Aktuellt the other day about medication overdoses at Astrid Lindgrens Barnsjukhus (Astrid Lindgren Children's Hospital), in which a pediatrician described a computer program used at the hospital:

- Very often, as a doctor, I think in terms of how many milligrams the patient should have. But in the medical records system I have to prescribe in volume or number of tablets. So I first have to think about what the patient should ultimately get, and then go back and work out how the patient is going to get it. My wish would be to enter milligrams per kilo directly, and for the patient's weight to be available in the system.

On one occasion, milligrams and milliliters had been mixed up when dosing a painkiller, which resulted in a dose ten times too high. This could have had extremely serious consequences.

Admittedly I have no insight into the details of how this records system was developed, but it is hardly a bold guess that nobody worked together with an expert on how medication is dosed when the system was built. This is a painfully clear example of a situation where the ideas and principles of domain-driven design are crucial.

DDD is fundamentally about going back and asking why we are building software at all. What will it be used for, and which process is it supposed to support? We take the domain, the area of business, as our starting point and let it permeate both the work and the product.

As programmers we are hopefully experts at developing software. We know everything about polymorphism, hash keys and the difference between inner and outer joins. We are, however, very rarely experts on the business where our software will be used, which in the case above could perhaps be described as medical record keeping, or simply medicine.

So there is a knowledge gap, and we stand to gain a great deal if we can bridge it effectively. The result, I would argue, is software that works better and is easier to change over time, because it is designed in a way that is intimately connected with how the domain works.

I'd like to show how a way of working inspired by DDD could have avoided this situation, using a fictional story of how the development might have gone.

Note that I made up the details of the dialogue below based on what I've read in news articles; it should not be taken as medical advice.

In the best of worlds

At some point, we can assume, the team arrives at a user story in the backlog that looks like this:

As a doctor, I want to be able to enter the amount of medication in a dose in the patient's record.

The team is already working with a rich domain model that captures and organizes the business logic and complexity relevant to the scope of the records system built so far, in a well-isolated and testable layer of the application. They have also established a regular collaboration with Doctor Andersson, where they learn how doctors work with patient records and verify that certain assumptions they have made are correct.

Now, for the first time, they touch on the area of medication dosage, and bring the subject up for discussion with Doctor Andersson.

Engaged Developer: - In this sprint we're going to build some features for medication dosage, so we need to know a bit about how that works.

Doctor Andersson: - Well, you usually decide on a dose based on the patient's condition, of course, and it is entered into the record. The medication itself is administered by the nurses on the ward.

ED: - What might such a... note look like? Do you call it a note?

DA: - A medication order, an "ordination" as it's called, could be for example "Zeffix, IV, 20 mg/4h". That means the drug Zeffix, intravenously, every four hours.

ED: - OK. What does the actual decision process look like, for a doctor I mean, once the diagnosis has been made?

DA: - We have a system called FASS, or rather it's really a catalog of basically all approved drugs, where you can search and browse among drugs based on what you need to treat and so on. There are categories and subcategories with ATC codes. That's the kind of thing you systems analysts tend to be interested in.

ED: - We're not systems anal... ahem, OK. Does every drug have its own unique ATC code?

DA: - No, there can be a few similar alternatives for a given ATC code. Sometimes there is only one.

ED: - What does FASS say about each drug that is relevant for the order, then?

DA: - A great deal. Indications, side effects, various types of risk factors and recommended dose.

ED: - Hmm... in this Zeffix example it says 20 mg, but how do you know it should be exactly that much? Surely FASS should say something about the concentration?

DA: - Sure, that's right. You always prescribe the active substance in milligrams per kilo of body weight, so that dose is individual and depends on the patient's weight. It also varies between children and adults, and sometimes for pregnant women and the like. One thing that would be an incredibly great help is some kind of automatic conversion from milligrams to the way the amount of the drug is given, that is, milliliters for liquid form, number of tablets if it comes in tablet form, and so on.

ED: - Mm, I see. Is there anything to be said about a maximum dose? Minimum? Can you underdose at all?

DA: - Oh yes, you can underdose. That brings a number of problems that we perhaps don't need to go through in detail, but it's not good in any case. FASS says nothing about maximum or minimum dose; that's up to the doctor to decide.

ED: - How much does it usually vary? How often do you give more than double the recommended dose, for example? Would it be desirable for the system to determine whether a dose is unusually high or low, and warn or block? How could that be determined, in that case?

DA: - Hmm... it's not at all unusual to dose differently from the recommendation, to some extent, but that depends on the drug and a few other things, of course. It would be good if you could at least see how the dose relates to the recommendation, as a percentage or similar.

And so the conversation continues. Meanwhile, the following has been sketched on the whiteboard:

[Image: whiteboard sketch of the model]

After the meeting with the domain expert, two people on the team sit down and pair-program a draft of what ordering Zeffix for a patient could look like, based on what they have just learned and the language used during the conversations.


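@Test
public void ordinationEnligtRekommendation() { // "order according to recommendation"
  // Look up the drug ("läkemedel") in the FASS catalog
  Läkemedel läkemedel = FASS.slåUpp("Zeffix");

  // A patient with a personal identity number and a body weight ("kroppsvikt") of 75 kg
  Personnummer personnummer = new Personnummer(new LocalDate(1950, 1, 2), 1234);
  Amount<Mass> kroppsvikt = valueOf(75, KILOGRAM);
  Patient patient = new Patient(personnummer, kroppsvikt);

  // The order: 20 mg per kg of body weight, 4 times per day ("ggr per dygn")
  Koncentration koncentration = Koncentration.mgPerKgKroppsvikt(20);
  Frekvens frekvens = Frekvens.ggrPerDygn(4);
  Ordination ordination = new Ordination(läkemedel, koncentration, frekvens);

  // The dose ("dos") that this order results in for this patient
  Dos dos = patient.dosVidOrdination(ordination);

  assertThat(dos.läkemedel(), is(läkemedel));

  // Expected amount per occasion: 300 ml, with an error of 0.001 ml
  Amount<Volume> förväntadMängd = valueOf(300, 0.001, MILLI(LITER));
  assertTrue(dos.mängdPerTillfälle().approximates(förväntadMängd));

  // Exactly the recommended dose: neither an overdose nor an underdose
  assertThat(dos.relativtRekommendation(), is(1.0));
  assertFalse(dos.överdosering());
  assertFalse(dos.underdosering());
}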

(The complete, runnable code is available on GitHub for anyone who is interested.)

Here you can clearly see how language, model and code are intimately connected, and the focus has been on really solving a problem in the domain in a way that matches the domain expert's understanding of how it works.
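The arithmetic that the test relies on can be sketched in plain Java, without the unit-aware Amount types. This is only an illustration: the class and method names are hypothetical, and the strength of 5 mg/ml is an assumption chosen so that the numbers match the 300 ml expected by the test. It is not taken from FASS and is not medical information.

```java
// Sketch only: plain doubles instead of unit-aware types, hypothetical names.
class DoseSketch {

    /** Amount of active substance in mg for an order given in mg per kg of body weight. */
    static double substanceMg(double mgPerKg, double bodyWeightKg) {
        return mgPerKg * bodyWeightKg;
    }

    /** Volume to administer in ml, given the strength of the preparation in mg/ml. */
    static double volumeMl(double substanceMg, double strengthMgPerMl) {
        if (strengthMgPerMl <= 0) {
            throw new IllegalArgumentException("strength must be positive");
        }
        return substanceMg / strengthMgPerMl;
    }

    /** Ordered dose relative to the recommended dose; 1.0 means exactly as recommended. */
    static double relativeToRecommendation(double orderedMgPerKg, double recommendedMgPerKg) {
        return orderedMgPerKg / recommendedMgPerKg;
    }

    public static void main(String[] args) {
        double weightKg = 75;            // same body weight as in the test
        double orderedMgPerKg = 20;      // same order as in the test
        double strengthMgPerMl = 5;      // ASSUMPTION: picked to reproduce the test's 300 ml
        double recommendedMgPerKg = 20;  // ASSUMPTION: the test expects a ratio of 1.0

        double mg = substanceMg(orderedMgPerKg, weightKg);
        double ml = volumeMl(mg, strengthMgPerMl);
        double ratio = relativeToRecommendation(orderedMgPerKg, recommendedMgPerKg);
        System.out.println(mg + " mg, " + ml + " ml, ratio " + ratio);
    }
}
```

The doctor thinks in milligrams per kilo, and the conversion to milliliters is a separate, explicit step. Keeping the two apart is what makes a mix-up between milligrams and milliliters harder to make.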

The team can now bring the thoughts and questions that came up during the programming phase to the next modeling meeting, for example how to handle drugs that come in tablet form, something the first version of the model cannot handle.

Which parts of this are DDD?

Domain-driven design is neither a framework nor a process; it is usually described, somewhat vaguely, as an approach to developing software, a set of principles, a philosophy. It consists of many small parts working together, and in the example above I have tried to illustrate some of the most important ones. Here they are in summary:

Put the domain at the center and talk to a domain expert

Find a person with long experience of the domain, who knows how the business works and why. Try to extract as much knowledge as possible from this person in order to make better decisions when designing the code.

Create a shared language

Identify important terms and concepts in the way the domain expert speaks. Make sure you agree on what they mean. Introduce new concepts if needed.

Build a model

Set up plausible scenarios. Point at the boxes on the whiteboard and explain out loud how the parts can be combined to solve the problem. If something feels awkward, think again and change the model. Experiment.

Represent the model in code

A model that doesn't work when implemented in code is essentially worthless, so start programming as soon as possible. Work test-driven by setting up scenarios in test classes. Use the shared language to name classes, methods and packages. Strive to make the code tell what it is doing in a way that a domain expert can understand.

Use the building blocks

DDD is to a large extent about harnessing the power of object orientation, and highlights a number of design patterns that help organize the code in the model and keep complexity down. Study and use building blocks such as entity, value object, repository and others.

Focus on what matters

Think about what is truly important and unique to this product, and focus on that. For all areas that are related to but not unique to this domain, use someone else's work. Things like time and date handling, or manipulation of units and quantities such as mass and volume, are problems that others have very likely already run into and built tools for.

Feed experiences from the code back

When knowledge and model are turned into concrete code, new challenges and questions arise. Use them to change and improve the model, and bring them to the next meeting with the domain expert to get support for making better design decisions.

Advertisement: Citerus likes DDD and we would love to share our knowledge. Take a look at citerus.se/ddd for more information.

Thursday, October 29, 2009

Geeky fact of the day, proven

A few weeks back, my colleague Patrik Fredriksson turned my attention to a "Geeky fact of the day" tweet by Josh Bloch by verifying Josh's claim for a small number of integers using Clojure. Having recently picked up a discrete mathematics book in an attempt to refresh my academic skills, I found an exercise that wanted you to prove exactly that, so I gave it a shot.

Clojure is great, no doubt about that, but mathematics also has its strong points: it's very stable, tool support is great and it scales tremendously well :-)

So, we want to prove that the sum of the first n cubes is equal to the square of the sum of the first n positive integers:


1³ + ... + n³ = (1 + ... + n)²

I'm going to use the principle of induction, which means that you start out with a concrete, simple case and show that the theorem holds for that. Then you assume that it's true for some arbitrary value k and show that the theorem then holds for k + 1. By virtue of a domino effect from your base case, the theorem is proven for all natural numbers.

Let's start with the induction basis:


1³ = 1 = 1²

So, it's obviously true for n = 1. Now for the induction hypothesis - assume that the theorem holds for n = k:


1³ + ... + k³ = (1 + ... + k)²

Let's take a look at the right hand side expression evaluated for k + 1:


  ((1 + ... + k) + (k + 1))² =
= (1 + ... + k)² + (k + 1)² + 2(k + 1)(1 + ... + k) =
= (1 + ... + k)² + (k + 1)((k + 1) + 2(1 + ... + k))

I'm using the familiar expansion of (a + b)² and then factoring out (k + 1) from the last two terms. Focusing for a moment on the second factor of the last term, ((k + 1) + 2(1 + ... + k)):


  ((k + 1) + 2(1 + ... + k)) =
= ((k + 1) + 2(1 + ... + k - 1) + 2k) =
= (k + 2(1 + ... + k - 1) + 1 + 2k)

Narrowing in on the second term in this expression, we can use a clever trick:


  2(1 + ... + k - 1) =

=   1   +   2   + ... + k - 1 +
+ k - 1 + k - 2 + ... +   1   =

= (k - 1 + 1) + (k - 2 + 2) + ... + (1 + k - 1)

Using this symmetry we can deduce that


2(1 + ... + k - 1) = k(k - 1)

since there are k - 1 expressions each evaluating to k in the summation table above. Injecting this back we have


k + k(k - 1) + 1 + 2k = k² + 1 + 2k

And this in turn back into the full right hand side expression:


  (1 + ... + k)² + (k + 1)(k² + 1 + 2k) =
= (1 + ... + k)² + (k + 1)(k + 1)² =
= (1 + ... + k)² + (k + 1)³

But


(1 + ... + k)² = 1³ + ... + k³

according to the hypothesis, so


(1 + ... + k)² + (k + 1)³ = 1³ + ... + k³ + (k + 1)³

which is the left hand side of the original expression, for k + 1. This means that the induction step holds: if the theorem is true for n = k, it is also true for n = k + 1. Together with the induction basis, the theorem is proven.

Q.E.D.
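For comparison with the proof, here is a small Java sketch of the brute-force approach: checking the identity numerically for a range of n. The class name and the upper bound of 1000 are arbitrary choices for illustration. Unlike the induction proof, this only tells us something about the values it actually tries.

```java
// Sketch only: numerically checks 1^3 + ... + n^3 == (1 + ... + n)^2 for small n.
class SumOfCubes {

    /** Returns true if the identity holds for the given n. */
    static boolean holdsFor(int n) {
        long sumOfCubes = 0;
        long sum = 0;
        for (long i = 1; i <= n; i++) {
            sumOfCubes += i * i * i;
            sum += i;
        }
        return sumOfCubes == sum * sum;
    }

    public static void main(String[] args) {
        // The bound is kept small so that nothing overflows a long.
        for (int n = 1; n <= 1000; n++) {
            if (!holdsFor(n)) {
                throw new AssertionError("identity fails for n = " + n);
            }
        }
        System.out.println("Identity holds for n = 1..1000");
    }
}
```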

Friday, April 17, 2009

Running Spring on Google App Engine

In case you've been living under a rock the last couple of weeks, Google recently announced the addition of Java support to its App Engine. I have written a small sample application that leverages Spring and each of the Google infrastructure services that are either exposed as proprietary APIs or serve as backends to standard APIs.

The application itself, Feeling Lucky Pictures, is very simple: you log in using your Google account, and you import images from URLs into your personal gallery. You can also send an email with your pictures attached.

Integration with the Google infrastructure is as follows:

Authentication is of course done with the Google Accounts API.
Image import uses a regular java.net.URL input stream, but on the GAE runtime that class is backed by the URL Fetch API.
Storage is implemented using JDO, which is backed by a BigTable data store.
Reading objects from the data store uses the JSR 107 javax.cache API, which is backed by Memcache.
Imported images are enhanced with the I'm feeling lucky filter, part of the Images API.
Email is sent with the standard JavaMail API, which is also backed by a custom mail service on GAE.

As I said, the application is built on Spring as I wanted to see what problems, if any, I would run into if I were to run a regular Spring application on GAE. Most things worked out fine, but there were a handful of issues that I discovered and resolved, which is the real value of this application.

I decided to make as heavy use of the annotation configuration option as possible, including automatic classpath scanning. However, Spring will then attempt to scan for JPA annotations if certain conditions are met, which in turn will cause the javax.naming.NamingException class to load.

This class is not on the GAE class whitelist, and working around the missing class by providing a JNDI API jar or a dummy class won't work either. My solution was to roll a JDO-only version of the spring-orm jar. This problem is also discussed in this thread, with alternative solutions.
When configuring the LocalPersistenceManagerFactoryBean, don't specify a configuration file property (i.e. "classpath:META-INF/jdoconfig.xml"). Instead, set the persistenceManagerFactoryName property to "transactions-optional", which is the name used in the default configuration file provided by the Eclipse GAE plugin.
I wanted to avoid using the various Google API static factories programmatically in my MVC controllers, instead injecting service interfaces like ImageService to improve testability. This worked fine for the most part, simply using the factory-method attribute in the bean definition, with the exception of the Cache interface.

Cache extends java.util.Map, and for some reason the @Autowired mechanism requires generic key and value types on constructor parameters of Map type. Unfortunately the Cache interface cannot be parameterized, so my workaround was to embed it in a very simple one-property CacheHolder class, which is produced by a factory bean and injected.
I'm caching picture ids per email address, like this: "foo@bar.com":[1,5,7]. In a regular in-memory HashMap cache you can add elements to a collection map value, but on GAE you need to overwrite the old entry with a new collection that contains the old elements plus the new one. See JdoRepository for details.
Spring provides a set of abstractions on top of JavaMail that I wanted to keep, in particular the JavaMailSender interface and the MimeMessageHelper class for sending emails with attachments. This turned out to be a bit cumbersome, since the GAE mail backend is wired into the implementation of the JavaMail API in an unusual way. There's no SMTP transport, for example.

What I had to do was to override Spring's implementation of JavaMailSender and align creation and usage of javax.mail.Session and javax.mail.Transport exactly with the howto instructions for the GAE mail service. Basically you can't let the Transport connect and then send the message on the established connection; you need to do it all in one pass using the static Transport.send() method.

Oh, and don't leave the body of an email empty, or you'll get a really poor error message.
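The cache update issue above (picture ids per email address) can be sketched as follows. This is not the code from JdoRepository; the class and method names are hypothetical, and a plain HashMap stands in for the GAE-backed cache so the sketch runs anywhere. The point is the pattern: copy the old collection, add the new element, and put the result back under the same key, rather than mutating the value you got from the cache.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch only: hypothetical names, HashMap used as a stand-in for the real cache.
class PictureIdCache {
    private final Map<String, List<Long>> cache = new HashMap<>();

    /** Adds a picture id by writing a fresh list back under the same key. */
    void addPictureId(String email, long pictureId) {
        List<Long> old = cache.get(email);
        List<Long> updated = (old == null) ? new ArrayList<>() : new ArrayList<>(old);
        updated.add(pictureId);
        cache.put(email, updated); // overwrite the entry instead of mutating the cached value
    }

    /** Returns a copy of the cached ids, or an empty list if nothing is cached. */
    List<Long> pictureIds(String email) {
        List<Long> ids = cache.get(email);
        return (ids == null) ? new ArrayList<>() : new ArrayList<>(ids);
    }

    public static void main(String[] args) {
        PictureIdCache pictures = new PictureIdCache();
        pictures.addPictureId("foo@bar.com", 1L);
        pictures.addPictureId("foo@bar.com", 5L);
        pictures.addPictureId("foo@bar.com", 7L);
        System.out.println(pictures.pictureIds("foo@bar.com"));
    }
}
```

With an in-memory map the copy is not strictly needed, but the same code keeps working when the value handed out by the cache is a detached copy of what is stored, which is presumably what happens with a Memcache-backed implementation.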

Other than that it's business as usual. The complete source code is available on Google Code, and it's MIT licensed so you can do whatever you want with it. If it helped you out, or if you've found a better or simpler solution to any of the problems above, please drop a comment on this article.

Friday, March 27, 2009

DDDSample 1.1.0 released

What the title says :-)

It's been 6 months since 1.0, and quite a lot has happened. If you're interested in domain-driven design, take a look at it and let us know what you think, either on the international Yahoo DDD group or the Swedish Google group.

If you're interested in learning more about DDD, Citerus has a variety of offers for you and your team.

Monday, March 23, 2009

Five tips for successfully deploying Maven

Maven is one of those things that people seem to hate rather intensely, but nevertheless adoption is steadily rising in the Java community. I've worked with Maven almost daily since the 1.0 betas, and here are five things that I think could help your team work more efficiently with Maven.

Use a repository manager

A repository manager is an on-demand mirroring repository cache that you set up inside your IT infrastructure and use as the primary repository for your builds. It works like this: if you build a project that depends on, for example, commons-lang-2.4.jar, the repository manager will download the artifact from the main Maven repository on the web, cache it locally and return it to the build client that asked for it. All subsequent builds that use the same managed repository will get the commons-lang jar delivered from the cache, not from the web.

This has many advantages. First of all, it's fast. All project members, except the first one, will download any given dependency at LAN speed, which is especially nice when you're setting up a build environment from scratch (new project member, staging a clean build, etc). And of course it frees up external bandwidth for other purposes and lowers costs.

Second, it's safer. It allows you to run centralized and incremental backups of all external dependencies that your projects use, and you reduce your dependency on the availability of public repositories.

Third, it's convenient. From time to time you will need a library that's not (yet) available in any public repository, so you have to publish it somewhere. A repository manager makes that really easy. And if you're sharing internal libraries or interfaces between projects, it's extremely handy to deploy to the managed repository. You can even set up your continuous integration build to automatically deploy snapshots.

I've had a pleasant experience working with Nexus, but there are others. A repository manager should be as natural a part of your infrastructure as SCM and CI if you're using Maven.
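The caching behaviour described above can be sketched as a tiny read-through cache in Java. This is only a toy model of the idea, not how Nexus or any other repository manager is implemented; the class name, the upstream function and the fetch counter are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Toy model only: hypothetical names, no networking, no real repository layout.
class CachingRepository {
    private final Map<String, byte[]> cache = new HashMap<>();
    private final Function<String, byte[]> upstream; // stand-in for the public repository
    private int upstreamFetches = 0;

    CachingRepository(Function<String, byte[]> upstream) {
        this.upstream = upstream;
    }

    /** Returns the artifact from the local cache, fetching it upstream only on a miss. */
    byte[] resolve(String artifact) {
        byte[] content = cache.get(artifact);
        if (content == null) {
            content = upstream.apply(artifact);
            upstreamFetches++;
            cache.put(artifact, content);
        }
        return content;
    }

    int upstreamFetches() {
        return upstreamFetches;
    }

    public static void main(String[] args) {
        CachingRepository repo = new CachingRepository(name -> name.getBytes());
        repo.resolve("commons-lang-2.4.jar"); // first build: fetched from upstream
        repo.resolve("commons-lang-2.4.jar"); // later builds: served from the cache
        System.out.println("Upstream fetches: " + repo.upstreamFetches());
    }
}
```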
Specify plugin versions

By default, Maven will automatically download a new version of any given plugin whenever there is one available. Given that Maven is 99% made up of plugins (there's even a plugin plugin!), this is a potential point of breakage over time and in my opinion a design mistake.

As of version 2.0.9, the default behaviour is improved by locking down the versions of the core plugins (where "core" is defined by this list). However, you still need to explicitly define versions for all non-core plugins, and that can be done in the top-level pom.xml of a hierarchical project using the pluginManagement section.
```
<pluginManagement>
  <plugins>
      <plugin>
          <artifactId>maven-assembly-plugin</artifactId>
          <version>2.2-beta-2</version>
      </plugin>
      <plugin>
          <artifactId>maven-antrun-plugin</artifactId>
          <version>1.2</version>
      </plugin>
  </plugins>
</pluginManagement>
```
Do this for the plugins that you actually use. Note that for plugins with group id org.apache.maven.plugins, you can omit the groupId element.

This will make your builds more stable and eliminate a fairly rare but very annoying and confusing set of problems.
Learn how to use the dependency plugin

Maven introduced the concept of transitive dependencies to the Java community, and they have been a source of confusion ever since. The dependency plugin is an invaluable tool for analyzing the results of the dependency algorithm, and for handling dependencies in various ways. Here are a couple of things you can do with it:
- ```
dependency:tree
```
  shows (you guessed it) the dependency tree for the project: which dependencies are being pulled in, and why. It's a nice overview and can help you tweak the dependency structure by excluding artifacts, overriding versions and so on. Example output:
```
[INFO] +- org.apache.activemq:activemq-core:jar:5.2.0:compile
[INFO] |  +- org.apache.camel:camel-core:jar:1.5.0:compile
[INFO] |  +- org.apache.geronimo.specs:geronimo-jms_1.1_spec:jar:1.1.1:compile
[INFO] |  +- org.apache.activemq:activeio-core:jar:3.1.0:compile
[INFO] |  |  \- backport-util-concurrent:backport-util-concurrent:jar:2.1:compile
[INFO] |  \- org.apache.geronimo.specs:geronimo-j2ee-management_1.0_spec:jar:1.0:compile
```
- ```
dependency:go-offline
```
  Downloads all project dependencies and plugins, transitively. It's a good command to run both if you want to work offline for a while and if you want to get as many of the external dependencies in place in a single shot with no manual intervention while you go grab a cup of coffee and/or read another item in Effective Java ;-)
- ```
dependency:copy
dependency:copy-dependencies
```
  If you ever need to handle artifacts as files, copying all or some of them to a custom location for whatever reason, this is a good approach.
There are many more things you can do with it, and mastering it will help you get on top of the transitive dependency situation.
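To make the idea of transitive dependencies concrete, here is a minimal Java sketch that walks a map of direct dependencies and renders an indented tree, loosely in the spirit of the dependency:tree output above. It is not how Maven resolves dependencies: there is no version mediation, no scopes and no exclusions, the graph is assumed to be acyclic, and the class and method names are hypothetical. The artifact names are borrowed from the example output.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Sketch only: hypothetical names, assumes the dependency graph has no cycles.
class DependencyTreeSketch {

    /** Renders the root and everything it pulls in, one line per artifact. */
    static List<String> render(String root, Map<String, List<String>> directDeps) {
        List<String> lines = new ArrayList<>();
        walk(root, directDeps, 0, lines);
        return lines;
    }

    private static void walk(String artifact, Map<String, List<String>> directDeps,
                             int depth, List<String> out) {
        StringBuilder indent = new StringBuilder();
        for (int i = 0; i < depth; i++) {
            indent.append("  ");
        }
        out.add(indent + artifact);
        // Each direct dependency brings its own dependencies along: that is the transitive part.
        for (String dep : directDeps.getOrDefault(artifact, new ArrayList<>())) {
            walk(dep, directDeps, depth + 1, out);
        }
    }

    public static void main(String[] args) {
        Map<String, List<String>> deps = new LinkedHashMap<>();
        deps.put("activemq-core", Arrays.asList("camel-core", "activeio-core"));
        deps.put("activeio-core", Arrays.asList("backport-util-concurrent"));
        for (String line : render("activemq-core", deps)) {
            System.out.println(line);
        }
    }
}
```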
Use the documentation

Well, duh. But a weak point of Maven in the eyes of many people is the lack of documentation and the sometimes poorly organized information. There are a few good points of reference, though, that you can spread around your team, for example by setting up links on the wiki:
- The Definitive Guide to Maven: a free book from Sonatype, available both as HTML and PDF. Good for the beginner, and sometimes as a reference. If you don't know where to start, start here.
- The plugin list: a comprehensive list of the official plugins, with links to each project page and JIRA subsection. Most of the core functionality is actually performed by one of these plugins, and you can learn a lot by studying things like the resources plugin documentation.
- The POM reference: for the slightly more advanced user. Every element in the POM is explained. Don't forget to specify the XSD information in your POM file to get the most help from your XML editor.
Understand the conventions

Maven is a conventions-based tool, relieving you from scripting common tasks like compiling source code, running tests or packaging a web application into a war file. Learning the conventions (directory structure, build phases) and working along with them will make your life easier a lot of the time.

There are definitely situations, even in moderately sized projects, where you need to customize the build, and Maven can sometimes be quite cumbersome to work with when you need to break the conventions. But by understanding the conventions, and keeping in mind that there is a good chance that what you're trying to do can be accomplished within them, you might find a different approach than you otherwise would have.

Perhaps that ugly jar-splitting, file-copying, token-replacing antrun hack that you spent an agonizing week writing could be replaced by extracting part of the project into a separate module and including it as a dependency instead? It's a lot easier to swim downstream than upstream.

Maven is not perfect by any means, but it has brought standardization and conventions to the world of Java development: project structure, directory structure, public metadata and artifact repository publishing, to name a few. There are lots of plugins available, both central and third-party ones, and most IDEs and continuous integration servers support Maven very well.

A lot of the standardization may even outlive the Maven tool itself, as demonstrated by two newer build systems for Java: Buildr and Gradle. They both use many of the same conventions, and could challenge Maven by perhaps being able to scale down in complexity more easily and by having a lower threshold for newcomers. Progress on Maven slowed down a bit after 2.0, but recently the 2.1.0 version was released with a number of important improvements, for example parallel resolution of dependencies.

Friday, January 16, 2009

New PNEHM article

I'm about to publish my next PNEHM article, a step-by-step port of a short algorithm written in Java to Groovy. For my Swedish readers, here's a sneak peek (the article is in Swedish).

UPDATE: the article is now published. Enjoy!
