1. What is spring-boot-starter-batch-web?

Java batch is becoming a hot topic in enterprise environments these days, but how do you do it the right way? The project spring-boot-starter-batch-web offers a best practice approach to modern batch architectures, answering the following questions:

  • How do I actually deploy jobs?

  • How do I start, stop and monitor them?

  • How do I integrate them into my companies' infrastructure?

  • How do I build job artifacts?

It builds on Spring Batch, Spring Boot and JSR-352.

You may want to switch to the Getting Started page. If you want to know about the possibilities for configuring the application, switch to Configuration options.

2. Blog roll

3. Getting Started

  1. Create a Spring Boot maven project. Take a look at this example, it’s pretty much just using spring-boot-starter-parent as a parent und adding the Spring Boot build plugin. Then add the dependency to spring-boot-starter-batch-web like in the example. Check for the current version in the Release Notes.

<dependency>
        <groupId>de.codecentric</groupId>
        <artifactId>spring-boot-starter-batch-web</artifactId>
        <version>1.3.4.RELEASE</version>
</dependency>
  1. If you have a database with the Spring Batch meta data tables and your business data, add the connection properties to the application.properties like in this example. If you don’t specify these properties you’ll get an in-memory database for the Spring Batch meta data tables.

  2. Add a simple logback.xml for logging. Here’s an example inheriting from our basic log configuration to support log file separation.

  3. Add a batch job. You may define it in XML and put it into META-INF/spring/batch/jobs (overridable via property batch.config.path.xml) or in JavaConfig and put it into the package spring.batch.jobs (overridable via property batch.config.package.javaconfig). Each XML file or class annotated with @Configuration in the specified locations will get its own child ApplicationContext. Third option is defining a JSR-352 style job in XML and adding it to META-INF/batch-jobs.

  4. Add an entry point to the application, a class with a main method invoking SpringApplication.run(…​). Take a look at this example.

  5. Build the application via maven package. Then start the application using java -jar xxx.jar.

Default port is 8080. Take a look at the JavaDoc of these controllers to get to know the endpoints for starting jobs etc.: JobOperationsController and JobMonitoringController.

4. Configuration options

There are two ways to influence spring-boot-starter-batch-web’s behaviour. The first way is to set certain properties, the second way is to add certain components to the ApplicationContext. Let’s take a look at the available properties first. Note that these are only the properties of spring-boot-starter-batch-web, there are more properties from Spring Boot described in the Spring Boot reference documentation.

Property name Description Default value

batch.config.path.xml

Location in the classpath where Spring Batch job definitions in XML are picked up.

/META-INF/spring/batch/jobs

batch.config. package.javaconfig

Package where Spring Batch job definitions in JavaConfig are picked up.

spring.batch.jobs

batch.defaultprotocol. enabled

Whether the default job protocol printed into the log is activated.

true

batch.logfileseparation. enabled

Whether writing one log file for each job execution is activated.

true

batch.metrics.enabled

Whether the transaction safe batch metrics framework is activated so that BatchMetrics may be injected and used.

false

batch.metrics.profiling. readprocesswrite.enabled

Readers, Processors and Writers are profiled with RichGauges when set to true.

false

batch.core.pool.size

Core pool size of the thread pool.

5

batch.queue.capacity

Queue capacity of the task executor.

Integer.MAX_VALUE

batch.max.pool.size

Max pool size of the thread pool.

Integer.MAX_VALUE

batch.repository. isolationlevelforcreate

Database isolation level for creating job executions.

Spring Batch’s default

batch.repository. tableprefix

Prefix for Spring Batch meta data tables.

Not set

batch.web.operations.base

Base URL for the operations endpoint.

/batch/operations

batch.web.monitoring.base

Base URL for the monitoring endpoint.

/batch/monitoring

Now let’s take a look at the interfaces / abstract classes you may implement to add those implemented components to the ApplicationContext:

Example:

private static class SimpleMetricsOutputFormatter implements MetricsOutputFormatter {

    @Override
    public String format(List<RichGauge> gauges, List<Metric<?>> metrics) {
        StringBuilder builder = new StringBuilder("\n########## Metrics Start ##########\n");
        if (gauges != null) {
            for (RichGauge gauge : gauges) {
                builder.append(gauge.toString() + "\n");
            }
        }
        if (metrics != null) {
            for (Metric<?> metric : metrics) {
                builder.append(metric.toString() + "\n");
            }
        }
        builder.append("########## Metrics End ############");
        return builder.toString();
    }

}

5. Initscript Template

This is a simple example for an initscript template to control your Spring Boot batch application:

#!/bin/sh
#
# Simple initscript for a Java application
#

# Verifies the status of the application
verify_status() {
  [...]
}

# Starts the application
start() {
  # First check if the application is already started
  verify_status
  # Start JVM
  java -jar /opt/batch/example-batch/example-batch.jar >> /var/log/example-batch.log 2>&1 &
}

# Stopps the application
stop() {
  [...]
}

# Shows the application status on the console
status() {
  [...]
}

case "$1" in
  start)
    start
    ;;
  stop)
    stop
    ;;
  status)
    status
    ;;
  restart)
    stop
    start
    ;;
  *)
    echo "Usage: $0 {start|stop|restart|status}"
    exit 1
esac

exit 0

6. RPM

The following code snippet may help you to configure the Maven RPM Plugin, which creates an RPM during the build process for you. In this example the RPM will be created as an attached artifact. For more information see the documentation of the plugin.

<plugin>
        <groupId>org.codehaus.mojo</groupId>
        <artifactId>rpm-maven-plugin</artifactId>
        <version>2.1-alpha-4</version>
        <extensions>true</extensions>
        <executions>
                <execution>
                        <phase>package</phase>
                        <goals>
                                <goal>attached-rpm</goal>
                        </goals>
                </execution>
        </executions>
        <configuration>
                <name>${project.name}</name>
                <version>${project.version}</version>
                <copyright>2014,codecentric</copyright>
                <distribution>sles11</distribution>
                <group>Application/codecentric</group>
                <packager>jenkins</packager>
                <changelogFile>src/changelog</changelogFile>
                <vendor>sles</vendor>
                <defineStatements>
                        <defineStatement>_unpackaged_files_terminate_build 0</defineStatement>
                </defineStatements>
                <mappings>
                        <mapping>
                                <directory>/opt/batch/${project.name}/</directory>
                                <filemode>755</filemode>
                                <username>batch</username>
                                <groupname>batch</groupname>
                                <sources>
                                        <source>
                                                <destination>${project.name}.jar</destination>
                                                <location>target/${project.artifactId}-${project.version}.jar</location>
                                        </source>
                                </sources>
                        </mapping>
                        <mapping>
                                <directory>/etc/init.d/</directory>
                                <username>root</username>
                                <groupname>root</groupname>
                                <sources>
                                        <source>
                                                <destination>${project.name}</destination>
                                                <location>target/classes/initscript-template</location>
                                        </source>
                                </sources>
                        </mapping>
                </mappings>
                <preremoveScriptlet>
                        <script><![CDATA[ if [ "$1" = "0" ]; then /etc/init.d/example-batch stop; fi]]></script>
                </preremoveScriptlet>
        </configuration>
</plugin>

7. Metrics

In this Starter there are a lot of metrics that will be written during jobruns (see below).

7.1. Configuration

7.1.1. Enable Metrics in the Starter

batch.metrics.enabled=true

7.1.2. Add Codahale to your pom.xml

<dependency>
    <groupId>io.dropwizard.metrics</groupId>
    <artifactId>metrics-core</artifactId>
    <version>3.1.0</version>
</dependency>
<dependency>
    <groupId>net.alchim31</groupId>
    <artifactId>metrics-influxdb</artifactId>
    <version>0.5.0</version>
</dependency>

7.1.3. Enable e.g. InfluxDB Reporter

batch.metrics.export.influxdb.enabled=true
batch.metrics.export.influxdb.server=localhost
batch.metrics.export.influxdb.port=8086
batch.metrics.export.influxdb.db=db1
batch.metrics.export.influxdb.username=user
batch.metrics.export.influxdb.password=password

7.1.4. Setup InfluxDB with Grafana

  • Run it with Docker

docker run -d -p 8083:8083 -p 8086:8086 -e PRE_CREATE_DB="db1" tutum/influxdb:latest
docker run -d -name grafana -p 8080:80 -e INFLUXDB_HOST=192.168.59.103 -e INFLUXDB_PORT=8086 -e INFLUXDB_NAME=db1 -e INFLUXDB_USER=root -e INFLUXDB_PASS=root tutum/grafana
  • Retrieve some metrics per HTTP

curl -G 'http://localhost:8080/db/db1/series?u=root&p=root' --data-urlencode "q=select * from <hostname>.counter.batch.simpleBatchMetricsJob.count"
  • References

7.2. Gauges

7.2.1. Job duration for a specific job name

timer.batch.simpleJob.duration

7.2.2. Step duration for a specific job name

timer.batch.simpleJob.step.simpleStep.duration

7.2.3. Chunk duration/count for a specific job name

timer.batch.simpleJob.step.simpleStep.chunk.duration
gauge.batch.simpleJob.step.simpleStep.chunk.count

7.2.4. Item duration/count for a specific job name

timer.batch.simpleJob.step.simpleStep.item.duration
gauge.batch.simpleJob.step.simpleStep.item.count

7.2.5. Read/Process/Write methods

timer.batch.simpleJob.step.simpleStep.DummyItemReader.read.duration

7.2.6. Custom

timer.batch.simpleJob.step.simpleStep.ExampleService.callExternalRemoteService.duration
gauge.batch.simpleJob.step.simpleStep.businesscounter

7.3. Notes

".count" is added automatically when using a Counter with GraphiteReporter ".value" is added automatically when using a Gauge with InfluxDBReporter All metrics will be prefixed with the Hostname or an Environment identifier (dev, test, qa, prod)