1. What is spring-boot-starter-batch-web?
Java batch is becoming a hot topic in enterprise environments these days, but how do you do it the right way? The project spring-boot-starter-batch-web offers a best practice approach to modern batch architectures, answering the following questions:
-
How do I actually deploy jobs?
-
How do I start, stop and monitor them?
-
How do I integrate them into my companies' infrastructure?
-
How do I build job artifacts?
It builds on Spring Batch, Spring Boot and JSR-352.
You may want to switch to the Getting Started page. If you want to know about the possibilities for configuring the application, switch to Configuration options.
2. Blog roll
Most information about this project will be covered in blog posts. You find them here:
3. Getting Started
-
Create a Spring Boot maven project. Take a look at this example, it’s pretty much just using spring-boot-starter-parent as a parent und adding the Spring Boot build plugin. Then add the dependency to spring-boot-starter-batch-web like in the example. Check for the current version in the Release Notes.
<dependency>
<groupId>de.codecentric</groupId>
<artifactId>spring-boot-starter-batch-web</artifactId>
<version>1.3.4.RELEASE</version>
</dependency>
-
If you have a database with the Spring Batch meta data tables and your business data, add the connection properties to the application.properties like in this example. If you don’t specify these properties you’ll get an in-memory database for the Spring Batch meta data tables.
-
Add a simple logback.xml for logging. Here’s an example inheriting from our basic log configuration to support log file separation.
-
Add a batch job. You may define it in XML and put it into META-INF/spring/batch/jobs (overridable via property batch.config.path.xml) or in JavaConfig and put it into the package spring.batch.jobs (overridable via property batch.config.package.javaconfig). Each XML file or class annotated with @Configuration in the specified locations will get its own child ApplicationContext. Third option is defining a JSR-352 style job in XML and adding it to META-INF/batch-jobs.
-
Add an entry point to the application, a class with a main method invoking SpringApplication.run(…). Take a look at this example.
-
Build the application via maven package. Then start the application using java -jar xxx.jar.
Default port is 8080. Take a look at the JavaDoc of these controllers to get to know the endpoints for starting jobs etc.: JobOperationsController and JobMonitoringController.
4. Configuration options
There are two ways to influence spring-boot-starter-batch-web’s behaviour. The first way is to set certain properties, the second way is to add certain components to the ApplicationContext. Let’s take a look at the available properties first. Note that these are only the properties of spring-boot-starter-batch-web, there are more properties from Spring Boot described in the Spring Boot reference documentation.
Property name | Description | Default value |
---|---|---|
batch.config.path.xml |
Location in the classpath where Spring Batch job definitions in XML are picked up. |
/META-INF/spring/batch/jobs |
batch.config. package.javaconfig |
Package where Spring Batch job definitions in JavaConfig are picked up. |
spring.batch.jobs |
batch.defaultprotocol. enabled |
Whether the default job protocol printed into the log is activated. |
true |
batch.logfileseparation. enabled |
Whether writing one log file for each job execution is activated. |
true |
batch.metrics.enabled |
Whether the transaction safe batch metrics framework is activated so that BatchMetrics may be injected and used. |
false |
batch.metrics.profiling. readprocesswrite.enabled |
Readers, Processors and Writers are profiled with RichGauges when set to true. |
false |
batch.core.pool.size |
Core pool size of the thread pool. |
5 |
batch.queue.capacity |
Queue capacity of the task executor. |
Integer.MAX_VALUE |
batch.max.pool.size |
Max pool size of the thread pool. |
Integer.MAX_VALUE |
batch.repository. isolationlevelforcreate |
Database isolation level for creating job executions. |
Spring Batch’s default |
batch.repository. tableprefix |
Prefix for Spring Batch meta data tables. |
Not set |
batch.web.operations.base |
Base URL for the operations endpoint. |
/batch/operations |
batch.web.monitoring.base |
Base URL for the monitoring endpoint. |
/batch/monitoring |
Now let’s take a look at the interfaces / abstract classes you may implement to add those implemented components to the ApplicationContext:
Example:
private static class SimpleMetricsOutputFormatter implements MetricsOutputFormatter {
@Override
public String format(List<RichGauge> gauges, List<Metric<?>> metrics) {
StringBuilder builder = new StringBuilder("\n########## Metrics Start ##########\n");
if (gauges != null) {
for (RichGauge gauge : gauges) {
builder.append(gauge.toString() + "\n");
}
}
if (metrics != null) {
for (Metric<?> metric : metrics) {
builder.append(metric.toString() + "\n");
}
}
builder.append("########## Metrics End ############");
return builder.toString();
}
}
5. Initscript Template
This is a simple example for an initscript template to control your Spring Boot batch application:
#!/bin/sh
#
# Simple initscript for a Java application
#
# Verifies the status of the application
verify_status() {
[...]
}
# Starts the application
start() {
# First check if the application is already started
verify_status
# Start JVM
java -jar /opt/batch/example-batch/example-batch.jar >> /var/log/example-batch.log 2>&1 &
}
# Stopps the application
stop() {
[...]
}
# Shows the application status on the console
status() {
[...]
}
case "$1" in
start)
start
;;
stop)
stop
;;
status)
status
;;
restart)
stop
start
;;
*)
echo "Usage: $0 {start|stop|restart|status}"
exit 1
esac
exit 0
6. RPM
The following code snippet may help you to configure the Maven RPM Plugin, which creates an RPM during the build process for you. In this example the RPM will be created as an attached artifact. For more information see the documentation of the plugin.
<plugin>
<groupId>org.codehaus.mojo</groupId>
<artifactId>rpm-maven-plugin</artifactId>
<version>2.1-alpha-4</version>
<extensions>true</extensions>
<executions>
<execution>
<phase>package</phase>
<goals>
<goal>attached-rpm</goal>
</goals>
</execution>
</executions>
<configuration>
<name>${project.name}</name>
<version>${project.version}</version>
<copyright>2014,codecentric</copyright>
<distribution>sles11</distribution>
<group>Application/codecentric</group>
<packager>jenkins</packager>
<changelogFile>src/changelog</changelogFile>
<vendor>sles</vendor>
<defineStatements>
<defineStatement>_unpackaged_files_terminate_build 0</defineStatement>
</defineStatements>
<mappings>
<mapping>
<directory>/opt/batch/${project.name}/</directory>
<filemode>755</filemode>
<username>batch</username>
<groupname>batch</groupname>
<sources>
<source>
<destination>${project.name}.jar</destination>
<location>target/${project.artifactId}-${project.version}.jar</location>
</source>
</sources>
</mapping>
<mapping>
<directory>/etc/init.d/</directory>
<username>root</username>
<groupname>root</groupname>
<sources>
<source>
<destination>${project.name}</destination>
<location>target/classes/initscript-template</location>
</source>
</sources>
</mapping>
</mappings>
<preremoveScriptlet>
<script><![CDATA[ if [ "$1" = "0" ]; then /etc/init.d/example-batch stop; fi]]></script>
</preremoveScriptlet>
</configuration>
</plugin>
7. Metrics
In this Starter there are a lot of metrics that will be written during jobruns (see below).
7.1. Configuration
7.1.1. Enable Metrics in the Starter
batch.metrics.enabled=true
7.1.2. Add Codahale to your pom.xml
<dependency>
<groupId>io.dropwizard.metrics</groupId>
<artifactId>metrics-core</artifactId>
<version>3.1.0</version>
</dependency>
<dependency>
<groupId>net.alchim31</groupId>
<artifactId>metrics-influxdb</artifactId>
<version>0.5.0</version>
</dependency>
7.1.3. Enable e.g. InfluxDB Reporter
batch.metrics.export.influxdb.enabled=true
batch.metrics.export.influxdb.server=localhost
batch.metrics.export.influxdb.port=8086
batch.metrics.export.influxdb.db=db1
batch.metrics.export.influxdb.username=user
batch.metrics.export.influxdb.password=password
7.1.4. Setup InfluxDB with Grafana
-
Run it with Docker
docker run -d -p 8083:8083 -p 8086:8086 -e PRE_CREATE_DB="db1" tutum/influxdb:latest
docker run -d -name grafana -p 8080:80 -e INFLUXDB_HOST=192.168.59.103 -e INFLUXDB_PORT=8086 -e INFLUXDB_NAME=db1 -e INFLUXDB_USER=root -e INFLUXDB_PASS=root tutum/grafana
-
Retrieve some metrics per HTTP
curl -G 'http://localhost:8080/db/db1/series?u=root&p=root' --data-urlencode "q=select * from <hostname>.counter.batch.simpleBatchMetricsJob.count"
-
References
7.2. Gauges
7.2.1. Job duration for a specific job name
timer.batch.simpleJob.duration
7.2.2. Step duration for a specific job name
timer.batch.simpleJob.step.simpleStep.duration
7.2.3. Chunk duration/count for a specific job name
timer.batch.simpleJob.step.simpleStep.chunk.duration gauge.batch.simpleJob.step.simpleStep.chunk.count
7.2.4. Item duration/count for a specific job name
timer.batch.simpleJob.step.simpleStep.item.duration gauge.batch.simpleJob.step.simpleStep.item.count
7.2.5. Read/Process/Write methods
timer.batch.simpleJob.step.simpleStep.DummyItemReader.read.duration
7.2.6. Custom
timer.batch.simpleJob.step.simpleStep.ExampleService.callExternalRemoteService.duration gauge.batch.simpleJob.step.simpleStep.businesscounter
7.3. Notes
".count" is added automatically when using a Counter with GraphiteReporter ".value" is added automatically when using a Gauge with InfluxDBReporter All metrics will be prefixed with the Hostname or an Environment identifier (dev, test, qa, prod)