Spring Batch is a Java framework that makes it easy to write batch applications. Batch applications involve reliably and efficiently processing large volumes of data to and from various data sources (files, databases, and so on). Spring Batch is great at doing this and provides the necessary foundation to meet the stringent requirements of batch applications. Sir Isaac Newton said, “If I have seen further it is only by standing on the shoulders of giants.” Spring Batch builds on the shoulders of one giant in particular: the Spring Framework. Spring is the framework of choice for a significant segment of the Enterprise Java development market. Spring Batch makes the Spring programming model, based on simplicity and efficiency, easier to apply for batch applications. Spring Batch leverages all the well-worn Spring techniques and components, like dependency injection, data access support, and transaction management.
Batch processing is a large topic and Spring Batch has a wide range of features. We don’t claim this book to be exhaustive. Instead, we provide the reader with the most useful information, based on our own experience with real-world projects, feedback from Spring Batch users, and…our own mistakes! The excellent reference documentation of the framework should be a useful complement to this book. We obviously focus on Spring Batch, but we also cover different, yet related, topics like schedulers. Batch jobs aren’t islands; they’re integrated in the middle of complex systems, and we cover this aspect too. That’s why chapter 11 discusses how Spring Batch can cohabit with technologies like REST and Spring Integration. Again, we want to stick as close as possible to the reality of batch systems, and this is (one part of) our vision.
We use the latest release of the latest branch of the framework available at the time of this writing, Spring Batch 2.1.
Because this is an In Action book, we provide code and configuration examples throughout, both to illustrate the concepts and to provide a template for successful operation.
Our primary target audience for this book is Java developers and architects who want to write batch applications. Experience with Spring is a plus, but not a requirement. We strive to give the necessary pointers and reminders in dedicated sidebars. Read this book even if you don’t know Spring—you can grab a copy of Manning’s Spring in Action, Third Edition, by Craig Walls to discover this wonderful technology. For those familiar with Spring, basic knowledge of dependency injection, data access support, and transaction management is enough. With this Spring background and this book, you’ll be Spring Batch-ing in a matter of minutes.
What if you don’t know Java and want to write batch applications? Well, think about learning Java to make your batch writing life easier. Spring Batch is great for batch applications!
The book is divided into three parts. The first part introduces the challenges presented by batch applications and how to use Spring Batch to address them. The second part forms the core of the presentation of the Spring Batch feature set. It exhaustively covers all of the scenarios you’ll meet writing real-life batch applications. The third and final part covers advanced topics, including monitoring, scaling, and testing. We also include appendixes covering the installation of a typical development environment for Spring Batch and the configuration of the Spring Batch Admin web-based administration console.
Chapter 1 discusses batch applications and gives an overview of Spring Batch features. It also introduces Spring Batch using a hands-on approach, based on a real-world use case. It’s a great place to start if you want to discover how to implement common batch scenarios with Spring Batch.
Chapter 2 covers the way Spring Batch structures the world of batch jobs. We name and define each batch application concept, using the domain language of batch applications. With a term for each concept forming the vocabulary of batch jobs, you’ll be able to communicate clearly and easily with your team about your batch applications.
Chapter 3 covers the configuration of Spring Batch jobs. It explains in detail all the XML elements and annotations available to configure every aspect of your jobs.
Chapter 4 discusses launching batch jobs under different scenarios: from the command line, using a scheduler like cron, or from an HTTP request. It also covers how to stop a job properly.
Chapter 5 covers reading data efficiently from different sources, using Spring Batch components.
Chapter 6 is the mirror image of chapter 5: it covers writing to various data targets. It lists all the available components to write to databases and files, send emails, and so on.
Chapter 7 discusses an optional step between reading and writing: processing. This is where you can embed business logic to transform or filter items.
Chapter 8 covers the Spring Batch built-in features that make jobs more robust: skipping incorrectly formatted lines from a flat file by using a couple of XML lines in your configuration, retrying operations transparently after a transient failure, and restarting a job exactly where it left off.
Chapter 9 discusses the tricky topic of transactions. It explains how Spring Batch handles transactions, the how, when, and why of tweaking transactions, and useful transaction management patterns for batch applications.
Chapter 10 covers the way Spring Batch handles the flow of steps inside a job: linear versus nonlinear flows, sharing data between steps of a job, and interacting with the execution context.
Chapter 11 explores how a Spring Batch job can end up being in the middle of a complex enterprise integration application. In this chapter, you’ll see how Spring Batch, Spring Integration, and Spring REST cohabit happily to meet real-world enterprise integration scenarios.
Chapter 12 discusses the monitoring of Spring Batch jobs. Because Spring Batch maintains execution metadata, this chapter covers how to access this metadata, through JMX or a web application, to query the state of your jobs.
Chapter 13 tackles the complex topic of scaling. It covers the different strategies Spring Batch provides to parallelize the execution of your jobs on multiple threads or even multiple physical nodes.
Chapter 14 is about testing Spring Batch jobs. It covers both unit testing isolated components and testing a whole job execution.
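To give a first feel for the read, process, and write sequence that chapters 5 through 7 describe, here is a minimal plain-Java sketch. It deliberately doesn't use the Spring Batch API: the class name `ReadProcessWriteSketch`, the `run` method, and the convention that a `null` result filters an item out are all assumptions made for this illustration, and the real components are introduced in the chapters themselves.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.function.Function;

// Hypothetical illustration only: none of these names come from Spring Batch.
final class ReadProcessWriteSketch {

    /**
     * Passes each input item through the processor and collects the results.
     * Assumption of this sketch: a null result means "filter this item out".
     */
    static <I, O> List<O> run(List<I> input, Function<I, O> processor) {
        List<O> written = new ArrayList<>();
        for (I item : input) {                    // "read" one item at a time
            O processed = processor.apply(item);  // "process": transform or filter
            if (processed != null) {
                written.add(processed);           // "write" the items that survive
            }
        }
        return written;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("apple", "", "pear");
        // Business logic: drop empty lines, upper-case the rest.
        List<String> result = run(lines,
                line -> line.isEmpty() ? null : line.toUpperCase(Locale.ROOT));
        System.out.println(result);
    }
}
```

The processing step is a plain function here, so business logic stays separate from reading and writing. That separation is the idea the chapters build on.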
We’ve licensed the source code for the example applications in this book under the Apache License, version 2.0. This source code is freely available at http://code.google.com/p/springbatch-in-action/ and from Manning’s website at www.manning.com/SpringBatchinAction.
Much of the source code shown in this book consists of fragments designed to illustrate the text. When a complete segment of code is presented, it appears as a numbered listing; code annotations accompany some of the listings where further explanations of the code are needed. When we present source code, we sometimes use a bold font to draw attention to specific elements.
In the text, we use a fixed-width typeface to denote code (Java and XML) as well as Java methods, XML element names, and other source code identifiers.
The purchase of Spring Batch in Action includes free access to a private web forum run by Manning Publications where you can make comments about the book, ask technical questions, and receive help from the authors and from other users. To access the forum and subscribe to it, point your web browser to www.manning.com/SpringBatchinAction. This page provides information on registering, getting on the forum, the kind of help available, and the rules of conduct on the forum.
Manning’s commitment to our readers is to provide a venue where a meaningful dialogue between individual readers and between readers and the authors can take place. It’s not a commitment to any specific amount of participation on the part of the authors, whose contribution to the Author Online forum remains voluntary (and unpaid). We suggest you try asking them some challenging questions lest their interest stray! The Author Online forum and the archives of previous discussions will be accessible from the publisher’s website as long as the book is in print.
Arnaud Cogoluègnes is a software developer, Java EE architect, and author with deep expertise in middleware, software engineering, and Spring technologies. Arnaud spent a number of years developing complex business applications and integrating Java-based products. A SpringSource certified trainer, Arnaud has trained hundreds of people around the world on Spring technologies and the Java platform.
Thierry Templier is a Java EE, Web2, and modeling architect and expert with more than 10 years of experience. He’s a Spring addict and enthusiast and enjoys implementing all kinds of applications and tools using it. He is also the coauthor of several French books on these subjects and of Spring Dynamic Modules in Action. He recently joined Noelios Technologies, the company behind the Restlet framework, and lives in Brittany (France).
Gary Gregory is the coauthor of JUnit in Action, Second Edition. He has more than 20 years of experience in building object-oriented systems, C/C++, Smalltalk, Java, and the whole soup of XML and database technologies. Gary has held positions at Ashton-Tate, ParcPlace-Digitalk, and several other software companies, including Seagull Software, where he currently develops application servers for legacy integration. He’s an active member of the Apache Software Foundation and the Apache Commons Project Management Committee, and contributes regularly to various Apache Commons projects. Born and raised in Paris, France, Gary received a BA in Linguistics and Computer Science from the University of California at Los Angeles. He lives in Florida with his wife, their son, and assorted golf clubs. You can find him at http://www.garygregory.com.
Olivier Bazoud is a software architect at Ekino, the IT branch of FullSIX Group. He’s also a Spring technologies expert. With over 12 years of experience, he develops complex business applications and high-traffic websites based on Java and web technologies.