1. Overview
Most developers build microservices these days and deploy them to cloud platforms. Pivotal Cloud Foundry (PCF) is one of the well-known cloud platforms. Applications deployed on PCF are mostly long-running processes that never end, such as web applications, SPAs, and REST-based services. PCF monitors these long-running instances and, if one goes down, spins up a new instance to replace the failed one. This works well when the process is expected to run continuously, but for a batch process it is overkill: the container keeps running with almost no CPU usage and adds to the cost. Many developers are under the impression that PCF cannot run a batch application that is started only on request. That is not correct.
Spring Batch lets us create a batch application and provides many out-of-the-box features that reduce boilerplate code. More recently, Spring Cloud Task has been added to the list of Spring projects for creating short-running processes. With either option, we can create microservices, deploy them on PCF, and then stop them so that PCF doesn't try to self-heal them. With the help of PCF Scheduler, we can then schedule the task to run at a certain time of day. Let's see in this article how to do that in very few steps.
2. Prerequisites
- JDK 1.8
- Spring Boot knowledge
- Gradle
- IDE (Eclipse, VSC, etc.)
- PCF instance
3. Develop the Spring Batch Application
Let's develop a small Spring Batch application (spring-batch-master) that reads a file of employee data and adds a department name to each employee record.
3.1 BatchConfiguration
Let's start with the BatchConfiguration file. I have added two jobs here, and each shows a different way of implementing a batch process. The first one uses Spring Batch chunks and sets up the job flow with steps; each step has a reader, a processor, and a writer configured. The second one uses a Tasklet:
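import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
import org.springframework.batch.core.Job;
import org.springframework.batch.core.Step;
import org.springframework.batch.core.StepContribution;
import org.springframework.batch.core.configuration.annotation.EnableBatchProcessing;
import org.springframework.batch.core.configuration.annotation.JobBuilderFactory;
import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
import org.springframework.batch.core.launch.JobLauncher;
import org.springframework.batch.core.launch.support.SimpleJobLauncher;
import org.springframework.batch.core.repository.JobRepository;
import org.springframework.batch.core.scope.context.ChunkContext;
import org.springframework.batch.core.step.tasklet.Tasklet;
import org.springframework.batch.item.file.FlatFileItemReader;
import org.springframework.batch.repeat.RepeatStatus;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
@EnableBatchProcessing
public class BatchConfiguration {
    private static final Log logger = LogFactory.getLog(BatchConfiguration.class);
    @Autowired private JobBuilderFactory jobBuilderFactory;
    @Autowired private StepBuilderFactory stepBuilderFactory;
    @Bean
    public JobLauncher jobLauncher(JobRepository jobRepo) {
        SimpleJobLauncher simpleJobLauncher = new SimpleJobLauncher();
        simpleJobLauncher.setJobRepository(jobRepo);
        return simpleJobLauncher;
    }

    // Chunk-based job: one step that reads, processes, and writes employee records.
    @Bean
    public Job departmentProcessingJob(Step step1) {
        return jobBuilderFactory.get("departmentProcessingJob")
                .flow(step1)
                .end()
                .build();
    }

    // Assumes the reader, processor, and writer from sections 3.2 to 3.4 are registered as beans.
    @Bean
    public Step step1(FlatFileItemReader<Employee> reader,
            DepartmentProcessor processor, DepartmentWriter writer) {
        return stepBuilderFactory.get("step1")
                .<Employee, Employee>chunk(1)
                .reader(reader)
                .processor(processor)
                .writer(writer)
                .build();
    }

    // Tasklet-based job: a single step that only logs a message.
    @Bean
    public Job job2() {
        return this.jobBuilderFactory.get("job2")
                .start(this.stepBuilderFactory.get("job2step1")
                        .tasklet(new Tasklet() {
                            @Override
                            public RepeatStatus execute(StepContribution contribution,
                                    ChunkContext chunkContext) throws Exception {
                                logger.info("Job2 was run");
                                return RepeatStatus.FINISHED;
                            }
                        })
                        .build())
                .build();
    }
}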
Let's discuss the first job, departmentProcessingJob, in a little more detail. This job has a single step, step1, which reads the file, processes each record, and then prints it.
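To make the chunk-oriented flow of step1 more concrete, here is a minimal plain-Java sketch of what a chunk step does conceptually: read items one at a time until the source is exhausted, process each one, and hand the processed items to the writer in groups of the chunk size. This is not how Spring Batch implements it (the framework adds transactions, restart, and retry on top); ChunkSketch, runStep, and the sample lines are hypothetical names and data used only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;

// Hypothetical illustration only; not part of Spring Batch.
class ChunkSketch {

    // Reads from the source, processes each item, and writes in groups of chunkSize.
    // Returns the number of write calls (one per chunk).
    static <I, O> int runStep(Iterator<I> reader,
                              Function<I, O> processor,
                              Consumer<List<O>> writer,
                              int chunkSize) {
        int writes = 0;
        List<O> chunk = new ArrayList<>();
        while (reader.hasNext()) {
            chunk.add(processor.apply(reader.next()));
            if (chunk.size() == chunkSize) {
                writer.accept(new ArrayList<>(chunk));
                chunk.clear();
                writes++;
            }
        }
        if (!chunk.isEmpty()) {
            // Flush the last, partially filled chunk.
            writer.accept(new ArrayList<>(chunk));
            writes++;
        }
        return writes;
    }

    public static void main(String[] args) {
        // Sample lines are made up; the real input is employee_data.txt.
        List<String> lines = Arrays.asList("1,1001,5000", "2,1002,6000", "3,1003,4000");
        int writes = ChunkSketch.<String, String>runStep(
                lines.iterator(),
                line -> line + ",processed",
                chunk -> System.out.println("writing " + chunk),
                2);
        System.out.println("write calls: " + writes);
    }
}
```

With a chunk size of 1, as configured in step1, the writer is called once per record; a larger chunk size groups several records into each write call.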
3.2 DepartmentReader
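This code reads the employee data from a file as part of the first step:
import org.springframework.batch.item.file.FlatFileItemReader;
import org.springframework.batch.item.file.mapping.BeanWrapperFieldSetMapper;
import org.springframework.batch.item.file.mapping.DefaultLineMapper;
import org.springframework.batch.item.file.transform.DelimitedLineTokenizer;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.core.io.ClassPathResource;

@Configuration
public class DepartmentReader {
    @Bean
    public FlatFileItemReader<Employee> reader() {
        FlatFileItemReader<Employee> reader = new FlatFileItemReader<Employee>();
        reader.setResource(new ClassPathResource("employee_data.txt"));
        // Split each line into named tokens, then bind the tokens to an Employee.
        reader.setLineMapper(new DefaultLineMapper<Employee>() {{
            setLineTokenizer(new DelimitedLineTokenizer() {{
                setNames(new String[]{"id", "employeenumber", "salary"});
            }});
            setFieldSetMapper(new BeanWrapperFieldSetMapper<Employee>() {{
                setTargetType(Employee.class);
            }});
        }});
        return reader;
    }
}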
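As a rough illustration of what the tokenizer and field mapping above accomplish, the following plain-Java sketch splits a comma-delimited line and pairs each token with the configured field name. It is only a conceptual stand-in, not the Spring Batch classes themselves; LineMappingSketch and the sample line are hypothetical, since the contents of employee_data.txt are not shown in this article.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; not part of Spring Batch.
class LineMappingSketch {

    // Same field names as configured on the tokenizer in DepartmentReader.
    static final String[] NAMES = {"id", "employeenumber", "salary"};

    // Splits a comma-delimited line and pairs each token with its field name.
    static Map<String, String> tokenize(String line) {
        String[] tokens = line.split(",", -1);
        if (tokens.length != NAMES.length) {
            throw new IllegalArgumentException(
                    "Expected " + NAMES.length + " fields but found " + tokens.length);
        }
        Map<String, String> fields = new LinkedHashMap<>();
        for (int i = 0; i < NAMES.length; i++) {
            fields.put(NAMES[i], tokens[i]);
        }
        return fields;
    }

    public static void main(String[] args) {
        // Made-up sample line in the assumed id,employeenumber,salary layout.
        System.out.println(tokenize("1,1001,5000"));
        // prints {id=1, employeenumber=1001, salary=5000}
    }
}
```

In the real reader, the named tokens are then bound to the matching properties of the Employee class rather than stored in a map.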
3.3 DepartmentProcessor
This code adds a department name to each employee record, based on the employee number:
import org.springframework.batch.item.ItemProcessor;
import org.springframework.stereotype.Component;

@Component
public class DepartmentProcessor implements ItemProcessor<Employee, Employee> {
    @Override
    public Employee process(Employee item) throws Exception {
        // Pick the department from the employee number; everyone else is "Staff".
        if ("1001".equalsIgnoreCase(item.getEmployeeNumber())) {
            item.setDepartment("Sales");
        } else if ("1002".equalsIgnoreCase(item.getEmployeeNumber())) {
            item.setDepartment("IT");
        } else {
            item.setDepartment("Staff");
        }
        System.out.println("Employee Details --> " + item.toString());
        return item;
    }
}
3.4 DepartmentWriter
This code prints the employee records with the department name appended:
import java.util.ArrayList;
import java.util.List;
import org.springframework.batch.item.ItemWriter;
import org.springframework.stereotype.Component;

@Component
public class DepartmentWriter implements ItemWriter<Employee> {
    @Override
    public void write(List<? extends Employee> items) throws Exception {
        List<String> employeeList = new ArrayList<>();
        items.forEach(item -> {
            // String.join requires that all Employee getters used here return Strings.
            String enrichedTxn = String.join(",", item.getId(),
                    item.getEmployeeNumber(), item.getSalary(),
                    item.getDepartment());
            employeeList.add(enrichedTxn);
        });
        employeeList.forEach(System.out::println);
    }
}
3.5 BatchCommandLineRunner
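Now, let's bootstrap the application logic to run as a batch process. Spring Boot provides CommandLineRunner for this:
import org.springframework.batch.core.Job;
import org.springframework.batch.core.JobParameters;
import org.springframework.batch.core.JobParametersBuilder;
import org.springframework.batch.core.launch.JobLauncher;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.CommandLineRunner;
import org.springframework.stereotype.Component;

@Component
public class BatchCommandLineRunner implements CommandLineRunner {
    @Autowired JobLauncher jobLauncher;
    @Autowired Job departmentProcessingJob;
    @Autowired Job job2;
    @Override
    public void run(String... args) throws Exception {
        JobParameters param = new JobParametersBuilder()
                .addString("JobID", String.valueOf(System.currentTimeMillis()))
                .toJobParameters();
        jobLauncher.run(departmentProcessingJob, param);
        jobLauncher.run(job2, param);
    }
}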
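The JobID parameter built from System.currentTimeMillis() is there to make every launch unique. Spring Batch identifies a job instance by the job name together with its identifying parameters, and when job metadata is persisted it refuses to run an instance that has already completed. The plain-Java sketch below mimics that rule with an in-memory set; JobInstanceSketch and launch are hypothetical names, not Spring Batch API.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration only; not part of Spring Batch.
class JobInstanceSketch {

    // Keys of job instances that have already completed.
    private final Set<String> completed = new HashSet<>();

    // Returns true if the instance is new and gets launched,
    // false if the same job name and JobID were already completed.
    boolean launch(String jobName, String jobId) {
        String instanceKey = jobName + "|JobID=" + jobId;
        return completed.add(instanceKey);
    }

    public static void main(String[] args) {
        JobInstanceSketch repo = new JobInstanceSketch();
        System.out.println(repo.launch("departmentProcessingJob", "1")); // true: first launch
        System.out.println(repo.launch("departmentProcessingJob", "1")); // false: same instance again
        System.out.println(repo.launch("departmentProcessingJob", "2")); // true: new JobID
        System.out.println(repo.launch("job2", "1"));                    // true: different job name
    }
}
```

This is also why the same parameters can be reused for both jobs in the runner above: the job name is part of the instance identity.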
4. Deploy Application on PCF
So far we have created a Spring Batch job. Now let's deploy it to PCF.
We just need to package the code and cf push it to PCF with a manifest file.
manifest.yaml:
---
applications:
- name: batch-example
  memory: 1G
  path: build/libs/spring-batch-master-0.0.1-SNAPSHOT.jar
  no-route: true
  health-check-type: none
We will observe that the application starts, executes the logic, and then exits. The application is then shown as crashed in PCF Apps Manager.
In PCF, all applications run with process type web.
PCF expects such an application to keep running all the time on some web port. For a batch application this is not the case, so let's see how to handle that:
- Stop the application manually.
- Run the job either manually with the cf run-task command or on a schedule using PCF Scheduler.
5. Start Batch Application with CF CLI
To run the Spring Batch (Spring Boot) application on PCF, we need to run the following command:
cf run-task <APP Name> ".java-buildpack/open_jdk_jre/bin/java org.springframework.boot.loader.JarLauncher"
With cf CLI v7 or later, the command is passed with the --command flag instead.
Now you can use this command in a Bamboo or Jenkins pipeline to trigger the application on a cron schedule.
6. Schedule Batch Job with PCF Scheduler
To schedule the batch job, we can use PCF Scheduler, which lets you create tasks and schedule them using cron expressions. In Apps Manager, go to the application, open Tasks, and click Enable Scheduling to bind the application to PCF Scheduler. Now you can create a job as shown in the picture below. For more details on how to use PCF Scheduler, you can read this blog.
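As a small aid for reading such cron expressions, the sketch below labels the fields of an expression, assuming the five-field layout MIN HOUR DAY-OF-MONTH MONTH DAY-OF-WEEK. That layout is an assumption here and should be checked against the Scheduler documentation for your PCF version; CronFieldsSketch and the sample expression are hypothetical and not part of any PCF tooling.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; assumes a five-field cron layout.
class CronFieldsSketch {

    static final String[] FIELD_NAMES =
            {"minute", "hour", "day-of-month", "month", "day-of-week"};

    // Splits the expression on whitespace and labels each field by position.
    static Map<String, String> describe(String expression) {
        String[] parts = expression.trim().split("\\s+");
        if (parts.length != FIELD_NAMES.length) {
            throw new IllegalArgumentException(
                    "Expected " + FIELD_NAMES.length + " fields but found " + parts.length);
        }
        Map<String, String> fields = new LinkedHashMap<>();
        for (int i = 0; i < FIELD_NAMES.length; i++) {
            fields.put(FIELD_NAMES[i], parts[i]);
        }
        return fields;
    }

    public static void main(String[] args) {
        // Made-up example: minute 0 of hour 2, on every day.
        System.out.println(describe("0 2 * * *"));
    }
}
```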
7. Batch Application with Spring Cloud Task
Spring has come up with a new project called Spring Cloud Task (SCT). Its purpose is to create short-lived microservices on cloud platforms. We just need to add the @EnableTask annotation. This registers a TaskRepository, which records each TaskExecution (start time, end time, and exit code), while the jobs defined in the application are picked up and executed one by one.
By default, TaskRepository uses an in-memory DB; however, it can also use most persistent databases, such as Oracle, MySQL, and PostgreSQL.
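To give a feel for the kind of information recorded for each run, here is a minimal plain-Java sketch of a task execution record with a start time, an end time, an exit code, and an error message. It is a conceptual stand-in only and not the Spring Cloud Task API; TaskExecutionSketch, its fields, and the 0/1 exit code convention are assumptions made for this illustration.

```java
import java.time.Instant;

// Hypothetical stand-in for a task execution record; not the Spring Cloud Task API.
class TaskExecutionSketch {
    final String taskName;
    final Instant startTime;
    Instant endTime;
    Integer exitCode;      // null while the task is still running
    String errorMessage;   // set only when the task fails

    TaskExecutionSketch(String taskName) {
        this.taskName = taskName;
        this.startTime = Instant.now();
    }

    // Runs the body once and records how it ended.
    static TaskExecutionSketch run(String taskName, Runnable body) {
        TaskExecutionSketch execution = new TaskExecutionSketch(taskName);
        try {
            body.run();
            execution.exitCode = 0;
        } catch (RuntimeException e) {
            execution.exitCode = 1;
            execution.errorMessage = e.getMessage();
        } finally {
            execution.endTime = Instant.now();
        }
        return execution;
    }

    public static void main(String[] args) {
        TaskExecutionSketch ok = run("sct-batch-job",
                () -> System.out.println("Job1 ran successfully"));
        System.out.println(ok.taskName + " exit code: " + ok.exitCode);

        TaskExecutionSketch failed = run("sct-batch-job", () -> {
            throw new IllegalStateException("input file missing");
        });
        System.out.println(failed.taskName + " exit code: " + failed.exitCode
                + ", error: " + failed.errorMessage);
    }
}
```

Because the record is closed when the process finishes, a short-lived application like this can exit cleanly instead of being kept alive by the platform.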
So let's develop one more small application (sct-batch-job) with SCT.
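Put @EnableTask on the Spring Boot main class:
@SpringBootApplication
@EnableTask              // org.springframework.cloud.task.configuration.EnableTask
@EnableBatchProcessing   // lets the Spring Batch jobs run as part of the task
public class EmployeeProcessingBatch {
    public static void main(String[] args) {
        SpringApplication.run(EmployeeProcessingBatch.class, args);
    }
}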
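Add two jobs in the SCTJobConfiguration file:
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
import org.springframework.batch.core.Job;
import org.springframework.batch.core.StepContribution;
import org.springframework.batch.core.configuration.annotation.JobBuilderFactory;
import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
import org.springframework.batch.core.scope.context.ChunkContext;
import org.springframework.batch.core.step.tasklet.Tasklet;
import org.springframework.batch.repeat.RepeatStatus;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class SCTJobConfiguration {
    private static final Log logger = LogFactory.getLog(SCTJobConfiguration.class);
    @Autowired public JobBuilderFactory jobBuilderFactory;
    @Autowired public StepBuilderFactory stepBuilderFactory;

    // Each job has a single tasklet step that only logs a message.
    @Bean
    public Job job1() {
        return this.jobBuilderFactory.get("job1")
                .start(this.stepBuilderFactory.get("job1step1")
                        .tasklet(new Tasklet() {
                            @Override
                            public RepeatStatus execute(StepContribution contribution,
                                    ChunkContext chunkContext) throws Exception {
                                logger.info("Job1 ran successfully");
                                return RepeatStatus.FINISHED;
                            }
                        })
                        .build())
                .build();
    }

    @Bean
    public Job job2() {
        return this.jobBuilderFactory.get("job2")
                .start(this.stepBuilderFactory.get("job2step1")
                        .tasklet(new Tasklet() {
                            @Override
                            public RepeatStatus execute(StepContribution contribution,
                                    ChunkContext chunkContext) throws Exception {
                                logger.info("Job2 ran successfully");
                                return RepeatStatus.FINISHED;
                            }
                        })
                        .build())
                .build();
    }
}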
That's it. Notice that I have used @EnableBatchProcessing to integrate SCT with Spring Batch. That way we can run a Spring Batch application as a task. We can now push this application to PCF as we did the earlier one, and then either run it manually or schedule it using PCF Scheduler.
8. NodeJS Batch Job Scheduling with PCF Scheduler
Similarly, if we have a batch job written in Node.js, we can first stop the application and then create a task with the command that starts the Node.js application.
9. Conclusion
In this article, we have seen how a Spring Batch application can be run on PCF. We can also use Spring Cloud Task to run short-lived microservices. Spring Cloud Task also integrates with Spring Batch, so you can get the full benefits of both.