Monday, January 27, 2020

What is Lambda Function and Destinations


AWS Lambda function is a very well-known service that builds serverless applications. Since its inception in 2014, it has been adopted very well by developers and architects.

AWS serverless application comprises several components. However, Function is the backbone of it. Other AWS services interact with each other through Function, completing the serverless workflow.


1. Event Source: - Lambda Function doesn’t have a REST API endpoint. Hence, it cannot be
called directly from outside clients. It needs event triggering to get invoked. An event can be:

- Data record change of a DynamoDB table
- Request to a REST endpoint through API Gateway
- Change in the state of resource e.g. file uploaded to the S3 Bucket

It can also be triggered through CloudWatch Events. As Lambda has been evolving, it’s constantly adding many other types of events.



2. Function: - All the functional logic goes here. It does support many languages, including but not limited to Java, Go, Node, Python. The event source triggers it and then connects to all
other AWS Services to create a serverless pattern.



3. Services: - It can be either an AWS service or an external service. These services would be
required either to persist or retrieve the data from other systems. Some of the services like
SQS, SNS complements serverless flow with event-driven architecture.

In Nov 2019 reinvent, AWS announced a new feature in Lambda Function known as Lambda
Destinations. With Destinations, you can route asynchronous function results as an execution
record to a destination resource without writing additional code. An execution record contains
details about the request and response in JSON format including version, timestamp, request
context, request payload, response context, and response payload. For each execution status
such as Success or Failure, you can choose one of four destinations: another Lambda function, SNS, SQS, or EventBridge. Lambda can also be configured to route different execution results to different destinations.

Asynchronous Function Execution Result




What is Step Functions 

AWS Step Functions is an orchestrator that helps to design and implement complex workflows. When we need to build a workflow or have multiple tasks that need orchestration, Step Functions coordinates between those tasks. This makes it simple to build multi-step systems.

Step Functions is built on two main concepts Tasks and State Machine.

All work in the state machine is done by tasks. A task performs work by using an activity or an AWS Lambda function or passing parameters to the API actions of other services.

A state machine is defined using the JSON-based Amazon States Language. When an AWS Step Functions state machine is created, it stitches the components together and shows the developers their system and how it is being configured. Have a look at a simple example:





Lambda Destinations vs Step Functions


Topic
Lambda Destination
Step Functions
Sync vs Async Invocation
It supports only Async and Stream invocation
It supports both sync and async flows. In either flow, need to call StartExecution API.
Services Supported
For async, It supports only 4 destinations services Lambda, SQS, SNS, and EventBridge (as of Jan 17, 2020)

For Stream invocation, It supports only SQS and
SNS
It supports many more services other than the 4 like AWS Fargate, SageMaker, ECS, Non-AWS workload and many others.
Branching -Parallel Processing
It doesn't support parallel branching of execution
It can support dynamic parallelism, so you can optimize the performance and efficiency of application workflows such as data processing and task automation. By running identical tasks in parallel, you can achieve consistent execution durations and improve the utilization of resources to save on operating costs. Step Functions automatically scales resources in response to your input. 
Branching - Choice
For Async invocation, It supports only success and failure conditions.

For Stream invocation, it supports only the failure condition.
It provides the feature to put any type of choice :
"Choices": [
{
"Variable": "$.foo",
"
NumericEquals": 1,
"Next": "
FirstMatchState"
},
{
"Variable": "$.foo",
"
NumericEquals": 2,
"Next": "
SecondMatchState"
}
]
Retry
It does support Lambda function to retry for Asynchronous invocation before calling to Lambda Destinations. But it doesn't give feature to configure of BackoffRate
It has a Retry policy to handle Lambda failures:
"Retry": [
{
"
ErrorEquals": ["CustomError"],
"
IntervalSeconds": 1,
"
MaxAttempts": 2,
"
BackoffRate": 2.0
},


Conclusion

To summarize, both the services can co-exist as both have features that can be used in different scenarios. I would recommend starting with Lambda destinations service and see if that suffice your purpose as it will be much cheaper and requires less maintenance for small use cases. If it doesn't fulfill your need, you may go for Step functions.

Wednesday, January 1, 2020

Overview


Spring Cloud Hystrix Project was built as a wrapper on top of the Netflix Hystrix library. Since then, It has been adopted by many enterprises and developers to implement the Circuit Breaker pattern.
In November 2018 when Netflix announced that they are putting this project into maintenance mode, it prompted Spring Cloud to announce the same. Since then, no further enhancements are happening in this Netflix library. In SpringOne 2019, Spring announced that Hystrix Dashboard will be removed from Spring Cloud 3.1 version which makes it officially dead.
As the Circuit Breaker pattern has been advertised so heavily, many developers have either used it or want to use it, and now need a replacement. Resilience4j has been introduced to fulfill this gap and provide a migration path for Hystrix users.

Resilience4j

Resilience4j has been inspired by Netflix Hystrix but is designed for Java 8 and functional programming. It is lightweight compared to Hystrix as it has the Vavr library as its only dependency. Netflix Hystrix, by contrast, has a dependency on Archaius which has several other external library dependencies such as Guava and Apache Commons.
A new library always has one advantage over a previous library - it can learn from the mistakes of its predecessor. Resilience4j also comes with many new features:

CircuitBreaker

When a service invokes another service, there is always a possibility that it may be down or having high latency. This may lead to exhaustion of the threads as they might be waiting for other requests to complete. The CircuitBreaker pattern functions in a similar fashion to an electrical Circuit Breaker:
  • When a number of consecutive failures cross the defined threshold, the Circuit Breaker trips.
  • For the duration of the timeout period, all requests invoking the remote service will fail immediately.
  • After the timeout expires the Circuit Breaker allows a limited number of test requests to pass through.
  • If those requests succeed the Circuit Breaker resumes normal operation.
  • Otherwise, if there is a failure the timeout period begins again.

RateLimiter

Rate Limiting pattern ensures that a service accepts only a defined maximum number of requests during a window. This ensures that underline resources are used as per their limits and don't exhaust.

Retry

Retry pattern enables an application to handle transient failures while calling to external services. It ensures retrying operations on external resources a set number of times. If it doesn't succeed after all the retry attempts, it should fail and response should be handled gracefully by the application.

Bulkhead

Bulkhead ensures the failure in one part of the system doesn't cause the whole system down. It controls the number of concurrent calls a component can take. This way, the number of resources waiting for the response from that component is limited. There are two types of bulkhead implementation:
  • The semaphore isolation approach limits the number of concurrent requests to the service. It rejects requests immediately once the limit is hit.
  • The thread pool isolation approach uses a thread pool to separate the service from the caller and contain it to a subset of system resources.
The thread pool approach also provides a waiting queue, rejecting requests only when both the pool and queue are full. Thread pool management adds some overhead, which slightly reduces performance compared to using a semaphore, but allows hanging threads to time out.             

Build a Spring Boot Application with Resilience4j

In this article, we will build 2 services - Book Management and Library Management.
In this system, Library Management calls Book Management. We will need to bring Book Management service up and down to simulate different scenarios for the CircuitBreaker, RateLimit, Retry and Bulkhead features.

Prerequisites

  1. JDK 8
  2. Spring Boot 2.1.x
  3. resilience4j 1.1.x (latest version of resilience4j is 1.3 but resilience4j-spring-boot2 has latest version 1.1.x only)
  4. IDE like Eclipse, VSC or intelliJ (prefer to have VSC as it is very lightweight. I like it more compared to Eclipse and intelliJ)
  5. Gradle
  6. NewRelic APM tool ( you can use Prometheus with Grafana also)

Book Management service
  1. Gradle Dependency   
This service is a simple REST-based API and needs standard spring-boot starter jars for web and test dependencies. We will also enable swagger to test the API:
dependencies {
    //REST
    implementation 'org.springframework.boot:spring-boot-starter-web'
    //swagger
    compile group: 'io.springfox', name: 'springfox-swagger2', version: '2.9.2'
    implementation group: 'io.springfox', name: 'springfox-swagger-ui', version: '2.9.2'
    testImplementation 'org.springframework.boot:spring-boot-starter-test'
}
     2. Configuration
The configuration has only a single port as detailed configuration:
server:
    port: 8083

  1. Service Implementation
It has two methods addBook and retrieveBookList. Just for demo purposes, we are using an ArrayList object to store the book information:
@Service
public class BookServiceImpl implements BookService {

    List<Book> bookList = new ArrayList<>();

    @Override
    public String addBook(Book book) {
        String message  =   "";
        boolean status  =   bookList.add(book);
        if(status){
            message=    "Book is added successfully to the library.";
        }
        else{
             message=    "Book could not be added in library due to some technical issue. Please try later!";
        }
        return message;
    }

    @Override
    public List<Book> retrieveBookList() {
        return bookList;
    }
}

  1. Controller
Rest Controller has exposed two APIs - one is POST for adding book and the other is GET for retrieving book details:
@RestController
@RequestMapping("/books")
public class BookController {

    @Autowired
    private BookService bookService  ;

    @PostMapping
    public String addBook(@RequestBody Book book){
        return bookService.addBook(book);
    }

    @GetMapping
    public List<Book> retrieveBookList(){
        return bookService.retrieveBookList();
    }
}
  1. Test Book Management Service
Build and start the application by using the below commands:
//build application
gradlew build

//start application
java -jar build/libs/bookmanangement-0.0.1-SNAPSHOT.jar

//endpoint url
http://localhost:8083/books
Now we can test the application using Swagger UI - http://localhost:8083/swagger-ui.html
Ensure the service is up and running before moving to build the Library Management service.

Library Management service

In this service, we will be enabling all of the Resilience4j features.
  1. Gradle Dependency   
This service is also a simple REST-based API and also needs standard spring-boot starter jars for web and test dependencies. To enable CircuitBreaker and other resilience4j features in the API, we have added a couple of other dependencies like - resilience4j-spring-boot2, spring-boot-starter-actuator, spring-boot-starter-aop. We need to also add micrometer dependencies (micrometer-registry-prometheus, micrometer-registry-new-relic) to enable the metrics for monitoring. And lastly, we enable swagger to test the API:
dependencies {
    
    compile 'org.springframework.boot:spring-boot-starter-web'
    
    //resilience
    compile "io.github.resilience4j:resilience4j-spring-boot2:${resilience4jVersion}"
    compile 'org.springframework.boot:spring-boot-starter-actuator'
    compile('org.springframework.boot:spring-boot-starter-aop')
 
    //swagger
    compile group: 'io.springfox', name: 'springfox-swagger2', version: '2.9.2'
    implementation group: 'io.springfox', name: 'springfox-swagger-ui', version: '2.9.2'

    // monitoring
        compile "io.micrometer:micrometer-registry-prometheus:${resilience4jVersion}"
      compile 'io.micrometer:micrometer-registry-new-relic:latest.release'

    testImplementation 'org.springframework.boot:spring-boot-starter-test'
}
  1. Configuration
Here, we need to do a couple of configurations -
-  By default CircuitBreaker and RateLimiter actuator APIs are disabled in spring 2.1.x. We need to enable them using management properties. Refer those properties in the source code link shared at the end of the article. We also need to add the following other properties:
-  Configure NewRelic Insight API key and account id
management:
   metrics:
    export:
      newrelic:
        api-key: xxxxxxxxxxxxxxxxxxxxx
        account-id: xxxxx
        step: 1m
-  Configure resilience4j CircuitBreaker properties for "add" and "get" service APIs.
 resilience4j.circuitbreaker:
  instances:
    add:
      registerHealthIndicator: true
      ringBufferSizeInClosedState: 5
      ringBufferSizeInHalfOpenState: 3
      waitDurationInOpenState: 10s
      failureRateThreshold: 50
      recordExceptions:
        - org.springframework.web.client.HttpServerErrorException
        - java.io.IOException
        - java.util.concurrent.TimeoutException
        - org.springframework.web.client.ResourceAccessException
        - org.springframework.web.client.HttpClientErrorException
      ignoreExceptions:
-  Configure resilience4j RateLimiter properties for "add" service API.
resilience4j.ratelimiter:
  instances:
    add:
      limitForPeriod: 5
      limitRefreshPeriod: 100000
          timeoutDuration: 1000ms
-  Configure resilience4j Retry properties for "get" service API.
resilience4j.retry:
  instances:
    get:
      maxRetryAttempts: 3
      waitDuration: 5000
-  Configure resilience4j Bulkhead properties for "get" service API.
resilience4j.bulkhead:
  instances:
    get:
      maxConcurrentCall: 10
      maxWaitDuration: 10ms
Now, we will be creating a LibraryConfig class to define a bean for RestTemplate to make a call to Book Management service. We have also hardcoded the endpoint URL of Book Management service here. Not a good idea for a production-like application but the purpose of this demo is only to showcase the resilience4j features. For a production app, we may want to use the service-discovery service.
@Configuration
public class LibraryConfig {
    Logger logger = LoggerFactory.getLogger(LibrarymanagementServiceImpl.class);
    private static final String baseUrl = "https://bookmanagement-service.apps.np.sdppcf.com";

    @Bean
    RestTemplate restTemplate(RestTemplateBuilder builder) {
        UriTemplateHandler uriTemplateHandler = new RootUriTemplateHandler(baseUrl);
        return builder
                .uriTemplateHandler(uriTemplateHandler)
                .build();
   }
 
}
  1. Service
Service Implementation has methods which are wrapped with @CircuitBreaker,@RateLimiter, @Retry  and @Bulkhead annotations These all annotation supports fallbackMethod attribute and redirect the call to the fallback functions in case of failures observed by each pattern. We would need to define the implementation of these fallback methods:
This method has been enabled by CircuitBreaker annotation. So if /books endpoint fails to return the response, it is going to call fallbackForaddBook() method.
    @Override
    @CircuitBreaker(name = "add", fallbackMethod = "fallbackForaddBook")
    public String addBook(Book book){
        logger.error("Inside addbook call book service. ");
        String response = restTemplate.postForObject("/books", book, String.class);
        return response;
    }
This method has been enabled by RateLimiter annotation. If the /books endpoint is going to reach the threshold defined in a configuration defined above, it will call fallbackForRatelimitBook() method.
    @Override
    @RateLimiter(name = "add", fallbackMethod = "fallbackForRatelimitBook")
    public String addBookwithRateLimit(Book book){
        String response = restTemplate.postForObject("/books", book, String.class);
        logger.error("Inside addbook, cause ");
        return response;
    }
This method has been enabled by Retry annotation. If the /books endpoint is going to reach the threshold defined in a configuration defined above, it will call fallbackRetry() method.
    @Override
    @Retry(name = "get", fallbackMethod = "fallbackRetry")
    public List<Book> getBookList(){
        return restTemplate.getForObject("/books", List.class);
    }
This method has been enabled by Bulkhead annotation. If the /books endpoint is going to reach the threshold defined in a configuration defined above, it will call fallbackBulkhead() method.
    @Override
    @Bulkhead(name = "get", type = Bulkhead.Type.SEMAPHORE, fallbackMethod = "fallbackBulkhead")
    public List<Book> getBookListBulkhead() {
        logger.error("Inside getBookList bulk head");
        try {
            Thread.sleep(100000);
        } catch (InterruptedException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        }
        return restTemplate.getForObject("/books", List.class);
    }
Once the service layer is set up, we need to expose the corresponding REST APIs for each of the methods so that we can test them. For that, we need to create the RestController class.
  1. Controller
Rest Controller has exposed 4 APIs -
   - First one is a POST for adding a book
   - Second is again a POST for adding book but this will be used to demo Rate-Limit feature.
   - The third one is GET for retrieving book details.
   - Forth one is GET API for retrieving book details but enabled with bulkhead feature.
@RestController
@RequestMapping("/library")
public class LibrarymanagementController {

    @Autowired
    private LibrarymanagementService librarymanagementService;
    @PostMapping
    public String addBook(@RequestBody Book book){
        return librarymanagementService.addBook(book);
    }

    @PostMapping ("/ratelimit")
    public String addBookwithRateLimit(@RequestBody Book book){
        return librarymanagementService.addBookwithRateLimit(book);
    }

    @GetMapping
    public List<Book> getSellersList() {
        return librarymanagementService.getBookList();
    }
    @GetMapping ("/bulkhead")
    public List<Book> getSellersListBulkhead() {
        return librarymanagementService.getBookListBulkhead();
    }
}
Now, the code is ready. We have to build and bring it up and running.
  1. Build and Test Library Management Service
Build and start the application by using the below commands:
//Build
gradlew build

//Start the application
java -jar build/libs/librarymanangement-0.0.1-SNAPSHOT.jar

//Endpoint Url
http://localhost:8084/library
Now we can test the application using Swagger UI - http://localhost:8084/swagger-ui.html

Run Test Scenarios for CircuitBreaker RateLimiter Retry and Bulkhead

CircuitBreaker - Circuit Breaker has been applied to addBook API. To test if it's working, we will stop the Book Management service.
  • First, observe the health of the application by hitting http://localhost:8084/actuator/health URL.
  • Now stop the Book Management service and hit addBook API of Library Management service using swagger UI
At the first step, It should show the circuit breaker state as "CLOSED". This is Prometheus metrics which we enabled through the micrometer dependency.
After we execute the second step, it will start failing and redirecting to the fallback method.
Once it crosses the threshold, which in this case is 5, it will trip the circuit. And, each call after that will directly go to the fallback method without making an attempt to hit Book Management service. (You can verify this by going to logs and observe the logger statement. Now, we can observe the /health endpoint showing CircuitBreaker state as "OPEN".
{
    "status": "DOWN",
    "details": {
        "circuitBreakers": {
            "status": "DOWN",
            "details": {
                "add": {
                    "status": "DOWN",
                    "details": {
                        "failureRate": "100.0%",
                        "failureRateThreshold": "50.0%",
                        "slowCallRate": "-1.0%",
                        "slowCallRateThreshold": "100.0%",
                        "bufferedCalls": 5,
                        "slowCalls": 0,
                        "slowFailedCalls": 0,
                        "failedCalls": 5,
                        "notPermittedCalls": 0,
                        "state": "OPEN"
                    }                        
                }
            }
        }
    }
}
We have deployed the same code to PCF (Pivotal Cloud Foundry) so that we can integrate it with NewRelic to create the dashboard for this metric. We have used micrometer-registry-new-relic dependency for that purpose.

Image 2 - NewRelic Insight CircuitBreaker Closed Graph
Rate Limiter - We have created separate API (http://localhost:8084/library/ratelimit) having the same addBook functionality but enabled with Rate-Limit feature. In this case, we would need Book Management Service up and running. With the current configuration for the rate limit, we can have a maximum of 5 requests per 10 seconds.
Once we hit the API for 5 times within 10 seconds of time, it will reach the threshold and get throttled. To avoid throttling, it will go to the fallback method and respond based on the logic implemented there. Below graph shows that it has reached the threshold limit 3 times in the last hour:

Retry - Retry feature enables the API to retry the failed transaction again and again until the maximum configured value. If it gets succeeded, it will refresh the count to zero. If it reaches the threshold, it will redirect it to the fallback method defined and execute accordingly. To emulate this, hit the GET API (http://localhost:8084/library) when Book Management service is down. We will observe in logs that it is printing the response from fallback method implementation.
Bulkhead - In this example, we have implemented the Semaphore implementation of the bulkhead. To emulate concurrent calls, we have used Jmeter and set up the 30 user calls in the Thread group.
We will be hitting GET API () enabled with @Bulkhead annotation. We have also put some sleep time in this API so that we can hit the limit of concurrent execution. We can observe in the logs that it is going to the fallback method for some of the thread calls. Below is the graph for the available concurrent calls for an API:
Summary
In this article, we saw various features that are now a must in a microservice architecture, which can be implemented using one single library resilience4j. Using Prometheus with Grafana or NewRelic, we can create dashboards around these metrics and increase the stability of the systems.
As usual, the code can be found over Github -  spring-boot-resilience4j

Follow by Email

Followers

Total Pageviews

Popular Posts