Sunday, December 1, 2019


AWS recently launched its first serverless event bus, AWS EventBridge. Some say it is an extension of CloudWatch Events; others say it offers the same features as SNS.

In this article, we will look at what exactly the AWS EventBridge service is, where it can be used, and how it differs from CloudWatch Events, SNS, and Kinesis.

What is AWS EventBridge Service

EventBridge brings together your own (e.g. legacy) applications, SaaS (Software-as-a-Service) applications, and AWS services. It can stream real-time data from event sources such as PagerDuty and Datadog and route it to targets (AWS services) such as SQS, Lambda, and others.
At the time of writing, it supports 10 SaaS application partners and 90+ AWS services as event sources, and 17 AWS services as targets. This list will grow over the years.

In simple terms, AWS EventBridge is an event bus that supports the publish/subscribe model. Applications publish events to it, and it fans them out to multiple target services. The obvious question is: what is new here? The event bus is an old concept, and AWS itself already provides that functionality through CloudWatch Events.
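To make the publish/subscribe idea concrete, here is a minimal in-memory sketch in Python. It is not how EventBridge is implemented; the class name, the "crm.app" source, and the two list-based targets (standing in for an SQS queue and a Lambda function) are all made up for illustration.

```python
from collections import defaultdict
from typing import Callable, Dict, List

Event = Dict[str, object]
Target = Callable[[Event], None]


class InMemoryEventBus:
    """Toy publish/subscribe bus: one published event fans out to every subscriber."""

    def __init__(self) -> None:
        # Maps an event source name to the targets subscribed to it.
        self._targets: Dict[str, List[Target]] = defaultdict(list)

    def subscribe(self, source: str, target: Target) -> None:
        self._targets[source].append(target)

    def publish(self, event: Event) -> int:
        """Deliver the event to all targets for its source; return the delivery count."""
        targets = self._targets.get(str(event.get("source")), [])
        for target in targets:
            target(event)
        return len(targets)


queue_messages: List[Event] = []   # hypothetical stand-in for an SQS queue
function_calls: List[Event] = []   # hypothetical stand-in for a Lambda function

bus = InMemoryEventBus()
bus.subscribe("crm.app", queue_messages.append)
bus.subscribe("crm.app", function_calls.append)

delivered = bus.publish({"source": "crm.app", "detail": {"status": "resigned"}})
ignored = bus.publish({"source": "billing.app", "detail": {}})
```

The publisher never needs to know who consumes the event; adding a third target is one more subscribe call and no change on the publishing side.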

Genesis of EventBridge

EventBridge was introduced mainly to address the problems of integrating SaaS platforms with AWS services. In today's cloud world, SaaS platforms such as CRM systems and identity providers have become key partners. They generate a large amount of data and need to pass it to AWS for business processing. Before EventBridge, there were two main ways to send this event data to AWS:


Polling

In this solution, we generally set up a cron job or CloudWatch Scheduler to call the SaaS APIs. It checks whether any data has changed and then pulls the data. The polling frequency can range from minutes to hours, depending on the use case and on how much load the SaaS platform can bear. The typical flow looks like this:

This solution looks simple, but it brings two major issues:

  • Data freshness - The scheduler calls the API perhaps every few minutes or every hour, so it never delivers real-time data. There will always be a gap, which might not be acceptable in some business scenarios.
  • Cost and performance - If we reduce the polling interval to improve data freshness, the cost goes up because the number of calls increases. More resources are also consumed on the SaaS platform, which may lead to throttling and slow performance.
So, the overall recommendation is to avoid the polling mechanism if you can.
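The trade-off between the two issues above is simple arithmetic. The following Python sketch (the function name and the returned keys are my own) shows how the number of API calls per day grows as the polling interval shrinks, while the worst-case staleness is always one full interval.

```python
def polling_tradeoff(interval_minutes: float) -> dict:
    """Return API calls per day and worst-case staleness for a fixed polling interval."""
    if interval_minutes <= 0:
        raise ValueError("interval_minutes must be positive")
    minutes_per_day = 24 * 60
    return {
        # Every poll is one API call, whether or not the data changed.
        "calls_per_day": minutes_per_day / interval_minutes,
        # A change made right after a poll is only seen at the next poll.
        "worst_case_staleness_minutes": interval_minutes,
    }


hourly = polling_tradeoff(60)
every_minute = polling_tradeoff(1)
```

Going from hourly to once-a-minute polling improves freshness by a factor of 60, and it multiplies the load on the SaaS API by the same factor.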

SaaS Webhooks

This is another technique, and it eliminates the data freshness issue. Here, we expose an HTTP endpoint on the AWS-hosted application that the SaaS platform can call to send event data. The SaaS platform sets up webhooks and sends real-time data whenever records change. A typical use case looks like this:

In this flow, we still need to manage the application's public endpoint and protect it against security threats such as DDoS attacks. It also requires authentication handling. In AWS, this is mostly done with API Gateway or a WAF/ALB combination. We also need to write code to handle the events.

Looking at these shortcomings, AWS came up with the EventBridge service, which lets SaaS platforms create a native event source on the AWS side and establish a secure connection just by being given the AWS account ID and region. It not only solves the real-time data processing problem but also takes care of event ingestion and delivery, security, authorization, and error handling for you.


Now, let us look at the options available in AWS itself for event routing and how they compare with EventBridge. Here are the options for event routing:
  • CloudWatch Events
  • SNS
  • EventBridge
  • Kinesis
CloudWatch Events vs EventBridge

CloudWatch Events supports only AWS services as event sources and uses only the default event bus. The default event bus accepts events from AWS services, PutEvents API calls, and other authorized accounts. You can manage permissions on the default event bus to authorize other accounts.

On top of the default bus, EventBridge lets you create custom event buses and SaaS event buses. A custom event bus handles custom events raised through the PutEvents API. A SaaS event bus channels events triggered by SaaS platforms.
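As a rough sketch of what raising a custom event looks like, the Python snippet below builds one entry in the shape the PutEvents API accepts (EventBusName, Source, DetailType, and a JSON-string Detail). The bus name "hr-events", the source "com.example.crm", and the helper function are hypothetical names chosen for illustration; the actual call to AWS is only shown in a comment because it needs credentials and an existing bus.

```python
import json
from typing import Any, Dict


def build_custom_event(bus_name: str, source: str, detail_type: str,
                       detail: Dict[str, Any]) -> Dict[str, str]:
    """Build one PutEvents entry for a custom event bus."""
    return {
        "EventBusName": bus_name,
        "Source": source,
        "DetailType": detail_type,
        # PutEvents expects Detail as a JSON string, not a nested object.
        "Detail": json.dumps(detail),
    }


entry = build_custom_event(
    bus_name="hr-events",          # hypothetical custom bus
    source="com.example.crm",      # hypothetical event source
    detail_type="EmployeeResigned",
    detail={"employeeId": "42"},
)

# With AWS credentials configured, the entry could be sent with boto3:
#   import boto3
#   boto3.client("events").put_events(Entries=[entry])
```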

For the default bus, EventBridge leverages the CloudWatch Events API, so CloudWatch Events users can access their existing default bus, rules, and events in the new EventBridge console as well as in the CloudWatch Events console.

SNS vs EventBridge

SNS is a well-known publish/subscribe messaging service. It shines when throughput is very high, possibly millions of transactions per second (TPS). EventBridge supports only 400 requests per second at the time of writing.

However, SNS supports fewer types of targets than EventBridge.
For example, if an event needs to trigger Step Functions, SNS cannot do it directly because Step Functions is not available as a target. It has to invoke a Lambda function, which in turn triggers Step Functions. EventBridge, on the other hand, supports 17 targets as of now. However, each rule in EventBridge can have a maximum of 5 targets.

SNS scales practically infinitely, but filtering is limited to message attributes, not event content. SNS also does not guarantee the ordering of messages.
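To show what filtering on event content means, here is a simplified Python sketch of content-based matching in the style of an EventBridge rule pattern: each pattern field lists its accepted values, and nested objects are matched recursively. It handles exact-value matching only and is my own approximation, not the service's matching engine; the sample events are illustrative.

```python
from typing import Any, Dict


def matches(pattern: Dict[str, Any], event: Dict[str, Any]) -> bool:
    """Return True when every field named in the pattern matches the event."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        actual = event[key]
        if isinstance(expected, dict):
            # Nested pattern: descend into the nested event object.
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        elif actual not in expected:
            # Leaf pattern: a list of accepted values.
            return False
    return True


pattern = {"source": ["aws.ec2"], "detail": {"state": ["stopped", "terminated"]}}
stopped = {"source": "aws.ec2", "detail": {"state": "stopped", "instance-id": "i-0123"}}
running = {"source": "aws.ec2", "detail": {"state": "running", "instance-id": "i-0123"}}

stopped_matches = matches(pattern, stopped)
running_matches = matches(pattern, running)
```

The pattern reaches into the body of the event ("detail.state"), which is exactly what attribute-only filtering cannot do without the publisher copying that value into a message attribute.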

Kinesis vs EventBridge

Kinesis can be used for event routing as well as event storage. It is an ideal solution for processing real-time data at scale. It can fan out to multiple consumers; however, there is a limit on the number of consumers that can connect to a single stream. Each consumer is also responsible for filtering out any messages it is not interested in.

Kinesis also provides ordering guarantees. However, its pricing model is not entirely usage-based, and it does not automatically scale to demand.

On the other hand, EventBridge cannot buffer events. It needs SQS or Kinesis integration for event storage.

Use Cases

Let's take a couple of use cases and see how they will be implemented using SNS and EventBridge.

1.  Suppose I want to build a system where, if an EC2 instance goes down, the instance is rebooted and a Lambda function is triggered to record the incident in a DynamoDB table.

If I build it using SNS as the event routing service, I also need SQS, because EC2 cannot subscribe to SNS directly. Here is the design for this solution:

If we implement the same use case using EventBridge, the design will be like this:

We can see that the design is much simpler: we can implement it with fewer services.

2.  Let's take another use case: an employee resigns from the organization and their record is updated in the CRM tool. This needs to trigger different workflows for all the approvals in the exit checklist.

If we implement this use case using SNS, the design will look something like this:

If we use EventBridge, the design is much simpler. It doesn't need polling, CloudWatch Scheduler, or Lambda functions. The design will look something like this:

Things to Remember

Now we understand what the EventBridge service is and how it can simplify our designs in AWS. Let's keep a few things in mind while using this service:
  • Pricing for EventBridge is the same as for CloudWatch Events: $1 per million events published to the event bus.
  • CloudFormation is not yet supported for custom/SaaS event buses; this feature is yet to be released. It is supported for the default bus.
  • EventBridge tries to ensure successful delivery to targets. If delivery fails, it retries for up to 24 hours before marking the event as failed. For Lambda, successful delivery from the EventBridge perspective means that it was able to asynchronously invoke your function, i.e. the Lambda service returned success for the invoke call. From that point on, you rely on the standard Lambda retry policy for failures within the asynchronous invoke flow.
  • EventBridge makes it seamless to connect an AWS service in one AWS account to an AWS service in a different account. It has a target option called "event bus from another account".
  • EventBridge needs SQS for resiliency, whereas Kinesis has that feature built in.
  • A word of warning on the event bus: it is very hard for consumers to use without some kind of event schema registry. An event schema registry makes it possible to search for an event type and to version the schemas, so consumers and publishers understand what they are working with.


In this article, we saw how EventBridge helps integrate SaaS platforms with AWS services. Integration between existing AWS services has also become simpler and smoother. However, from a security perspective, not much is documented about SaaS platform integration. For enterprise-level companies this matters a lot, as we are simply giving our AWS account ID and region to the vendor. I hope the documentation will mature over time.

    Sunday, November 3, 2019

    AWS introduced the Serverless Application Model (SAM) to make it easier to build serverless applications on AWS. At a high level, it is an extension of AWS CloudFormation. AWS SAM consists of two major components: the SAM template specification and the SAM CLI.

    In this article, we are going to talk about 5 tips to get the most out of AWS SAM templates:

    1. Reuse Common Patterns Using Nested Applications

    As serverless architectures grow, common patterns get reimplemented across teams and projects. This hurts development velocity and leads to wasted effort. To avoid that, AWS has come up with nested applications in AWS SAM and the AWS Serverless Application Repository (SAR) to make these patterns shareable publicly and privately.

    Nested applications are built off a concept in AWS CloudFormation called nested stacks. Serverless applications are deployed as stacks that contain one or more other serverless application stacks.

    Let's take an example. If we want to build an API made up of AWS Lambda and Amazon API Gateway, we can use AWS SAM to create the Lambda functions, configure API Gateway, and deploy and manage them both. Now we want to secure this API. API Gateway has several methods for doing this. Let's say we want to implement HTTP Basic Auth. Instead of building this functionality from scratch, we can search for it in SAR.

    We can review the AWS SAM template, license, and permissions provided in SAR. If the application meets our requirements, we copy its SAM template snippet and put it in our application's SAM template under the "Resources" section.
    These applications have the type AWS::Serverless::Application.

    Now we can take the output of this application and reference it in the main API's authorizer like below:

    With this simple piece of code, we can reuse any existing code that resides in SAR.

    2.  Reduce Lines of Code Using Globals

    Serverless architecture is all about breaking a large application into smaller functions. If we end up with 10-20 Lambda functions for our application and configure them all in the SAM template, we will see many near-identical function definitions under the "Resources" section.

    In an AWS SAM template, Globals is a section for defining properties common to all serverless functions and APIs.

    All the AWS::Serverless::Function and AWS::Serverless::Api resources will inherit the properties defined here. Here is a typical example:
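To make the inheritance concrete, the Python sketch below mimics how a Globals section combines with one function's own properties: primitive values on the resource override the global value, maps are merged, and lists are combined with the global entries first. This is my simplified reading of the SAM behaviour, not SAM's code, and the property values (runtime, timeout, variable names) are invented for illustration.

```python
from typing import Any, Dict


def apply_globals(globals_section: Dict[str, Any],
                  resource_props: Dict[str, Any]) -> Dict[str, Any]:
    """Return the effective properties of one resource after inheriting Globals."""
    merged: Dict[str, Any] = dict(globals_section)
    for key, value in resource_props.items():
        inherited = merged.get(key)
        if isinstance(inherited, dict) and isinstance(value, dict):
            merged[key] = apply_globals(inherited, value)   # maps are merged
        elif isinstance(inherited, list) and isinstance(value, list):
            merged[key] = inherited + value                  # lists: global entries first
        else:
            merged[key] = value                              # primitives are overridden
    return merged


# Hypothetical Globals -> Function section.
function_globals = {
    "Runtime": "python3.7",
    "Timeout": 30,
    "Environment": {"Variables": {"TABLE_NAME": "orders"}},
}
# Hypothetical properties of a single function resource.
resource = {
    "Handler": "app.handler",
    "Timeout": 60,
    "Environment": {"Variables": {"LOG_LEVEL": "DEBUG"}},
}

effective = apply_globals(function_globals, resource)
```

Here the function keeps the shared runtime and TABLE_NAME variable, adds its own LOG_LEVEL, and overrides only the timeout, so each function definition only has to state what makes it different.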

    3. Enable a feature using SAM Parameter and Mappings

    Suppose we want to enable a feature in our application based on a flag set in an environment variable. It can get complicated if the flag depends on the deployment environment, such as testing or prod.
    For example, I have a "Download PDF" feature that I want to enable only in the test environment for now, and I don't want to release it to prod until the business approves it. The flow is that the environment value (testing/prod) is passed in as a parameter. A collection object such as a map holds the status value (on/off) for each environment. Logic is then needed to retrieve the status dynamically and set it in an environment variable so that the application can behave accordingly.
    With the SAM Parameters and Mappings sections, this can be done very easily. See the SAM template below for how it is implemented.

    Here, !FindInMap looks up the DocumentEnvironment parameter value (testing/staging/prod) in the DownloadPDFFeature map, retrieves the "status" field, and assigns it to the "Download_Feature1" variable.
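The lookup described above can be traced with a few lines of Python. This is only an imitation of the three-argument !FindInMap lookup, written to show the data flow; the on/off values per environment are assumed (the article only states that the feature is on in testing and not yet released to prod), and the helper names are mine.

```python
from typing import Dict

Mappings = Dict[str, Dict[str, Dict[str, str]]]


def find_in_map(mappings: Mappings, map_name: str,
                top_level_key: str, second_level_key: str) -> str:
    """Imitate !FindInMap [MapName, TopLevelKey, SecondLevelKey]."""
    return mappings[map_name][top_level_key][second_level_key]


# Assumed contents of the Mappings section.
mappings: Mappings = {
    "DownloadPDFFeature": {
        "testing": {"status": "on"},
        "staging": {"status": "off"},   # assumption: not stated in the article
        "prod": {"status": "off"},
    }
}


def environment_variables(document_environment: str) -> Dict[str, str]:
    """Resolve the function's environment variables for one deployment environment."""
    status = find_in_map(mappings, "DownloadPDFFeature", document_environment, "status")
    return {"Download_Feature1": status}


testing_env = environment_variables("testing")
prod_env = environment_variables("prod")
```

Releasing the feature to prod later means flipping one value in the map; the function definition and the application code stay untouched.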

    4. Safe Deployment using AutoPublishAlias and DeploymentPreference

    AWS Lambda has versioning and alias features that help with incremental deployment of functions. When a function is enhanced, it can be published as a new version, and an alias can point to the version we want to expose to consumers. Below is the example:

    In SAM, we can bump up this version with just one property: AutoPublishAlias.

    AWS SAM will do the following 3 tasks behind the scenes:

    • Detect when new code is being deployed, based on changes to the Lambda function's Amazon S3 URI.
    • Create and publish an updated version of that function with the latest code.
    • Create an alias with the name provided through the ENVIRONMENT parameter and point it to the updated version of the Lambda function.

    Apart from this, the DeploymentPreference property enables canary and linear deployment strategies, which ensure that if there is any problem with the newer version of the function, it can be rolled back. For example, in the above code, only 10% of the traffic is redirected to the new version for the first 10 minutes, and the share then increases by 10% every 10 minutes. This lets us monitor the new version and roll back if any issues are found.
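The linear strategy described above is easy to picture as a schedule. The Python sketch below computes when each traffic increment lands, assuming the first step is applied at minute 0; the function name and parameters are my own and it only models the timing, not the actual traffic shifting or rollback.

```python
from typing import List, Tuple


def linear_shift_schedule(step_percent: int, interval_minutes: int) -> List[Tuple[int, int]]:
    """Return (minute, percent of traffic on the new version) pairs until 100% is reached."""
    if step_percent <= 0 or interval_minutes <= 0:
        raise ValueError("step_percent and interval_minutes must be positive")
    schedule: List[Tuple[int, int]] = []
    minute, percent = 0, 0
    while percent < 100:
        # Shift one more step of traffic, never exceeding 100%.
        percent = min(100, percent + step_percent)
        schedule.append((minute, percent))
        minute += interval_minutes
    return schedule


schedule = linear_shift_schedule(step_percent=10, interval_minutes=10)
```

With 10% steps every 10 minutes, the new version takes all traffic after ten steps, which gives us a window of roughly an hour and a half to spot a problem while part of the traffic is still on the old version.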

    5. Enable Security using Policy Templates

    For any Lambda function, the most important thing is to define what it is allowed to do and who can invoke it. That is where we define the security around it. SAM has a concept called policy templates. With these templates, 2 lines of code in the SAM template expand into a complete IAM policy.

    In the above example, we have used the DynamoDBCrudPolicy template, which corresponds to the IAM policy below.

    Using this feature, we can reduce the complexity of the security configuration and standardize the security of our Lambda functions.


    AWS SAM was introduced to reduce the complexity of building serverless applications, and that is evident from the basic tips above. SAM has several other features, but we will save them for the next article. That's all for now.

    Friday, August 30, 2019

    Most modern applications are now developed using either serverless or container technology. However, it is often difficult to choose the one best suited to a particular requirement.

    In this article, we will try to understand how the two differ and in which scenarios to use one or the other.

    Let us first start with understanding the basics of Serverless and Container technology.

    What is Serverless?

    Serverless is a development approach that replaces long-running virtual machines with computing power that comes into existence on demand and disappears immediately after use.
    Despite the name, there certainly are servers involved in running your application. It’s just that your cloud service provider, whether it’s AWS, Azure, or Google Cloud Platform, manages these servers, and they’re not always running.
    It tries to resolve the following issues:
    • Unnecessary charges for keeping the server up even when we are not consuming any resources.
    • Overall responsibility for maintenance and uptime of the server.
    • Responsibility for applying the appropriate security updates to the server.
    • As our usage grows, we need to manage scaling the server up, and scaling it back down when usage drops.

    What are Containers?

    A container is a lightweight, stand-alone, executable package of a piece of software that includes everything needed to run it: code, runtime, system tools, system libraries, and settings.
    Containers solve the problem of running software reliably when it is moved from one computing environment to another, essentially by isolating it from its environment. For instance, containers allow you to move software from development to staging and from staging to production and have it run reliably regardless of the differences between the environments.

    Comparison between Serverless vs Containers

    To start with, it's worth saying that both serverless and containers point to an architecture that is designed for future change and for leveraging the latest innovations in cloud computing. Although people often talk about Serverless Computing vs Docker Containers, the two are different technologies that serve different purposes. They do share a few points:
    1. Less overhead
    2. High performance
    3. Less interaction needed at the infrastructure level for provisioning.
    Although serverless is the more recent innovation, both technologies have their disadvantages and, of course, benefits that make them useful and relevant. So let's review the two.

    Longevity
    Lambda functions are short-lived. Once a function has executed, it spins down. Lambda has a timeout threshold of 15 minutes, so long-running workloads cannot run on it directly. Step Functions can be used to break long-running application logic into smaller steps (functions), but this might not apply to every kind of long-running application.
    ECS containers are long-running. They can run as long as you want.
    Throughput
    If an application has high throughput, say 1 million requests per day, Lambda costs more than container solutions. The first reason is that such a workload needs more memory and has a higher execution time; because Lambda charges based on memory and execution time, the cost multiplies. The second reason is that one function can have a maximum of 3 GB of memory, which might not be enough to handle the high throughput, so it needs concurrent executions, which may introduce latency due to cold starts.
    ECS uses EC2 instances to host the applications. EC2 can handle high throughput more effectively than serverless functions because a range of instance types is available to match the throughput requirement. Its cost will be comparatively lower. Latency will also be better if a single EC2 instance can handle the load.
    For lower throughput, Lambda is a good choice in terms of cost, performance, and time to deploy.
    EC2 also works very well for lower throughput. When comparing it with Lambda, consider the other factors described in this table.
    Scaling
    Lambda has auto-scaling as a built-in feature.
    • It scales functions through concurrent executions.
    • However, there is a maximum limit (1000 concurrent executions) at the account level.
    • Lambda's horizontal scaling is very fast; however, there is a small amount of latency due to cold starts.
    Containers don't have any constraints on scaling. However:
    • We need to forecast the scaling requirements.
    • Scaling has to be designed and configured manually or automated through scripts.
    • Scaling containers is slower than scaling Lambda.
    • The more worker nodes we have, the more maintenance problems arise, such as handling latency and throttling issues.
    Time to Deploy
    Lambda functions are smaller and take significantly less time to deploy than containers: milliseconds, compared to seconds for containers.
    Containers take significant time to configure and set up initially because they require system settings and libraries. However, once configured, they take seconds to deploy.
    Cost
    In a serverless architecture, infrastructure is not used unless the application is invoked, so we are charged only for the server capacity the application uses while it runs. This can be cost-effective in scenarios such as:
    • The application is used rarely (once or twice a day).
    • The application needs to scale up and down frequently because the request throughput changes often.
    • The application needs few resources to run. Because Lambda cost depends on memory and execution time, it always wins when compared with the cost of a container running 24 hours a day.
    Containers are constantly running, so cloud providers charge for the server space even if no one is using the application at the time.

    If throughput is high, containers are more cost-effective than Lambda.

    Compared with an EKS cluster, an ECS cluster is free.
    Security
    For Lambda, system security is taken care of by AWS itself. We only need to handle application-level security using IAM roles and policies. However, if Lambda has to run in a VPC, VPC-level security also applies.
    For containers, we are also responsible for applying the appropriate security updates to the server. This includes patching the OS and upgrading software and libraries.
    ECS supports IAM Roles for Tasks, which is a great way to grant containers access to AWS resources, for example to allow containers to access S3, DynamoDB, SQS, or SES at runtime. EKS doesn't provide IAM-level security at the pod level.
    Vendor Lock-in
    Serverless functions bring vendor lock-in: moving a Lambda function to Azure Functions would need significant changes at the code and configuration level.
    Containers are designed to run on any cloud platform that supports container technologies, which brings the benefit of build once, run anywhere. However, the services used for security (IAM, KMS, security groups, and others) are tightly coupled with AWS, so moving this workload to other platforms would need some rework.
    Infrastructure Control
    If a team doesn't have infrastructure skills, Lambda is a good option. The team can concentrate on developing business logic and let AWS handle the infrastructure.
    With containers, we get full control of the server, OS, and network components. We can define and configure them within the limits set by the cloud provider. So, if an application or system needs fine-grained control of the infrastructure, this solution works better.
    Maintenance
    Lambda doesn't need any maintenance work, as everything at the server level is taken care of by AWS.
    Containers need maintenance such as patching and upgrades, which also requires skilled resources. Keep this in mind when choosing this architecture for deployment.
    State persistence
    Lambda is designed to be stateless, so it does not maintain any state between invocations. It is short-lived. For this reason we cannot use caching with it, which may cause latency problems.
    Containers can leverage the benefits of caching.
    Latency & Startup Time
    For Lambda, cold start and warm start times are key factors to consider, as they may cause latency and add to the cost of executing functions.
    Containers are always running, so they don't have cold/warm start times. Caching can also reduce latency.
    Compared with EKS, ECS doesn't have any proxy concept at the node level. Load balancing happens directly between the ALB and the EC2 instances, so there is no extra hop of latency.
    VPC & ENI
    If Lambda is deployed in a VPC, its concurrent execution is limited by the ENI capacity of the subnets.
    The number of ENIs per EC2 instance is limited to between 2 and 15, depending on the instance type.
    In ECS, each task is assigned a single ENI, so we can have a maximum of 15 tasks per EC2 instance with ECS.
    Monolith Applications
    Lambda is not a fit for monolithic applications. It cannot run complex types of application.
    ECS can be used to run a monolithic application.
    Testing
    Testing is difficult in serverless-based web applications, as it is often hard for developers to replicate the backend environment locally.
    Since containers run on the same platform wherever they are deployed, it is relatively simple to test a container-based application before deploying it to production.
    Monitoring
    Lambda monitoring can be done through CloudWatch and X-Ray. We need to rely on the cloud vendor to provide monitoring capabilities. However, infrastructure-level monitoring is not required in this case.
    Container monitoring requires capturing availability, system error, performance, and capacity metrics in order to configure HA for the container applications.
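The cost argument in the comparison above (Lambda bills on memory and execution time, so the two multiply) can be checked with a rough estimator. The Python sketch below is a back-of-the-envelope calculation, not an official calculator: the default prices are illustrative assumptions that should be replaced with the current figures for your region, and the free tier is ignored.

```python
def lambda_monthly_cost(requests_per_day: int,
                        avg_duration_ms: float,
                        memory_mb: int,
                        price_per_million_requests: float = 0.20,    # assumed price
                        price_per_gb_second: float = 0.0000166667,   # assumed price
                        days: int = 30) -> float:
    """Estimate a monthly Lambda bill from request count, duration, and memory size."""
    requests = requests_per_day * days
    # Compute usage is billed as memory (GB) multiplied by execution time (seconds).
    gb_seconds = requests * (avg_duration_ms / 1000) * (memory_mb / 1024)
    request_cost = requests / 1_000_000 * price_per_million_requests
    compute_cost = gb_seconds * price_per_gb_second
    return request_cost + compute_cost


# Same 1 million requests per day, two very different functions.
light = lambda_monthly_cost(1_000_000, avg_duration_ms=100, memory_mb=128)
heavy = lambda_monthly_cost(1_000_000, avg_duration_ms=1000, memory_mb=3072)
```

Under these assumed prices the light function costs a few dollars a month, while the heavy one costs more than a hundred times as much for the same request count. That is the point at which a container running around the clock at a fixed price starts to look cheaper.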

    When to use Serverless

    Serverless computing is a perfect fit for the following use cases:
    1. The application team doesn't want to spend much time thinking about where and how its code runs.
    2. The team doesn't have skilled infrastructure resources and is worried about the cost of maintaining servers and the resources the application consumes.
    3. The application's traffic pattern changes frequently. Serverless handles this automatically and even shuts down when there is no traffic at all.
    4. Serverless websites and applications can be written and deployed without the work of setting up infrastructure. As such, it is possible to launch a fully functional app or website in days using serverless.
    5. The team needs a small batch job that can finish within Lambda limits.

    When to use Containers

    Containers are best for application deployment in the following use cases:
    • If the team wants to use the operating system of its choice and have full control over the installed programming language and runtime version.
    • If the team wants to use software with specific version requirements, containers are a great place to start.
    • If the team is okay with bearing the cost of big, traditional servers for things such as web APIs, machine learning computations, and long-running processes, it might want to try containers as well (they will cost less than servers anyway).
    • If the team wants to develop new container-native applications.
    • If the team needs to refactor a very large and complicated monolithic application, containers are the better choice, as they are better suited to complex applications.


    In a nutshell, we learned that both technologies are good and can complement each other rather than compete. They solve different problems and should be chosen wisely. If you need help designing and architecting your application, reach out to me.
