AWS Lambda vs EC2

Introduction

AWS Lambda has gained good traction for building applications on AWS. But, is it really the best fit for all use cases? 
Since its introduction in 2014, Lambda has seen enthusiastic adoption - by startups and enterprises alike. There is no doubt that it marks a significant evolution in cloud computing, leveraging the possibilities of the cloud to offer distinct advantages over a more traditional model such as EC2.  
In this article, we will compare EC2 and Lambda fairly across several cloud-native aspects. Let's begin with a quick reminder of what these two services offer, and how they differ.

What is AWS EC2?

Amazon Elastic Compute Cloud (EC2) service was introduced to ease the provision of computing resources for developers. Its primary job is to provide self-service, on-demand, and resilient infrastructure.
- It reduces the time required to spin up a new server to minutes, from the days or weeks it might have taken in the on-premises world.
- It can scale up and down quickly based on the computing requirement.
- It provides an interface to configure capacity with minimal effort.
- It allows complete admin access to the servers, making infrastructure management straightforward.
- It also enables monitoring, security, and support for multiple instance types (a wide variety of operating systems, memory sizes, and CPUs).

What is AWS Lambda?

AWS Lambda was launched to eliminate infrastructure management of computing. It enables developers to concentrate on writing the function code without having to worry about provisioning infrastructure. We don't need to do any forecasting of the resources (CPU, Memory, Storage, etc.). It can scale resources up and down automatically. It is the epitome of Serverless Architecture.
Before we start comparing the different features of both of these services, let's understand a few key things about Lambda:
- Lambda was designed to be an event-based service which gets triggered by events like a new file being added to an S3 bucket, a new record added in a DynamoDB table, and so on. However, it can also be invoked through API Gateway to expose the function code as a REST API.
- It was introduced to reduce the idle time of computing resources when the application is not being used.
- Lambda logs can be monitored the same way as EC2, through CloudWatch.
- Lambda local development is generally done using AWS SAM or Serverless Framework. They use CloudFormation for deployment.
- Unlike EC2, it is charged based on the execution time and memory used.
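
To make the event-driven model concrete, here is a minimal sketch of a Python function reacting to the S3 trigger mentioned above. It assumes the standard S3 notification layout (a list of "Records", each with a bucket name and an object key). The name `handler` and the shape of the returned summary are illustrative choices, not something Lambda prescribes beyond the (event, context) signature.

```python
import urllib.parse


def handler(event, context):
    """Collect the bucket and key of every object reported in an S3 event."""
    uploaded = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded (spaces become '+'), so decode them.
        key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])
        uploaded.append({"bucket": bucket, "key": key})
    return {"processed": len(uploaded), "objects": uploaded}
```

In a real function, the loop body is where the file would be downloaded and processed; here it only records what arrived.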

Now, let’s take a deeper look at how Lambda and EC2 differ from each other in terms of performance, cost, security, and other aspects:

Setup & Management

For setting up a simple application on EC2, first, we need to forecast how much capacity the application would need. Then, we have to configure it to spin up the Virtual Machine. 
After that, we need to set up a bastion server to SSH securely into the VM and install the required software, web server, and so on. We also need to manage scaling by setting up an Auto Scaling group. And that's not all: an ALB (Application Load Balancer) also needs to be set up to balance the load when the application runs on multiple EC2 instances.
In the case of Lambda, we don't need to worry about provisioning VMs, installing software, scaling, or load balancing. It is all handled by the Lambda service. We just need to package the code and deploy it to Lambda. Scaling is automated: we only configure the maximum number of concurrent executions we want to allow for a function. Load balancing is handled by Lambda itself.
So here, we can see Lambda is a clear winner.
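
As an illustration of how little configuration that is, the sketch below caps a function's concurrency. It assumes the caller passes in a client created with `boto3.client("lambda")`, whose `put_function_concurrency` call sets reserved concurrency. The wrapper name `limit_concurrency` and the function name in the usage note are hypothetical.

```python
def limit_concurrency(lambda_client, function_name, max_concurrent):
    """Cap how many copies of a function may run at the same time."""
    if max_concurrent < 0:
        raise ValueError("max_concurrent must be zero or positive")
    # lambda_client is assumed to be boto3.client("lambda").
    return lambda_client.put_function_concurrency(
        FunctionName=function_name,
        ReservedConcurrentExecutions=max_concurrent,
    )
```

A call such as `limit_concurrency(client, "my-function", 50)` would then be the whole of the scaling setup for that function.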

On-Demand vs. Always Available

With EC2, we pay for the time the instances are up and running. With Lambda, we pay only for the time the functions are actually executing, because Lambda environments are brought up and spun down automatically in response to event sources and triggers. This is something we don't get out of the box with EC2. So, while an EC2 instance is always available, Lambda is available on demand, per invocation. The advantage goes to Lambda, since we no longer pay for the idle time between invocations, which can save a lot of money in the long run.

Performance

There are various aspects to cover when we take performance into consideration. Let's discuss them one by one. 

1. Concurrency and Scaling

With EC2, we have full control over concurrency and scaling. We can use EC2 Auto Scaling groups to define the policies for scaling up and down. These policies consist of conditions (average threshold limits) and actions (the number of instances to add or remove).
However, it takes a lot of effort to identify the threshold limits and accurately forecast the number of instances required. It can only be done by carefully monitoring the metrics (CloudWatch, New Relic, etc.).
With Lambda, by contrast, concurrency and scaling are built in. We simply define the maximum number of concurrent executions a function is restricted to. There are a few limitations, though. At the time of writing, a function can have at most 3 GB of memory, so a program that needs to scale vertically for memory cannot go beyond that. For horizontal scaling, the limit is 1,000 concurrent executions by default. If the Lambda function is deployed in a VPC, it is further restricted by the number of IP addresses available in the subnets allocated to it.
So, EC2 gives you more flexibility but requires manual configuration and forecasting. Lambda is designed to do all of that by itself but has a few limitations.
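
A rough way to check whether a workload fits inside the concurrency limit is to multiply the request rate by the average execution time, which approximates how many executions are in flight at once. The sketch below assumes steady traffic. The function names are hypothetical, and the limit of 1,000 is simply the figure quoted above.

```python
import math

CONCURRENCY_LIMIT = 1000  # figure quoted in the text above


def required_concurrency(requests_per_second, avg_duration_seconds):
    """Estimate simultaneous executions: arrival rate times time in flight."""
    in_flight = requests_per_second * avg_duration_seconds
    # Round first so floating-point noise does not push ceil() up by one.
    return math.ceil(round(in_flight, 6))


def fits_within_limit(requests_per_second, avg_duration_seconds,
                      limit=CONCURRENCY_LIMIT):
    """True if the estimated concurrency stays at or below the limit."""
    return required_concurrency(requests_per_second, avg_duration_seconds) <= limit
```

For example, 200 requests per second at 500 ms each need roughly 100 concurrent executions, well within the limit, while 5,000 requests per second at the same duration would need about 2,500.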

2. Dependencies

It is almost impossible to run an application without external libraries. With EC2, there is no constraint on the number of dependencies an application can have. However, the more dependencies an application has, the longer it takes to start, and the more load it puts on the CPU.
With Lambda, on the other hand, there are constraints on the maximum package size: 50 MB (zipped, for direct upload) and 250 MB (unzipped, including layers). Sometimes these sizes are not sufficient, especially for ML programs that need a lot of third-party libraries.
AWS recommends using the /tmp directory to download and install the dependencies at function runtime. However, downloading all the dependencies from scratch can take significant time whenever a new container is created. So this option works well when the Lambda container is warm most of the time; otherwise, it can cause a long cold start on each invocation. Also, the /tmp directory can hold only 512 MB by default, so this approach is again suitable for limited use only.
EC2 is a clear winner here.
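
Before deploying, it can help to check a build directory against the package limits quoted above. The following is a small sketch of such a check. The helper names are hypothetical, and treating "MB" as binary megabytes (1024 * 1024 bytes) is an assumption.

```python
import os

# Limits quoted in the text, interpreted as binary megabytes (an assumption).
MAX_ZIPPED_BYTES = 50 * 1024 * 1024
MAX_UNZIPPED_BYTES = 250 * 1024 * 1024


def directory_size(path):
    """Total size in bytes of all regular files under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = os.path.join(root, name)
            if not os.path.islink(file_path):
                total += os.path.getsize(file_path)
    return total


def check_package(unzipped_dir, zip_path=None):
    """Return a list of limit violations (empty if the package fits)."""
    problems = []
    unzipped = directory_size(unzipped_dir)
    if unzipped > MAX_UNZIPPED_BYTES:
        problems.append(f"unzipped size {unzipped} exceeds {MAX_UNZIPPED_BYTES} bytes")
    if zip_path is not None and os.path.getsize(zip_path) > MAX_ZIPPED_BYTES:
        problems.append(f"zip size exceeds {MAX_ZIPPED_BYTES} bytes")
    return problems
```

Running `check_package("build/")` in a CI step would flag an oversized package before the deployment fails.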

3. Latency

Comparing the latency between EC2 and Lambda is not straightforward. It depends on the use cases. So, let’s try to get a clearer picture by going through a few examples.
Let's take the first example, where the application is used only a few times a day, at intervals of 2-3 hours.
If we use EC2, the instance will be running for the whole day. Latency will be high for the first request but comparatively low for all subsequent requests, because provisioning the EC2 instance involves running all the scripts that set up the OS, software, EBS volumes, and so on.
If we use Lambda, the application doesn't need to be running all day. A Lambda container can be spun up for each request. However, this involves cold start time, which is small compared to the EC2 instance setup time. So, for this use case, Lambda will have higher latency per request than EC2, but significantly lower latency than EC2's first request. To reduce cold starts, some teams create a Lambda function that periodically calls the application's Lambda functions to keep them warm. However, this increases the bill, so you need to strike a balance.
Let's take another example, where the application needs to scale up and down frequently. In this case, EC2 will need to scale up to handle the increased volume of requests, which will impact the latency of those requests. With Lambda, scaling will be comparatively fast and latency will be lower.
So, latency will ultimately depend on the use cases and other local factors (cold start time for Lambda and resources setup time for EC2).
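
The keep-warm approach and cold start detection can be sketched in a few lines. This relies on the fact that module-level code runs once per container, so a module-level flag is true only on the first invocation after a cold start. The "warmup" event key and the `do_work` placeholder are hypothetical names chosen for illustration.

```python
import time

# Module-level code runs once per container, so this flag is True only
# for the first invocation after a cold start.
_is_cold_start = True


def do_work(event):
    """Placeholder for the real application logic."""
    return "ok"


def handler(event, context):
    global _is_cold_start
    cold = _is_cold_start
    _is_cold_start = False

    # "warmup" is a hypothetical marker sent by the scheduled keep-warm caller.
    if isinstance(event, dict) and event.get("warmup"):
        return {"warmed": True, "cold_start": cold}

    started = time.perf_counter()
    result = do_work(event)
    elapsed_ms = (time.perf_counter() - started) * 1000
    return {"result": result, "cold_start": cold, "elapsed_ms": elapsed_ms}
```

Returning early on warm-up pings keeps those invocations short, which limits how much they add to the bill.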

4. Timeout

Lambda has stricter timeout constraints than EC2. If we have long-running workloads, Lambda may not be the right choice, as it has a hard timeout limit of 15 minutes. EC2 has no such restriction.
AWS has introduced Step Functions to work around the Lambda timeout constraint. Also, if a Lambda function is exposed as a REST API through API Gateway, the gateway imposes its own timeout limit of 29 seconds.
Timeouts are caused not only by these limits but also by integrations with downstream systems, and that can happen with both EC2 and Lambda.
One more thing to note in the EC2 case is that if we don't configure security groups appropriately, it may also cause timeout errors.
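
One way to live with the Lambda timeout is to stop work shortly before it is reached and hand the remainder to a later invocation. The sketch below uses `get_remaining_time_in_millis()`, which the Lambda context object provides in the Python runtime. The function name `process_batch`, the `process_item` callback, and the 10-second safety margin are assumptions for illustration.

```python
def process_batch(items, context, process_item, safety_margin_ms=10_000):
    """Process items until the time budget is nearly spent.

    Returns (done_count, remaining_items) so the caller can re-queue the rest.
    """
    for index, item in enumerate(items):
        # Stop early if the function is about to hit its timeout.
        if context.get_remaining_time_in_millis() < safety_margin_ms:
            return index, items[index:]
        process_item(item)
    return len(items), []
```

The remaining items could then be sent to a queue or passed to the next step of a Step Functions workflow.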

Cost

To understand how EC2 and Lambda compare on cost, let's run through a few examples:
1. Let's assume an application receives 5,000 hits per day, with each execution taking 100 ms with 512 MB of memory. The Lambda function will then cost about $0.16 per month.

For the same requirement, I believe a t2.nano EC2 instance would be sufficient, and it costs about $4.25 per month.

So the Lambda cost ($0.16) is just ~4% of the EC2 price ($4.25).
2. Let's take a second scenario where an application has a lot of hits, say 5 million per month, and each execution takes 200 ms with 1 GB of memory. With Lambda, this will cost $17.67 per month.

With EC2, I believe a t3.micro should be able to handle this load, and it will cost only $7.62 per month.
So, in this case, EC2 is cheaper than Lambda because of the higher memory requirement, request count, and execution time.
3. Now, take an example where multiple EC2 instances are required to handle the requests. In that case, EC2 would be costlier for two reasons. First, we need an ALB (Application Load Balancer) to balance the load between those instances, which adds to the cost. Second, each EC2 instance uses up some of its allocated memory for its own overhead, and traffic is not always evenly distributed, so we would need more EC2 instances than anticipated. Lambda handles load balancing internally, so scaling adds no extra cost.
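
The Lambda figures in scenarios 1 and 2 can be reproduced with a few lines of arithmetic. The sketch below assumes list prices of $0.20 per million requests and $0.0000166667 per GB-second, a 30-day month, and no free tier. These prices are assumptions that vary by region and over time, so check current pricing before relying on them.

```python
# Assumed list prices (free tier ignored); check current pricing.
PRICE_PER_MILLION_REQUESTS = 0.20
PRICE_PER_GB_SECOND = 0.0000166667


def lambda_monthly_cost(requests_per_month, duration_ms, memory_mb):
    """Monthly Lambda cost in USD: request charge plus compute (GB-seconds)."""
    gb_seconds = requests_per_month * (duration_ms / 1000) * (memory_mb / 1024)
    request_cost = requests_per_month / 1_000_000 * PRICE_PER_MILLION_REQUESTS
    return request_cost + gb_seconds * PRICE_PER_GB_SECOND


# Scenario 1: 5,000 hits per day over a 30-day month, 100 ms, 512 MB.
scenario_1 = lambda_monthly_cost(5_000 * 30, 100, 512)
# Scenario 2: 5 million hits per month, 200 ms, 1 GB.
scenario_2 = lambda_monthly_cost(5_000_000, 200, 1024)

print(f"Scenario 1: ${scenario_1:.2f}")  # prints Scenario 1: $0.16
print(f"Scenario 2: ${scenario_2:.2f}")  # prints Scenario 2: $17.67
```

Because the compute charge grows with requests, duration, and memory together, the break-even point against a fixed-price instance moves quickly as any of the three increases.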

Security

When we talk about security for Lambda, most of the onus is on AWS, which handles OS patching, upgrades, and other infrastructure-level security concerns. Malware typically sits idle on a server for a long time before it starts to spread slowly. That is much harder on Lambda because it is stateless by nature.
On the other hand, with EC2, we have full control over system-level security. We need to configure the security groups, network ACLs, and VPC subnet route tables to control traffic into and out of an instance. However, ensuring the system is fully secure is tiring work. Security groups can grow to meet business needs, but they can become overlapping and confusing.

Monitoring

Despite EC2's resiliency and elasticity, there are many ongoing objectives that require close tracking of capacity, predictability, and interdependence with other services. Let's talk about some of the important metrics that need to be monitored for EC2.
1. Availability - To avoid an outage in production, we need to know if each of the EC2 instances running for the applications is healthy or not. EC2 has "Instance State" which can be used to track it.
2. Status Checks - AWS performs status checks on all running EC2 instances. System status checks monitor conditions like loss of network connectivity and hardware issues, which require AWS's involvement to fix. Instance status checks monitor conditions like exhausted memory and a corrupt file system, which require our involvement to fix. The best practice is to set a status check alarm that notifies us when a status check fails.
3. Resources Capacity - CPU and memory utilization are directly related to application responsiveness. If they are exhausted, we will not even have sufficient memory to SSH into the instance. The only option would be to reboot the instance, which can cause downtime and state loss. A good monitoring system stores metrics from the instance and can show us resource usage climbing before it hits a ceiling and the instance becomes unavailable.
4. System Errors - System errors can be found in the system log file like /var/log/syslog. We can aggregate these logs to Amazon CloudWatch Logs by installing their agent, or we can use syslog to forward the logs to some other central location like Splunk or ELK.
5. Human Errors - EC2 needs a lot of manual configuration and sometimes it may go wrong. So we need tracking of such activities, which can be done through CloudTrail audit logs.
6. Performance Metrics - Through CloudWatch metrics, we can monitor CPU usage and disk usage. However, CloudWatch doesn't provide metrics for application performance monitoring, and that's where we need APM tools.
7. Cost Monitoring - EC2 instance count, EBS volume usage, and network usage are very important to monitor, as auto-scaling can pose a serious risk to the overall AWS bill. CloudWatch gives some information about the network usage of an instance but doesn't give an overall picture of how many instances are in use. We would also need to know storage and network usage at the account level, and that is missing from CloudWatch.
So, most of the metrics can be tracked using CloudWatch, CloudTrail, and X-Ray, but there are still a few gaps to be filled.
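
The status check alarm recommended in item 2 can be sketched as follows. It assumes a client created with `boto3.client("cloudwatch")` is passed in and uses its `put_metric_alarm` call with the `StatusCheckFailed` metric from the `AWS/EC2` namespace. The alarm name, the evaluation settings, and the SNS topic used for notification are illustrative assumptions.

```python
def status_check_alarm_params(instance_id, sns_topic_arn):
    """Build keyword arguments for CloudWatch's put_metric_alarm call."""
    return {
        "AlarmName": f"status-check-failed-{instance_id}",  # hypothetical name
        "Namespace": "AWS/EC2",
        "MetricName": "StatusCheckFailed",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Maximum",
        "Period": 60,               # seconds per data point (assumed)
        "EvaluationPeriods": 2,     # consecutive failing periods (assumed)
        "Threshold": 1,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        "AlarmActions": [sns_topic_arn],
    }


def create_status_check_alarm(cloudwatch_client, instance_id, sns_topic_arn):
    # cloudwatch_client is assumed to be boto3.client("cloudwatch").
    cloudwatch_client.put_metric_alarm(
        **status_check_alarm_params(instance_id, sns_topic_arn)
    )
```

Keeping the parameters in a separate function makes it easy to review or test the alarm definition without calling AWS.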
Now, let's talk about Lambda monitoring. CloudWatch provides all the basic telemetry about the health of the Lambda function out of the box:
- Invocation Count
- Execution Duration
- Error Count
- Throttled Count

In addition to these built-in metrics, we can also record custom metrics and publish them to CloudWatch Metrics. 
However, these metrics have a few limitations. They don't cover concurrent executions, which are central to how Lambda scales. For cold start counts and metrics related to downstream integrations, we have to rely on X-Ray. There are also no built-in metrics for memory usage, but these can be published as custom metrics.
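
As a sketch of the custom metric approach for memory usage, the code below reads the process's peak memory and publishes it through CloudWatch's `put_metric_data` call. It assumes a client created with `boto3.client("cloudwatch")` is passed in and that the code runs on Linux, where `ru_maxrss` is reported in kilobytes. The metric name and namespace are hypothetical.

```python
import resource


def peak_memory_mb():
    """Peak resident memory of this process in MB (ru_maxrss is KB on Linux)."""
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss / 1024


def memory_metric(function_name, value_mb):
    """Build one MetricData entry for CloudWatch's put_metric_data call."""
    return {
        "MetricName": "PeakMemoryUsed",  # hypothetical metric name
        "Dimensions": [{"Name": "FunctionName", "Value": function_name}],
        "Unit": "Megabytes",
        "Value": value_mb,
    }


def publish_memory_metric(cloudwatch_client, function_name):
    # cloudwatch_client is assumed to be boto3.client("cloudwatch").
    cloudwatch_client.put_metric_data(
        Namespace="Custom/Lambda",  # hypothetical namespace
        MetricData=[memory_metric(function_name, peak_memory_mb())],
    )
```

Calling `publish_memory_metric` at the end of a handler adds an API call to every invocation, so sampling only some invocations may be a reasonable trade-off.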
Now, we have a much better understanding of the major differences between EC2 and Lambda in various aspects. So, let's talk about which one to use for a given use case.

Use Cases 

There are certain use cases where there is no competition between the two. For example, if we need to set up a database like Couchbase or MongoDB, we have to go with EC2; we can't do that in Lambda. Another example is a hosting environment for backup and disaster recovery, where EC2 would again be the only choice.
However, there are certain use cases where developers might be a little uncertain about which one to use. 
1. High-Performance Computing Applications - Although Lambda is advertised as capable of real-time stream processing, it cannot cope when that processing is compute-intensive. Remember, Lambda has only 3 GB of memory, which can lead to long execution times and, in turn, to timeouts or a higher bill. EC2, on the other hand, has no such restriction and is an ideal fit for these kinds of requirements.
2. Event-Based Applications - Lambda has been primarily designed for handling event-based functions and it does that best. So if a new record is added to DynamoDB or a new file added to an S3 bucket needs processing, Lambda is the best fit. It's very easy to set up and saves cost as well.
3. Security Sensitive Applications - AWS claims that it takes care of Lambda security very well. But remember one thing: Lambda functions run in a VPC that may be shared with other customers in a multi-tenant setup. So, if an application handles highly sensitive data and security is the primary concern, the Lambda functions should run in a dedicated VPC, or the application should use EC2. And don't forget that running Lambda in a VPC has its own challenges, such as increased cold start time and limited concurrent executions.
4. Less Accessed Applications or Scheduled Jobs - If the application is used very rarely, or should be invoked based on schedule, Lambda is the right fit for it. It will save money as there is no need to run the server all the time.
5. DevOps and Local Testing - DevOps tooling for EC2 has been developed over many years and has reached a good level of maturity, but Lambda is still going through that journey. AWS SAM and Serverless Framework are addressing those concerns. Local testing is another aspect to consider when using Lambda, as there are a few limitations on what can be done locally.
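
Even without extra tooling, a handler's logic can be exercised locally by calling it with a handwritten event. The sketch below does this for a DynamoDB stream trigger, assuming the stream's typed attribute format (for example {"S": "value"}). The `id` attribute and the sample values are hypothetical.

```python
def handler(event, context):
    """Return the IDs of items newly inserted into the table."""
    new_ids = []
    for record in event.get("Records", []):
        if record.get("eventName") == "INSERT":
            # Stream images use DynamoDB's typed format, e.g. {"S": "value"}.
            new_ids.append(record["dynamodb"]["NewImage"]["id"]["S"])
    return new_ids


# Handwritten event that mimics a DynamoDB stream batch; "id" is a hypothetical key.
sample_event = {
    "Records": [
        {"eventName": "INSERT", "dynamodb": {"NewImage": {"id": {"S": "order-1"}}}},
        {"eventName": "REMOVE", "dynamodb": {"Keys": {"id": {"S": "order-0"}}}},
    ]
}

# The context object is unused here, so None is enough for a local run.
print(handler(sample_event, None))  # prints ['order-1']
```

This covers the function's own logic only. Permissions, triggers, and timeouts still need to be verified in a deployed environment.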

Summary

In this article, we have seen that if a team doesn't want to deal with infrastructure, Lambda is the right choice. However, it comes with limitations on the use cases it can run. Also, constant monitoring is a must to keep a good balance between the ease it provides and the cost.
EC2 has always been a standard choice for hosting any application and gives us full flexibility to configure our infrastructure. However, it is not best suited to all needs, and that's where Lambda comes into the picture.
Keep using both services based on the considerations I have shared and do let me know your feedback.
