Pulumi on AWS Lambda - Running containers at scale

A while back I was looking into running containers on-demand on AWS. These containers were part of an application that together served as a testing playground for users.

Application instance consists of multiple containers running in tandem

The idea was to enable users to provision an on-demand instance of this application so that they could connect to it and work in isolation. Once done with their work, the users could simply deprovision this instance.

Let's go through this multi-tenancy scenario in more detail and see how we can build infrastructure on AWS with Pulumi to support this use case.

Requirements #

The following requirements should be met:

- Host Docker images for all underlying services of the application
- User-facing HTTP API for provisioning / deprovisioning on-demand instance(s) of the application
  - Parallelize deployments to support possible bursts in traffic where many users provision many instances of the application
- Use Pulumi for provisioning all infrastructure
- Optimize for cost and minimize operational overhead
  - Ensure minimal costs for periods where no applications are provisioned

Containers on AWS #

Let's see which AWS services fit our use case best.

Container registry #

To run multiple containers, we first need somewhere to host the Docker images. Elastic Container Registry (ECR) is the obvious choice for hosting Docker images on AWS. While it's possible to set up your own private Docker registry, doing so carries considerable operational overhead and cost, and our requirements call for keeping both to a minimum.

Container orchestration #

We have the following options for running containers on AWS.

EKS #

Elastic Kubernetes Service (EKS) is the managed Kubernetes service offered by AWS. While EKS reduces the complexity of setting up a Kubernetes cluster on AWS, using Kubernetes is still far too complex for our simple use case.

Furthermore, we expect periods when no users are using the system. An EKS cluster running 24/7 would keep incurring costs even while idle. For reference, a single EKS cluster without any EC2 nodes costs $0.10/hour (about $73/month, assuming a 730-hour month).
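As a quick check of the figure above, the monthly cost of an idle control plane can be worked out from the hourly rate. This is only a sketch: it assumes an average month of 730 hours (8,760 hours per year divided by 12), and the function name is made up for illustration.

```python
HOURS_PER_MONTH = 8760 / 12  # assumption: average month, i.e. 730 hours


def monthly_cost(hourly_rate_usd: float, hours: float = HOURS_PER_MONTH) -> float:
    """Cost in USD of a resource billed per hour that runs for `hours` hours."""
    return round(hourly_rate_usd * hours, 2)


# Hourly rate as quoted in the article for one EKS cluster without nodes.
eks_control_plane = monthly_cost(0.10)
print(f"Idle EKS cluster: ${eks_control_plane:.2f}/month")
```

The same helper returns 0 for a resource that is billed only while it runs and is never started, which is the property we are looking for.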

In short, the flexibility provided by EKS is not worth the added complexity, operational overhead, or cost.

ECS #

Elastic Container Service (ECS) is a fully managed container orchestration service. When used with AWS Fargate, there are no EC2 instances to provision or manage.

This option is not as flexible as EKS with EC2 nodes, but that is acceptable for our use case since we only want to run a few containers on AWS. As a bonus, there are no charges for using ECS itself: we pay only for the resources consumed by our containerized applications. During periods with no users on the system, our ECS/Fargate costs are therefore essentially $0.

Deploying Containers on-demand #

Deploying containers on ECS Fargate via Pulumi shouldn't be too difficult. But how can we drive this Pulumi deployment programmatically, on demand, via an HTTP API?

Luckily for us, the Pulumi Automation API exists. It is a programmatic interface for running Pulumi programs without invoking the Pulumi CLI by hand: the SDK drives the CLI for us, so the CLI still has to be installed alongside our code. Our HTTP API request handlers can directly embed and deploy Pulumi programs from within our application code. This fits our use case perfectly and should make the deployments simpler.
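To make this more concrete, here is a minimal sketch of what driving a deployment through the Automation API might look like in Python. It is not the code from the accompanying repository: the project name, the stack naming scheme, the default region, and the `program` argument (a plain function that declares the ECS resources) are all assumptions for illustration. The `pulumi` package is imported lazily because it, and the Pulumi CLI, are only needed when a deployment actually runs.

```python
import re

PROJECT_NAME = "playground"  # hypothetical project name


def stack_name_for(app_id: str) -> str:
    """Derive a Pulumi stack name from a user-supplied application id."""
    cleaned = re.sub(r"[^a-z0-9-]+", "-", app_id.lower()).strip("-")
    if not cleaned:
        raise ValueError("app_id must contain at least one letter or digit")
    return f"app-{cleaned}"


def provision(app_id: str, program, region: str = "us-east-1") -> dict:
    """Create (or select) one stack per application instance and deploy it."""
    # Lazy import: requires the `pulumi` package and the Pulumi CLI at call time.
    from pulumi import automation as auto

    stack = auto.create_or_select_stack(
        stack_name=stack_name_for(app_id),
        project_name=PROJECT_NAME,
        program=program,  # inline program: a function declaring the resources
    )
    stack.set_config("aws:region", auto.ConfigValue(value=region))  # assumed region
    result = stack.up(on_output=print)  # stream engine output to stdout
    return {name: output.value for name, output in result.outputs.items()}


def deprovision(app_id: str) -> None:
    """Destroy all resources of an instance, then remove its stack."""
    from pulumi import automation as auto

    name = stack_name_for(app_id)
    stack = auto.select_stack(
        stack_name=name, project_name=PROJECT_NAME, program=lambda: None
    )
    stack.destroy(on_output=print)
    stack.workspace.remove_stack(name)
```

One stack per application instance keeps each user's resources isolated and makes teardown a single `destroy` call.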

Pulumi also provides an example of a similar use case on their GitHub: Pulumi Over HTTP - Static Websites as a RESTful API.

HTTP API #

Let's quickly define what this user-facing deployment HTTP API should look like. It should support the following functionality:

- POST /apps : create a new stack via the Pulumi Automation API and provision an instance of the application (running on ECS Fargate as multiple services). It should return an endpoint for accessing the application.
- GET /apps : list all provisioned application instances (one Pulumi stack per instance).
- DELETE /apps/:id : destroy a previously created Pulumi stack and deprovision the application instance, including all underlying resources.
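The three routes above could be dispatched by a single handler. The sketch below assumes the event shape that Lambda function URLs deliver (method under `requestContext.http.method`, path under `rawPath`); the `create_app`, `list_apps`, and `delete_app` callables are hypothetical placeholders for the code that talks to Pulumi.

```python
import json
import re

APP_PATH = re.compile(r"^/apps/(?P<id>[A-Za-z0-9-]+)$")


def respond(status: int, payload: dict) -> dict:
    """Build a response in the shape Lambda function URLs expect."""
    return {
        "statusCode": status,
        "headers": {"content-type": "application/json"},
        "body": json.dumps(payload),
    }


def make_handler(create_app, list_apps, delete_app):
    """Return a Lambda handler wired to three (hypothetical) backend callables."""

    def handler(event, context):
        method = event["requestContext"]["http"]["method"]
        # Treat "/apps/" and "/apps" as the same route.
        path = event.get("rawPath", "/").rstrip("/") or "/"

        if path == "/apps" and method == "POST":
            return respond(201, create_app())  # e.g. {"id": ..., "endpoint": ...}
        if path == "/apps" and method == "GET":
            return respond(200, {"apps": list_apps()})
        match = APP_PATH.match(path)
        if match and method == "DELETE":
            delete_app(match.group("id"))
            return respond(200, {"deleted": match.group("id")})
        return respond(404, {"error": f"no route for {method} {path}"})

    return handler
```

Passing the backend functions in as arguments keeps the routing testable without AWS credentials or a Pulumi installation.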

What AWS services would be suited for serving this HTTP API? Keep in mind our requirement of parallelizing deployments and keeping costs minimal when no deployments are active.

ECS Fargate #

We could serve our HTTP API as an ECS Fargate service. All we would need to do is package our HTTP API and the Pulumi CLI inside a Docker image, push it to ECR, create a task definition, and finally deploy it as a Fargate service.

What would happen during a sudden surge of deployment requests?

Each user request would result in a new Pulumi stack being deployed. Depending on the complexity of the application deployment, it could take anywhere from a few seconds to several minutes before a response is sent back to the user. With enough deployments in flight at once, the service could eventually run out of resources to handle any more requests.

Our current setup would not be able to scale automatically to handle this scenario. We could additionally set up an Application Load Balancer and Auto Scaling for this service so that it can scale horizontally.

Another approach could be to split the HTTP API request handling from the Pulumi deployment and deploy them as separate services that communicate via Amazon SQS.
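The shape of this queue-based split can be illustrated in-process. In the sketch below, Python's standard `queue.Queue` stands in for Amazon SQS and a thread stands in for the separate deployment service; the `deploy` callable and the message format are assumptions, and a real setup would use the SQS API instead.

```python
import queue
import threading

deploy_queue = queue.Queue()  # stand-in for an SQS queue
completed = []  # stand-in for wherever deployment results are stored


def accept_request(app_id: str) -> dict:
    """API side: enqueue the job and answer immediately with 202 Accepted."""
    deploy_queue.put({"action": "provision", "app_id": app_id})
    return {"statusCode": 202, "body": f"provisioning {app_id}"}


def worker(deploy) -> None:
    """Worker side: pull jobs one at a time and run the slow deployment."""
    while True:
        job = deploy_queue.get()
        if job is None:  # sentinel value: shut the worker down
            deploy_queue.task_done()
            break
        completed.append(deploy(job["app_id"]))
        deploy_queue.task_done()
```

The API answers right away while deployments proceed at the worker's own pace, at the price of an extra component and an asynchronous contract with the user.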

Both of these approaches will offer much better scalability than our initial approach. Nevertheless, they are more complex to set up and would incur additional costs.

Is there another approach that is inherently scalable without any additional complexity? Can we have our cake and eat it too?

Lambda #

AWS Lambda offers near-instantaneous scalability without auto-scaling groups or load balancers. Lambda functions scale up and down automatically with traffic and can support tens of thousands of concurrent executions (the default quota is 1,000 per Region, and it can be raised on request). This rapid scaling should let virtually all our deployments run concurrently. Additionally, for periods with no user traffic, the cost of running a Lambda function is essentially $0.

Is it possible to run Pulumi programs on an AWS Lambda function?

Yes, this is possible. We can package the API request handlers, the Pulumi program responsible for container deployments, and the Pulumi CLI as a Lambda Container Image.

Furthermore, earlier this year, AWS announced Lambda function URLs. This means we don't necessarily have to use any other AWS service (e.g. API Gateway or an Application Load Balancer) to route requests to our Lambda function. Consequently, using this feature should greatly simplify our infrastructure setup.

To back up these claims, I have open-sourced a minimal Pulumi program that showcases the use of the Pulumi Automation API from within an AWS Lambda function.

Nevertheless, there are some limitations associated with using Lambda functions for our container deployment use case:

- Function timeout: each function invocation must complete within 15 minutes. If our deployments take longer than this, Lambda would not be suitable.
- Invocation payload: each request and response must not exceed 6 MB for synchronous invocations.
- Ephemeral storage: Lambda functions were previously restricted to 512 MB of ephemeral storage. Fortunately, this limit was recently raised to a configurable maximum of 10 GB. Ephemeral storage is essential for programs that use the Pulumi Automation API and install plugins at runtime.
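Two of these limits can be checked defensively inside the function itself. The sketch below is an assumption about how one might do that, not part of the original program: it uses the `get_remaining_time_in_millis()` method of the Lambda context object and measures the JSON-encoded response size; the one-minute safety margin and the helper names are made up.

```python
import json

MAX_SYNC_PAYLOAD_BYTES = 6 * 1024 * 1024  # 6 MB synchronous invocation limit
SAFETY_MARGIN_MS = 60_000  # assumption: keep one minute in reserve


def has_time_for(context, estimated_ms: int, margin_ms: int = SAFETY_MARGIN_MS) -> bool:
    """True if enough invocation time is left to run a step of `estimated_ms`."""
    return context.get_remaining_time_in_millis() - margin_ms >= estimated_ms


def fits_payload_limit(payload: dict) -> bool:
    """True if the JSON-encoded payload stays within the 6 MB limit."""
    return len(json.dumps(payload).encode("utf-8")) <= MAX_SYNC_PAYLOAD_BYTES
```

A handler could call `has_time_for(context, ...)` before starting a deployment and return an error early rather than being cut off mid-update by the timeout.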

Alhamdulillah, none of these limitations is a deal breaker for our use case. Although the Lambda function timeout looks daunting at first, in practice Pulumi deployments of ECS Fargate services finish well before the 15-minute timeout.

Conclusion #

The goal was to enable users to provision on-demand instances of an application that consisted of multiple containers running in tandem. We looked into various AWS services, and settled on the following:

- ECR: container registry for hosting Docker images
- ECS Fargate: container orchestration for running instances of the application across Fargate services
- AWS Lambda with function URLs: HTTP API for handling user requests for provisioning and deprovisioning application instances (ECS Fargate services)

Final architecture depicting usage of AWS services

Using the Pulumi Automation API from within an AWS Lambda function gives us a powerful yet flexible way to handle container deployments at scale.

To conclude, we have proposed an architecture that is highly scalable and available, and that minimizes cost and operational overhead while fulfilling our requirements.

Source code #

The source code for a minimal Pulumi program that showcases the use of the Pulumi Automation API from within an AWS Lambda function can be found on GitHub.


Published 29 Nov 2022