Simple time series analysis and forecasting is a great candidate for a simple on-demand compute engine, in other words a serverless function. The idea is to run a statistical model on a time series and return a forecast. The serverless function does not need to know or care how the result will be stored or used.
My original plan was to build this service entirely with AWS Lambda. I wrote the function, installed the dependencies locally, and ran it with sample inputs just to see if I could.
When it turned out I could, I decided it was time to try deploying the function to Lambda. I chose the Serverless Framework to simplify the deployment process.
Setting up the Serverless Framework was simple using the provided template. For Python projects that use third-party dependencies, the serverless-python-requirements plugin is needed.
I got it all set up as far as I could tell, and then I hit a wall. AWS Lambda limits a deployment package to 50 MB compressed for upload and 250 MB uncompressed. The serverless-python-requirements plugin provides a way to compress the dependencies before compressing the rest of the function, effectively compressing the dependencies twice. This resulted in an uncompressed function within the 250 MB limit. However, I could not get the dependencies working correctly after that.
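To make the two limits concrete, here is a minimal sketch of how a deployment zip could be checked before uploading. It is not part of the original project: the function names are hypothetical, and treating 1 MB as 1024 * 1024 bytes is an assumption.

```python
import io
import zipfile

# Limits as quoted in the text above; 1 MB = 1024 * 1024 bytes is an assumption.
MAX_ZIPPED_BYTES = 50 * 1024 * 1024
MAX_UNZIPPED_BYTES = 250 * 1024 * 1024


def package_sizes(zip_bytes):
    """Return (zipped size, total unzipped size) in bytes for a zip archive."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        # file_size is the uncompressed size of each archive member.
        unzipped = sum(info.file_size for info in archive.infolist())
    return len(zip_bytes), unzipped


def fits_lambda_limits(zip_bytes,
                       max_zipped=MAX_ZIPPED_BYTES,
                       max_unzipped=MAX_UNZIPPED_BYTES):
    """Hypothetical helper: True if the archive is within both size limits."""
    zipped, unzipped = package_sizes(zip_bytes)
    return zipped <= max_zipped and unzipped <= max_unzipped
```

Note that a zip nested inside the package is counted at its own (already compressed) size, which is why compressing the dependencies first lowers the uncompressed total of the outer package.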
The next thing I tried was AWS SAM. AWS SAM worked well, and the uncompressed Lambda was smaller than 250 MB. I was not convinced that the Serverless Framework could not work, so I switched back to it, but to no avail. Eventually I decided it would be a nice experience to build the project using AWS Fargate and switched over to it.
The overall structure I came up with for this project has a Lambda behind API Gateway as the entry point. The Lambda takes the input data, stores it on S3, then starts the Fargate task. The Fargate task takes the file from S3, runs the forecasting model on it, and pushes the output to S3.
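The entry-point Lambda described above could look roughly like the sketch below. This is an illustration under assumptions, not the code from the project: the config keys, the environment variable names INPUT_BUCKET and INPUT_KEY, the "input/" key prefix, and the 202 response are all hypothetical, and the event is assumed to carry the input data as a string in "body". The S3 and ECS clients are passed in so the logic can be exercised without AWS.

```python
import json
import uuid


def handle_request(event, s3, ecs, config):
    """Store the request body on S3, then start a Fargate task to process it.

    `s3` and `ecs` are expected to behave like boto3 clients; `config` is a
    hypothetical dict holding bucket, cluster, and task settings.
    """
    # Assumption: the input data arrives as a string in event["body"].
    key = "input/{}.json".format(uuid.uuid4())
    s3.put_object(
        Bucket=config["bucket"],
        Key=key,
        Body=event["body"].encode("utf-8"),
    )

    # Tell the task where its input lives through environment overrides.
    ecs.run_task(
        cluster=config["cluster"],
        taskDefinition=config["task_definition"],
        launchType="FARGATE",
        networkConfiguration={
            "awsvpcConfiguration": {"subnets": config["subnets"]}
        },
        overrides={
            "containerOverrides": [
                {
                    "name": config["container_name"],
                    "environment": [
                        {"name": "INPUT_BUCKET", "value": config["bucket"]},
                        {"name": "INPUT_KEY", "value": key},
                    ],
                }
            ]
        },
    )

    # 202: the request was accepted, the forecast is produced asynchronously.
    return {"statusCode": 202, "body": json.dumps({"input_key": key})}
```

In a deployed Lambda the clients would typically be created with boto3.client("s3") and boto3.client("ecs"). The Lambda returns as soon as the task is started, so it never waits on the forecasting model itself.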
Well, there is the story of how I got here. The next four posts will go through how I set up the project.