Deep Learning Budget Forecasting
TensorFlow DNN predicting infrastructure budgets at 94% R² across 5000+ projects - deployed serverless on AWS Lambda
94% R² Score
Problem
Infrastructure project managers were estimating budgets manually, leading to frequent overruns. A rules-based estimator couldn't capture the non-linear interactions between project scale, region, material costs, and timeline. On top of that, 35% of historical records were incomplete.
Solution
- Processed 30K+ historical records from S3 CSVs: Pandas merge on ProjectId, outlier detection, and missing-value imputation for the 35% of records that were incomplete.
- Designed a TensorFlow/Keras DNN with multi-hot encoding and custom preprocessing layers for cost normalization.
- Achieved 94% R² on the holdout set with a 90/10 train-test split.
- Deployed as a containerized AWS Lambda API (ECR) with sub-500ms latency and JWT authorizer (JWE decryption).
- Stored models in S3 with SSM Parameter Store versioning, and automated weekly retraining via an Airflow DAG on K8s with rollback safeguards.
- 60% infrastructure cost reduction vs server-based deployment.
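The data-preparation step in the first bullet can be sketched roughly as follows. This is a minimal illustration, not the project's actual pipeline: only the `ProjectId` join key comes from the description above, while the column names (`scale_km`, `budget`), the 1.5 × IQR outlier rule, and median imputation are assumptions chosen for the example.

```python
import pandas as pd


def prepare_records(projects, costs, numeric_cols, target="budget"):
    """Merge two extracts on ProjectId, flag target outliers, impute feature gaps."""
    # Inner merge keeps only projects that appear in both extracts.
    merged = projects.merge(costs, on="ProjectId", how="inner")

    # Flag outliers on the target with the 1.5 * IQR rule (assumed rule).
    q1 = merged[target].quantile(0.25)
    q3 = merged[target].quantile(0.75)
    iqr = q3 - q1
    merged["is_outlier"] = ~merged[target].between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)

    # Fill missing feature values with each column's median (assumed strategy).
    medians = merged[numeric_cols].median()
    merged[numeric_cols] = merged[numeric_cols].fillna(medians)
    return merged


# Hypothetical sample data standing in for the S3 CSV extracts.
projects = pd.DataFrame({
    "ProjectId": [1, 2, 3, 4, 5, 6],
    "scale_km": [10.0, None, 12.0, 11.0, 9.0, 10.5],
})
costs = pd.DataFrame({
    "ProjectId": [1, 2, 3, 4, 5, 7],
    "budget": [100.0, 110.0, 95.0, 105.0, 1000.0, 50.0],
})

clean = prepare_records(projects, costs, numeric_cols=["scale_km"])
print(clean)
```

Flagging outliers in a separate column, rather than dropping them inside the function, leaves the decision about what to do with them to the training code.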
System Flow
- Data: S3 CSVs, Pandas Merge
- Preprocessing: Feature Engineering, Normalization
- Model: TF/Keras DNN, Train-Test Split
- Deployment: Lambda + ECR, JWT Authorizer
- Retraining: Airflow DAG, K8s Pod
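To make the preprocessing stage of the flow concrete, here is a small NumPy stand-in for multi-hot encoding and cost normalization. It is not the project's custom Keras preprocessing layers, which are not shown on this page. The vocabulary, the material tags, and the helper names `multi_hot` and `standardize` are all hypothetical.

```python
import numpy as np


def multi_hot(tags_per_row, vocabulary):
    """Encode each row's set of tags as a 0/1 vector over the vocabulary."""
    index = {tag: i for i, tag in enumerate(vocabulary)}
    encoded = np.zeros((len(tags_per_row), len(vocabulary)), dtype=float)
    for row, tags in enumerate(tags_per_row):
        for tag in tags:
            if tag in index:  # tags outside the vocabulary are ignored
                encoded[row, index[tag]] = 1.0
    return encoded


def standardize(values, mean, std):
    """Scale values to zero mean and unit variance using the given statistics."""
    return (np.asarray(values, dtype=float) - mean) / std


# Hypothetical example: each project can use several materials at once.
vocab = ["concrete", "steel", "asphalt"]
rows = [["steel", "concrete"], ["asphalt"], []]
materials = multi_hot(rows, vocab)

# In practice the mean and std would come from the training split only.
costs = np.array([100.0, 200.0, 300.0])
scaled = standardize(costs, costs.mean(), costs.std())

# One feature matrix: multi-hot columns followed by the scaled cost column.
features = np.hstack([materials, scaled.reshape(-1, 1)])
print(features)
```

Multi-hot encoding differs from one-hot encoding in that a row may have several active positions, which suits fields where a project carries more than one category value.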
Architecture
- 01 TensorFlow/Keras DNN with multi-hot encoding and custom preprocessing layers
- 02 30K+ records from S3 CSVs - Pandas merge, outlier detection + imputation on 35% incomplete data
- 03 90/10 train-test split
- 04 Containerized AWS Lambda API (ECR) - sub-500ms latency, JWT authorizer
- 05 S3 model storage with SSM Parameter Store versioning + automated retraining
- 06 Weekly automated retraining via Airflow DAG on K8s with rollback safeguards
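One way the rollback safeguard on the weekly retraining job might work is a promotion gate: the retrained model replaces the serving version only if its holdout score clears a floor and does not regress noticeably. The sketch below is an assumption about the shape of such a check, not the project's implementation. The `ModelVersion` type, the `choose_version` function, and both thresholds are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelVersion:
    version: str
    holdout_r2: float


def choose_version(current, candidate, min_r2=0.90, max_drop=0.01):
    """Return the version to serve after a retraining run."""
    # Reject candidates below an absolute quality floor (hypothetical value).
    if candidate.holdout_r2 < min_r2:
        return current
    # Reject candidates that regress by more than max_drop versus the current model.
    if candidate.holdout_r2 < current.holdout_r2 - max_drop:
        return current
    return candidate


current = ModelVersion("v12", 0.94)
promoted = choose_version(current, ModelVersion("v13", 0.945))
regressed = choose_version(current, ModelVersion("v13", 0.91))
below_floor = choose_version(current, ModelVersion("v13", 0.85))
print(promoted.version, regressed.version, below_floor.version)
```

In a setup like the one described, the chosen version string would presumably be what gets written to SSM Parameter Store, so a rejected candidate leaves the serving pointer unchanged.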
Impact
- 94% R² on holdout set across 5000+ projects
- 40% reduction in infrastructure estimation errors
- Sub-500ms inference latency on Lambda - 1000+ monthly predictions
- 60% infrastructure cost reduction vs server-based deployment
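For readers unfamiliar with the headline metric, R² is one minus the ratio of the residual sum of squares to the total sum of squares on the holdout set. The following is a generic sketch of a 90/10 split and the R² calculation in plain Python. The function names, the seed, and the sample numbers are illustrative and not taken from the project.

```python
import random


def train_holdout_split(rows, holdout_fraction=0.10, seed=42):
    """Shuffle rows and return (train, holdout) lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_holdout = max(1, round(len(shuffled) * holdout_fraction))
    return shuffled[n_holdout:], shuffled[:n_holdout]


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    # Undefined when all true values are identical (ss_tot == 0).
    return 1.0 - ss_res / ss_tot


# 5000 placeholder project ids split 90/10.
train_ids, holdout_ids = train_holdout_split(range(5000))

# Made-up budgets and predictions to show the calculation.
actual = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 310.0, 390.0]
score = r2_score(actual, predicted)
print(len(train_ids), len(holdout_ids), round(score, 3))
```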
Tech Stack
Python, TensorFlow, Keras, scikit-learn, AWS Lambda, S3, SSM, ECR, EKS, Docker, JWT, PostgreSQL