Available for senior roles

Stewyn Chaudhary

Senior Data Engineer & Cloud Architect

8+ years building enterprise-scale data pipelines, AI-powered platforms, and BI automation systems. Specializing in AWS cloud infrastructure, real-time architectures, and generative AI integration at Amazon.

15K+
Daily Users Served
500+
Stores Supported
70%
Ticket Reduction
35%
Cloud Cost Savings

About Me

I'm a Data Engineer with 8+ years of experience building scalable data pipelines, BI platforms, and AI-integrated solutions across the pharmaceutical and grocery retail sectors, most recently at Amazon.

My expertise lies in transforming manual, error-prone processes into enterprise-scale automated systems. At Amazon's World Wide Grocery division, the platforms I've built serve over 15,000 daily users across 500+ stores — from real-time inventory pipelines to generative AI store assistants.

I hold an M.S. in Information Systems from UT Arlington and am an AWS Certified Solutions Architect. I'm passionate about FinOps, operational excellence, and bridging the gap between engineering complexity and business impact.

AWS, Python, PySpark, Redshift, Generative AI, FinOps, Data Pipelines, QuickSight
🏢
Data Engineer II @ Amazon
World Wide Grocery Store — Apr 2025 to Present
📍
Based in Texas, USA
Open to remote & hybrid opportunities
🎓
M.S. Information Systems
University of Texas at Arlington • GPA 3.57
☁️
AWS Certified Solutions Architect
Associate Level
🔬
Published Research
Cloud security & encryption methodologies

Technical Skills

☁️ Cloud & Infrastructure
AWS, Redshift, Lambda, S3, DynamoDB, DAX, Bedrock, CDK, CloudWatch, SES, SQS, EMR, Glue, Kinesis, Athena, EC2, VPC, IAM, SNS, Step Functions, EventBridge, API Gateway, SageMaker, Secrets Manager, CloudFormation, Docker
🔥 Big Data
Apache Spark, PySpark, Hadoop, MapReduce, Hive, Kafka, Sqoop, AWS EMR, HDFS
💻 Programming
Python, SQL, Scala, R, PySpark
📊 BI & Visualization
Amazon QuickSight, Tableau, Streamlit, QuickSight APIs
🤖 AI / ML
AWS Bedrock, Generative AI, LLM Integration, Prompt Engineering, Vector Caching, RAG Pipelines, LangChain, AWS SageMaker, OpenAI APIs, Embeddings, Vector Databases, AI Agents, Fine-tuning, DynamoDB AI Backends
⚙️ DevOps & Practices
Git, Jenkins, AWS CDK, CI/CD, Agile/Scrum, FinOps, IaC

Experience

🚀
Data Engineer II
Apr 2025 – Present
Amazon — World Wide Grocery Store, TX
  • Architect enterprise-scale data pipelines, AI-powered operational tools, and BI automation platforms supporting 500+ stores and 15,000+ daily users across the grocery division.
  • Lead development of generative AI store assistant (AWS Bedrock + DynamoDB + DAX) enabling store associates to query operational data via handheld devices.
  • Built self-service BI management platform (Streamlit + QuickSight APIs + Lambda) reducing engineering support tickets by 70%.
  • Reduced QuickSight platform monthly spend by 35% through usage pattern analysis and strategic resource consolidation.
  • Architected cross-account data integration solutions using AWS CDK, Docker, Lambda, and Redshift to unify siloed reporting systems.
Stack: AWS Bedrock, DynamoDB, DAX, Lambda, S3, Redshift, CDK, QuickSight, CloudWatch, SES • Python, Docker, Streamlit
🏗️
Data Engineer I
Nov 2022 – Apr 2025
Amazon — World Wide Grocery Store, TX
  • Built data infrastructure from the ground up; owned the food safety data vertical end-to-end, integrating third-party auditor and supplier data for real-time compliance visibility across 500+ stores.
  • Developed near-real-time inventory and order tracking pipelines with 30-minute refresh cycles (Lambda + S3 + SP-API + Redshift), critical for peak seasons like Thanksgiving.
  • Launched employee engagement gamification platform using DynamoDB, Redshift Serverless, and Zero-ETL integration.
  • Built automated BI reporting platform (QuickSight + Lambda) serving 15,000+ daily users across 500+ stores.
  • Implemented enterprise pipeline alerting framework (Lambda + SES + CloudWatch + SQS DLQs) shifting team from reactive firefighting to proactive issue resolution.
  • Built device health monitoring dashboards (PySpark + EMR + Redshift) for 10,000+ handheld devices fleet-wide.
Stack: Lambda, S3, Redshift, DynamoDB, SES, CloudWatch, SQS, EMR, PySpark, QuickSight, Zero-ETL • Python, SQL
☁️
Cloud Support Engineer
Mar 2021 – Nov 2022
Amazon Web Services (AWS)
  • Specialized in AWS Big Data services (EMR, Glue, Kinesis, Redshift, Athena) providing expert-level technical support to enterprise customers globally.
  • Resolved complex multi-service troubleshooting cases involving Hadoop ecosystem (HDFS, Hive, Spark, Tez) and cross-service architectural issues.
  • Mentored new hires and delivered internal training on AWS Big Data services; supported the AWS re:Post community platform launch.
  • Tested and evaluated emerging AWS services pre-release, contributing feedback that shaped service GA readiness.
Stack: EMR, Glue, Kinesis, Redshift, Athena, S3, Lambda, EC2, VPC, IAM, CloudWatch, SageMaker • Hadoop ecosystem
🧬
Big Data Engineer
Oct 2018 – Mar 2021
ProKarma
  • Built end-to-end data pipelines for pharmaceutical data processing using the Hadoop ecosystem on accelerated agile timelines.
  • Developed real-time prescription benefit application handling high-volume transactions with Kafka-driven streaming pipelines.
  • Implemented batch and streaming ETL/ELT solutions (Spark, MapReduce, Hive, Sqoop) in collaboration with product owners and business analysts.
Stack: Spark, MapReduce, Hive, Kafka, Sqoop • Python, Oozie

Key Projects

🤖
Generative AI Store Assistant
An AI-powered assistant that lets associates across 500+ stores query complex operational data in natural language via handheld devices. Built on AWS Bedrock with vector caching for low-latency responses.
📈 Measurable productivity improvements across 500+ stores
AWS Bedrock, DynamoDB, DAX, Lambda, Python, LLM
📊
Self-Service BI Management Platform
A Streamlit-based platform that empowered non-technical business users to manage their own QuickSight dashboard data refreshes, eliminating the need to file engineering tickets for routine operations.
📉 70% reduction in engineering support tickets
Streamlit, QuickSight APIs, Lambda, Python
🏪
Real-Time Food Safety Pipeline
End-to-end food safety data vertical integrating third-party auditor and supplier data via SFTP and REST APIs. Powers real-time compliance visibility dashboards across 500+ grocery stores.
✅ Real-time compliance across 500+ stores
Lambda, S3, Redshift, REST APIs, SFTP, QuickSight
📱
Device Health Monitoring Dashboard
Analytics dashboards providing senior leadership visibility into performance and health metrics for 10,000+ handheld devices across the store fleet, built with PySpark and EMR for large-scale data processing.
🔍 10,000+ devices monitored in real time
PySpark, EMR, Redshift, QuickSight
🏆
Employee Engagement Gamification Platform
A gamification system designed to drive associate productivity improvements on the store floor. Leveraged Zero-ETL integration between DynamoDB and Redshift Serverless for seamless real-time analytics.
🎮 Measurable productivity uplift for store associates
DynamoDB, Redshift Serverless, Zero-ETL, Python
🔔
Enterprise Pipeline Alerting Framework
A comprehensive alerting and observability system that shifted the team from reactive firefighting to proactive issue resolution. Monitors the health of all data pipelines, with DLQ-based failure recovery.
⚡ Proactive outage prevention across all pipelines
Lambda, SES, CloudWatch, SQS DLQs, Python

Blog

AI / GenAI Medium · Towards AI
We Gave AI a Brain. Now We're Giving It a Job.
AI is no longer just a research curiosity — it's being put to work in real enterprise environments. A deep dive into what it actually means to operationalize AI, the challenges of building production-grade AI systems, and what comes next.
Data Eng Medium
The Hidden Cost of Pipeline Coupling: How One 2 AM Incident Changed Everything
A candid account of a late-night production incident that exposed the real dangers of tightly coupled data pipelines — and the architectural lessons that came out of it. What the 2 AM alert revealed about system design.
Python Medium · Analytics Vidhya
Web Scraping Wiki Tables Using BeautifulSoup and Python
A practical, hands-on guide to extracting structured table data from Wikipedia using Python and BeautifulSoup. Covers parsing HTML, handling edge cases, and turning raw web data into clean, usable datasets.

✍️ Read more on my Medium profile →

Education & Certifications

Education

M.S. Information Systems
University of Texas at Arlington
GPA: 3.57 / 4.0  ·  Class of 2018
Wayne Watts Graduate Fellowship recipient · Graduate Student Senate Treasurer · Leadership Honors Program
B.E. Computer Science
University of Mumbai
GPA: 3.3 / 4.0  ·  Class of 2015

Certifications & Achievements

🏅
AWS Certified Solutions Architect – Associate
Amazon Web Services
🎤
Technical Workshop Facilitator
Led workshops on AWS Big Data services for internal teams and customers
📄
Published Research Papers
Cloud security and encryption methodologies
🏆
Wayne Watts Graduate Fellowship
Awarded for academic excellence — University of Texas at Arlington

Get in Touch

I'm always open to discussing interesting data engineering challenges, AI/ML projects, or senior engineering roles. Feel free to reach out!