Stewyn Chaudhary — Data Engineer

// who I am

About Me

I'm a seasoned Data Engineer with 8+ years of experience building scalable data pipelines, BI platforms, and AI-integrated solutions, spanning pharmaceutical data processing and grocery retail at Amazon.

My expertise lies in transforming manual, error-prone processes into enterprise-scale automated systems. At Amazon's World Wide Grocery division, the platforms I've built serve over 15,000 daily users across 500+ stores — from real-time inventory pipelines to generative AI store assistants.

I hold an M.S. in Information Systems from UT Arlington and am an AWS Certified Solutions Architect. I'm passionate about FinOps, operational excellence, and bridging the gap between engineering complexity and business impact.

AWS · Python · PySpark · Redshift · Generative AI · FinOps · Data Pipelines · QuickSight

🏢
Data Engineer II @ Amazon
World Wide Grocery Store — Apr 2025 to Present

📍
Based in Texas, USA
Open to remote & hybrid opportunities

🎓
M.S. Information Systems
University of Texas at Arlington • GPA 3.57

☁️
AWS Certified Solutions Architect
Associate Level

🔬
Published Research
Cloud security & encryption methodologies

// what I work with

Technical Skills

☁️ Cloud & Infrastructure

AWS · Redshift · Lambda · S3 · DynamoDB · DAX · Bedrock · CDK · CloudWatch · SES · SQS · EMR · Glue · Kinesis · Athena · EC2 · VPC · IAM · SNS · Step Functions · EventBridge · API Gateway · SageMaker · Secrets Manager · CloudFormation · Docker

🔥 Big Data

Apache Spark · PySpark · Hadoop · MapReduce · Hive · Kafka · Sqoop · AWS EMR · HDFS

💻 Programming

Python · SQL · Scala · R · PySpark

📊 BI & Visualization

Amazon QuickSight · Tableau · Streamlit · QuickSight APIs

🤖 AI / ML

AWS Bedrock · Generative AI · LLM Integration · Prompt Engineering · Vector Caching · RAG Pipelines · LangChain · AWS SageMaker · OpenAI APIs · Embeddings · Vector Databases · AI Agents · Fine-tuning · DynamoDB AI Backends

⚙️ DevOps & Practices

Git · Jenkins · AWS CDK · CI/CD · Agile/Scrum · FinOps · IaC

// where I've worked

Experience

🚀

Data Engineer II

Apr 2025 – Present

Amazon — World Wide Grocery Store, TX

Architect enterprise-scale data pipelines, AI-powered operational tools, and BI automation platforms supporting 500+ stores and 15,000+ daily users across the grocery division.
Lead development of generative AI store assistant (AWS Bedrock + DynamoDB + DAX) enabling store associates to query operational data via handheld devices.
Built self-service BI management platform (Streamlit + QuickSight APIs + Lambda) reducing engineering support tickets by 70%.
Reduced QuickSight platform monthly spend by 35% through usage pattern analysis and strategic resource consolidation.
Architected cross-account data integration solutions using AWS CDK, Docker, Lambda, and Redshift to unify siloed reporting systems.

Stack: AWS Bedrock, DynamoDB, DAX, Lambda, S3, Redshift, CDK, QuickSight, CloudWatch, SES • Python, Docker, Streamlit

🏗️

Data Engineer I

Nov 2022 – Apr 2025

Amazon — World Wide Grocery Store, TX

Built data infrastructure from the ground up; owned the food safety data vertical end-to-end, integrating third-party auditor and supplier data for real-time compliance visibility across 500+ stores.
Developed near-real-time inventory and order tracking pipelines with 30-minute refresh cycles (Lambda + S3 + SP-API + Redshift), critical for peak seasons like Thanksgiving.
Launched employee engagement gamification platform using DynamoDB, Redshift Serverless, and Zero-ETL integration.
Built automated BI reporting platform (QuickSight + Lambda) serving 15,000+ daily users across 500+ stores.
Implemented enterprise pipeline alerting framework (Lambda + SES + CloudWatch + SQS DLQs) shifting team from reactive firefighting to proactive issue resolution.
Built device health monitoring dashboards (PySpark + EMR + Redshift) for 10,000+ handheld devices fleet-wide.

Stack: Lambda, S3, Redshift, DynamoDB, SES, CloudWatch, SQS, EMR, PySpark, QuickSight, Zero-ETL • Python, SQL

☁️

Cloud Support Engineer

Mar 2021 – Nov 2022

Amazon Web Services (AWS)

Specialized in AWS Big Data services (EMR, Glue, Kinesis, Redshift, Athena) providing expert-level technical support to enterprise customers globally.
Resolved complex multi-service troubleshooting cases involving Hadoop ecosystem (HDFS, Hive, Spark, Tez) and cross-service architectural issues.
Mentored new hires and delivered internal training on AWS Big Data services; supported the launch of the AWS re:Post community platform.
Tested and evaluated emerging AWS services pre-release, contributing feedback that shaped service GA readiness.

Stack: EMR, Glue, Kinesis, Redshift, Athena, S3, Lambda, EC2, VPC, IAM, CloudWatch, SageMaker • Hadoop ecosystem

🧬

Big Data Engineer

Oct 2018 – Mar 2021

ProKarma

Built end-to-end data pipelines for pharmaceutical data processing using the Hadoop ecosystem on accelerated agile timelines.
Developed real-time prescription benefit application handling high-volume transactions with Kafka-driven streaming pipelines.
Implemented batch and streaming ETL/ELT solutions (Spark, MapReduce, Hive, Sqoop) in collaboration with product owners and business analysts.

Stack: Spark, MapReduce, Hive, Kafka, Sqoop • Python, Oozie

// what I've built

Key Projects

🤖

Generative AI Store Assistant

An AI-powered assistant enabling associates across 500+ stores to query complex operational data in natural language via handheld devices. Built on AWS Bedrock with vector caching for low-latency responses.

📈 Measurable productivity improvements across 500+ stores

AWS Bedrock · DynamoDB · DAX · Lambda · Python · LLM

📊

Self-Service BI Management Platform

A Streamlit-based platform that empowered non-technical business users to manage their own QuickSight dashboard data refreshes, eliminating the need to file engineering tickets for routine operations.

📉 70% reduction in engineering support tickets

Streamlit · QuickSight APIs · Lambda · Python

🏪

Real-Time Food Safety Pipeline

End-to-end food safety data vertical integrating third-party auditor and supplier data via SFTP and REST APIs. Powers real-time compliance visibility dashboards across 500+ grocery stores.

✅ Real-time compliance across 500+ stores

Lambda · S3 · Redshift · REST APIs · SFTP · QuickSight

📱

Device Health Monitoring Dashboard

Analytics dashboards providing senior leadership visibility into performance and health metrics for 10,000+ handheld devices across the store fleet, built with PySpark and EMR for large-scale data processing.

🔍 10,000+ devices monitored in real time

PySpark · EMR · Redshift · QuickSight

🏆

Employee Engagement Gamification Platform

A gamification system designed to drive associate productivity improvements on the store floor. Leveraged Zero-ETL integration between DynamoDB and Redshift Serverless for seamless real-time analytics.

🎮 Measurable productivity uplift for store associates

DynamoDB · Redshift Serverless · Zero-ETL · Python

🔔

Enterprise Pipeline Alerting Framework

A comprehensive alerting and observability system that shifted the team from reactive firefighting to proactive issue resolution. Monitors the health of all data pipelines, with DLQ-based failure recovery.

⚡ Proactive outage prevention across all pipelines

Lambda · SES · CloudWatch · SQS DLQs · Python

// thoughts & insights

Blog

AI / GenAI • Medium · Towards AI

We Gave AI a Brain. Now We're Giving It a Job.

AI is no longer just a research curiosity — it's being put to work in real enterprise environments. A deep dive into what it actually means to operationalize AI, the challenges of building production-grade AI systems, and what comes next.

Data Eng • Medium

The Hidden Cost of Pipeline Coupling: How One 2 AM Incident Changed Everything

A candid account of a late-night production incident that exposed the real dangers of tightly coupled data pipelines — and the architectural lessons that came out of it. What the 2 AM alert revealed about system design.

Python • Medium · Analytics Vidhya

Web Scraping Wiki Tables Using BeautifulSoup and Python

A practical, hands-on guide to extracting structured table data from Wikipedia using Python and BeautifulSoup. Covers parsing HTML, handling edge cases, and turning raw web data into clean, usable datasets.

✍️ Read more on my Medium profile →

// academics & credentials

Education & Certifications

Education

M.S. Information Systems

University of Texas at Arlington

GPA: 3.57 / 4.0 · Class of 2018

Wayne Watts Graduate Fellowship recipient · Graduate Student Senate Treasurer · Leadership Honors Program

B.E. Computer Science

University of Mumbai

GPA: 3.3 / 4.0 · Class of 2015

Certifications & Achievements

🏅

AWS Certified Solutions Architect – Associate

Amazon Web Services

🎤

Technical Workshop Facilitator

Led workshops on AWS Big Data services for internal teams and customers

📄

Published Research Papers

Cloud security and encryption methodologies

🏆

Wayne Watts Graduate Fellowship

Awarded for academic excellence — University of Texas at Arlington
