Ragul Paramasivam

Senior Data Engineer

Seattle, WA

Education

Northeastern University
Master of Science in Data Architecture and Management
Seattle, Washington
Expected December 2025
  • Dean's Scholarship Recipient
  • Focus: Data Lake Architecture, Distributed Systems, ML Engineering
Amrita University
Bachelor of Technology in Computer Science
Coimbatore, India
June 2020

Technical Skills

Data Engineering

Apache Spark, Apache Kafka, Airflow, PySpark, Flink-style streaming, ETL/ELT Pipelines, Real-time & Batch Processing

Programming & Databases

Python, SQL, TypeScript, PostgreSQL, MongoDB, Neo4j, Snowflake, Elasticsearch, Redis

Cloud Platforms

AWS (EC2, ECS, Lambda, S3, Route 53, VPC), GCP (BigQuery, Dataproc, Cloud Storage), Azure (Databricks, Data Lake Storage)

Infrastructure & DevOps

Terraform, Docker, Kubernetes, Jenkins, GitLab CI/CD, Infrastructure as Code (IaC)

ML & Analytics

PyTorch, OpenCV, ML Pipeline Development, Time-Series Forecasting, Computer Vision, Power BI, Tableau, Streamlit

Architecture

Data Lake Design, Microservices, Distributed Systems, Data Modeling, Stream & Batch Architecture

Professional Experience

Roboteon
Software Intern
San Jose, California
May 2024 - December 2024
  • Architected AWS infrastructure using ECS clusters, custom VPCs, and Terraform IaC, automating provisioning and reducing deployment time by 40%
  • Built and maintained GitLab CI/CD pipelines for 15+ microservices and ML services
  • Integrated OpenCV-based 6D pose estimation service, improving item identification accuracy by 25% and throughput by 30%
  • Engineered Neo4j-based graph database solution for warehouse traffic mapping, reducing average travel time by 20%
PurpleScape
Senior Technology Engineer
Chennai, India
April 2022 - May 2023
  • Architected and deployed Spark- and Airflow-based data pipelines processing 5TB+ of data monthly, reducing ML training time by 30%
  • Developed 8+ production-grade TypeScript microservices supporting 50,000+ MAU for an NLP-driven analytics platform
  • Managed multi-cloud infrastructure maintaining 99.9% uptime for mission-critical analytics workloads
  • Created shared Jenkins CI/CD pipeline infrastructure used by 12+ services, reducing deployment cycle time by 60%
Technology Engineer
Chennai, India
November 2020 - March 2022
  • Built backend data pipelines powering NLP-based text autocomplete, improving query completion speed by 40%
  • Designed dialogue flow management system enabling context-aware, dynamic conversation routing
  • Implemented Terraform infrastructure-as-code scripts, reducing setup time from days to hours

Key Projects

Steam Marketplace Real-Time Price Forecasting Platform
Big Data, ML, Real-Time Analytics, Google Cloud Platform
  • Designed end-to-end big data and ML pipeline on GCP to analyze and predict pricing trends for 10,000+ marketplace items
  • Built streaming data ingestion with Apache Kafka, processing 50GB+ of transaction data daily and 5,000+ price updates per hour
  • Developed PyTorch LSTM time-series forecasting models achieving 87% prediction accuracy with <200ms inference latency
  • Automated infrastructure deployment with Terraform, reducing setup time by 70%
Dota 2 Tournament Analytics & Data Warehouse
Azure, ETL, Data Modeling, Data Warehousing
  • Built production ETL pipeline in Azure Databricks processing 50,000+ professional match records from OpenDota APIs
  • Designed star schema data warehouse in Azure Data Lake Storage optimized for analytical queries
  • Developed PySpark jobs for cleansing, validating, and aggregating multi-modal structured data
  • Created interactive Power BI dashboards analyzing competitive meta trends and strategic insights