Somto A. Mbah

Senior Data Engineer & Developer

Senior Data Engineer with 6+ years of expertise in Big Data Architecture and Full-Stack Development. Proven track record in designing scalable ETL/ELT pipelines using Apache Spark and Airflow, orchestrating containerized workloads with Kubernetes, and provisioning AWS Cloud Infrastructure (Redshift, EC2) via Terraform. Expert in Python, SQL, and React-based visualization to drive operational efficiency and business intelligence.

Skills

Data Engineering & Analytics

🐍Python
🐼Pandas
🔢NumPy
🍃Airflow
📨Kafka
🔥PyTorch
🗃️SQL (PostgreSQL)
🍃NoSQL (MongoDB)
🔄ETL/ELT Pipelines
🏢Data Warehousing
📐Data Modeling
🐘Big Data (Spark, Hadoop)
📊Power BI
🏷️Google Tag Manager

Web Development

💻JavaScript
⚛️React
▲Next.js
📱Expo
🐘PHP
🟩Vue.js
💚Nuxt.js
📊D3.js
🌐HTML5
🎨CSS3
🔌RESTful APIs

Cloud & DevOps

☁️AWS (EC2, Redshift, Lambda)
🐳Docker
☸️Kubernetes
🏗️IaC (Terraform)
🚀CI/CD Pipelines (GitHub Actions)
🐙Git
⚙️Automation

Methodologies

🏃Agile
📋SDLC
🔴TDD (Test Driven Development)
🧪Unit Testing (PyTest/Jest)
📉Statistical Analysis
🤖Prompt Engineering

Education

B.Sc. Computer Science

Minor in Statistics

University of Manitoba

2014 - 2018

Experience

Full Stack Developer & Senior Data Engineer

RK Publishing

Jun 2021 – Present

  • Custom CRM Engineering: Architected a scalable Python CRM, implementing Redis caching to slash query latency by 20% and drive a 35% increase in sales team operational throughput.
  • Data Migration & Architecture: Containerized Apache Spark (PySpark) ETL pipelines using Docker and orchestrated CI/CD workflows via GitHub Actions and Airflow to migrate TB-scale datasets into AWS Redshift, ensuring GDPR compliance and 98% data integrity.
  • Strategic Technical Leadership: Directed technical strategy for system enhancements, serving as the subject matter expert on software architecture; drove a 40% improvement in user adoption and a 25% reduction in reported bugs through proactive root-cause analysis.
  • Performance Optimization: Established a robust CI/CD pipeline using GitHub Actions to trigger automated PyTest and Jest suites, replacing manual QA and reducing production issues by 30% while adhering to GDPR data privacy standards.
  • Client Solutions: Partnered with key accounts to resolve complex technical challenges, resulting in a 15% increase in client satisfaction scores.

Lead Real-time Data Analyst

24-7 Intouch

Jan 2019 – Apr 2021

  • Analytics Operations: Directed real-time data analytics operations, leveraging Power BI and SQL to optimize resource allocation, driving a 20% increase in team productivity and a 5% reduction in operational downtime.
  • Process Automation: Engineered automated ETL ingestion workflows to replace manual reporting, increasing data accuracy by 25% and ensuring 99.9% uptime for C-suite executive dashboards.
  • Predictive Modeling: Optimized workforce planning by analyzing historical data trends, resulting in a 30% increase in Service Level Objective (SLO) attainment.
  • Team Leadership: Led technical training and onboarding initiatives, reducing time-to-productivity for new analysts by 40%.

Projects

WE Properties App

A luxury real-estate application concept featuring a reactive 'alive' interface, advanced filtering, and a premium golden aesthetic.

Next.js, Framer Motion, TailwindCSS, UX Design
GitHub

OpenMetric ETL Platform

End-to-end data pipeline built with Python and Airflow, featuring automated quality validation and interactive React/D3.js dashboards for cryptocurrency sentiment analysis.

Python, Airflow, AWS S3, dbt, Great Expectations, Docker
GitHub

Bonjour Book

An interactive bilingual digital book for children featuring custom SVG illustrations, text-to-speech narration, and word highlighting.

React, TypeScript, Web Speech API, SVG
GitHub

Bellabeat Case Study

Analysis of Fitbit fitness-tracker data to identify usage trends and inform marketing strategy for a health-focused tech company.

Python, Data Analysis, Pandas, Matplotlib
View on Kaggle

© 2025 Somto A. Mbah. All rights reserved.