~/stephen-situ/portfolio
GitHub
LeetCode
LinkedIn
ML/AI Data Engineer
Stephen Situ
Machine Learning
Data Engineering
Cloud Architecture
Generative AI
MLOps
LLM Systems
location: Calgary, AB, Canada
01
Certifications
☁
Cloud Platforms
6 certs
AWS Solutions Architect – Associate
↗
AWS Machine Learning – Specialty
↗
Azure Data Engineer Associate
↗
Microsoft Fabric Analytics Engineer
↗
Microsoft Fabric Data Engineer
↗
Snowflake SnowPro Core
↗
⬡
DevOps & Infrastructure
3 certs
CKAD: Certified Kubernetes Application Developer
↗
HashiCorp Terraform Associate (003)
↗
GitHub Actions
↗
⚙
Data Engineering
5 certs
Databricks Data Engineer Professional
↗
Databricks Apache Spark 3.0 Developer
↗
Apache Airflow Fundamentals
↗
Airflow DAG Authoring
↗
IBM Data Engineering Foundations
↗
◈
Machine Learning & Generative AI
4 certs
Databricks ML Professional
↗
Databricks Generative AI Engineer
↗
TensorFlow Developer Certificate
↗
LangGraph Foundations
↗
◉
Analytics
2 certs
Google Data Analytics Professional
↗
Google Advanced Data Analytics
↗
02
Achievements
🥉
Zeroing Methane Emissions Datathon — 3rd Place
CDC-Climate Data Catalysts
Website ↗
Video ↗
2023
🥈
YYC Hacks Calgary — 2nd Place
YYC Winter Project
Event ↗
Project ↗
2023
🏆
YYC Hacks Calgary
Calgary Newcomer Statistics Dashboard
Event ↗
Live App ↗
GitHub ↗
2024
03
Competitive Programming
⚡
wssitu
LeetCode Profile ↗
04
Generative AI / LLM
01
Using LangGraph to Create Agentic Graph Workflows for Text Classification & Sentiment Analysis
→
02
Local RAG System with PyTorch and Ollama
→
03
Math MCP Server & ReAct Agent — Demonstrating Tool Decoupling
→
05
Full Stack Development
01
Calculator App — React Frontend + Java Spring Boot Backend
Frontend
Backend
Live Site
Swagger UI
02
Go Sequential vs Concurrent Programming Benchmark
→
06
Data Analytics / Data Science / ML
01
Video Game Sales Analysis
→
02
Gender Prediction Using Body Measurements — Random Forest
→
03
Predicting Voting Behaviour Using XGBoost
→
04
Python Data Cleaning — Car Sales Data
→
05
Visualizing Canada Election Data
→
06
Deep Learning Regression for Air Quality — TensorFlow/Keras
→
07
Commodity Price Time Series — Seasonal Naive, ETS & ARIMA
→
08
PCA on Credit Card Transaction Data
→
09
SQL Querying Practice on Retail Sales Data
→
10
Power BI Dashboard on Financial Data
→
11
A/B Testing & Hypothesis Testing — Fast Food Marketing Campaign
→
12
K-Means Clustering on Wine Data
→
13
Twitter API — Word Frequency & Word Cloud
→
14
Cohort Analysis & Customer Churn — Online Retail
→
15
Monte Carlo Simulation — Tesla & Apple Stock Price
→
16
Pumpkin Seed Classification — LDA, QDA, SVM
→
17
Pumpkin Seed Classification — KNN
→
18
Logistic Regression — Breast Cancer Diagnosis
→
19
IBM Cognos Analytics Dashboard for Retail Sales
Part 1
Part 2
Part 3
20
Animal Multi-class Classification Neural Network — TensorFlow & TensorBoard
Notebook
TB 1
TB 2
TB 3
21
Tweet NLP Classification — Naive Bayes, Dense NN, LSTM, GRU, BiLSTM, Conv1D, TF Hub
Notebook
Embedding Projector
22
Time Series Modeling — Temperature & Climate (FF NN, LSTM, Conv1D, Uni/Multivariate)
→
23
Methane Emissions Computer Vision Model — CNN (2023 Datathon)
→
07
Data Engineering
01
Web Scraping Real Estate Data — Python BeautifulSoup
→
02
ETL Process on CSV, JSON & XML Files
→
03
Apache Spark DataFrames — PySpark & Spark SQL
→
04
Scikit-Learn Pipelines & Transformers for Data Preprocessing
→
05
Advanced SQL Querying on IBM DB2 with SQLAlchemy
→
06
CRUD Operations on MongoDB with pymongo
→
07
CRUD Operations on DataStax Astra Cassandra
→
08
Scheduling Cron Jobs with Linux Shell & Bash
Notebook
Cron
Stock 2
Stock 3
Stock 4
09
Apache Airflow for Data Orchestration & Workflow Scheduling
1
2
3
4
5
6
7
10
Apache Kafka Streaming — Real-Time Data Pipelines
Producer
Consumer
11
FastAPI Deployment — Breast Cancer Logistic Regression on render.com
→
12
Scalable Big Data Pipelines — Spark Pipelines & TF Data API
→
13
Data Modeling — ERD, Fact/Dimension Tables, Star & Snowflake Schemas
Star Schema
Snowflake Schema
14
MLflow — Experiment Tracking & Deployment for Elastic Net Regression
Notebook
1
2
3
4
5
6
08
DevOps
01
Rental Data Web Scraper & ML Prediction — End-to-End FastAPI App in Docker
→