Stephen Situ - Full Stack Data Professional

Certifications

LeetCode Profile

  • LeetCode Profile
  • Achievements

    Generative AI / LLM

    AI Project 1 – Using LangGraph to Create Agentic Graph Workflows for Text Classification and Sentiment Analysis

    AI Project 2 – Creating a Local RAG (Retrieval Augmented Generation) System with PyTorch and Ollama

    AI Project 3 – Creating a Math MCP Server and ReAct Agent to Demonstrate Tool Decoupling

    Full Stack Development

    Full Stack Project 1 – Calculator App with a JavaScript/React front end and a Java Spring Boot back end. Below are the relevant repositories and live links:

    Full Stack Project 2 - Golang Linear vs Concurrent Programming Benchmark

    Data Analytics/Data Science/Machine Learning

    Data Project 1 - Video Game Sales

    Data Project 2 - Gender Prediction Using Body Measurements with the Random Forest Algorithm

    Data Project 3 - Predicting Voting Behaviour Using XGBoost

    Data Project 4 - Python Data Cleaning Car Sales Data

    Data Project 5 - Visualizing Canada Election Data

    Data Project 6 - Using TensorFlow/Keras to Implement a Deep Learning/Neural Network Regression Model for Air Quality

    Data Project 7 - Commodity Price Time Series Analysis, Correlation, and Forecasting Using the Seasonal Naive, ETS, and ARIMA Methods

    Data Project 8 - PCA (Principal Component Analysis) on Credit Card Transaction Data

    Data Project 9 - SQL Querying Practice on Retail Sales Data

    Data Project 10 - Power BI Dashboard on Financial Data

    Data Project 11 - A/B Testing & Hypothesis Testing on Fast Food Marketing Campaign

    Data Project 12 - K-means Clustering on Wine Data

    Data Project 13 - Using Twitter API to find Word Frequency and create a Word Cloud

    Data Project 14 - Cohort Analysis and Customer Churn on Online Retail Sales Data

    Data Project 15 - Monte Carlo Simulation for Tesla and Apple Stock Price

    Data Project 16 - Pumpkin Seed Classification Using LDA (Linear Discriminant Analysis), QDA (Quadratic Discriminant Analysis), and SVM (Support Vector Machine)

    Data Project 17 - Pumpkin Seed Classification Using KNN (K-Nearest Neighbors)

    Data Project 18 - Logistic Regression for Binary Classification of Breast Cancer Diagnosis

    Data Project 19 - IBM Cognos Analytics Dashboard for Retail Sales Part 1 Part 2 Part 3

    Data Project 20 - Animal Multi-class Classification Neural Network Using TensorFlow and TensorBoard Part 1 Part 2 Part 3 Part 4

    Data Project 21 - Investigating NLP (Natural Language Processing) Models for Tweet Classification using Naive Bayes, Simple Dense NN, LSTM NN, GRU NN, Bidirectional NN, Conv1D NN, and TF Hub models, along with the speed/score tradeoff and visualizations in the TensorFlow Embedding Projector Part 1 Part 2

    Data Project 22 - Time Series Modeling on Temperature and Climate Data, including Correlation, Windows and Horizons, FF NNs, LSTM NNs, Conv1D NNs, and Univariate/Multivariate Modeling

    Data Project 23 - Methane Emissions Computer Vision Model using Convolutional Neural Network - Part of 2023 Zeroing Methane Emissions Datathon

    Data Engineering

    Data Project 1 - Web Scraping Real Estate Data with the Python BeautifulSoup Library

    Data Project 2 - Conducting an ETL (Extract-Transform-Load) Process on CSV, JSON, and XML Files

    Data Project 3 - Working with Apache Spark DataFrames using PySpark and Spark SQL

    Data Project 4 - Using scikit-learn Pipelines and Transformers for Data Preprocessing

    Data Project 5 - Advanced SQL Querying on an IBM Db2 Database with SQLAlchemy

    Data Project 6 - Using CRUD Operations on a MongoDB Database with pymongo in Python

    Data Project 7 - Using CRUD Operations on a DataStax Astra Cassandra-based Database in Python

    Data Project 8 - Scheduling Cron Jobs with Linux Shell & Bash to Automate Python Scripts Part 1 Part 2 Part 3 Part 4 Part 5

    Data Project 9 - Using Apache Airflow for Data Orchestration and Scheduling/Monitoring Workflows Part 1 Part 2 Part 3 Part 4 Part 5 Part 6 Part 7

    Data Project 10 - Working with an Aiven Apache Kafka Service for Streaming Data and Real-time Data Pipelines Part 1 Part 2

    Data Project 11 - Deploying a Breast Cancer Logistic Regression Model using FastAPI for API endpoints and render.com as a hosting service

    Data Project 12 - Creating Scalable and Efficient Data Pipelines for "Big Data" using Spark Pipelines and the TensorFlow tf.data.Dataset API

    Data Project 13 - Data Modeling using ERD (Entity Relationship Diagrams) for Olympic Medal Data using Fact/Dimension Tables in Star/Snowflake Schemas Part 1 Part 2

    Data Project 14 - MLflow for Machine Learning Experiment/Model Tracking and Deployment for Elastic Net Linear Regression Part 1 Part 2 Part 3 Part 4 Part 5 Part 6 Part 7

    DevOps

    DevOps Project 1 - Rental Data Web Scraper/Model Prediction End-to-End FastAPI Application Deployed Using Docker