Piotr Parkitny

Explore my data-driven projects and analyses that cover everything from machine learning to data visualization.

View the Project on GitHub pparkitn/pparkitn

Data Jedi Master

Welcome to my data science playground on GitHub! I’m Piotr Parkitny, and I’ve been harnessing the power of data for over 15 years. Through advanced analytics, machine learning, and AI, I’ve helped businesses and teams drive strategic decisions and actionable insights. Here you’ll find some of my favorite projects and research.

Education

About Me

Projects

Here are some of the projects I’ve worked on:

  1. Face Emotion Detection
    Using deep learning to recognize human emotions from live video streams
    • Technologies: Python, Docker, Amazon AWS, W&B, MQTT
    • Dataset: FER-2013
    • Challenge: Optimizing the deployment for real-time performance on edge devices
    • Highlights: Trained the deep neural network in the cloud and deployed it to an Nvidia Jetson edge device
    • Try it on Kaggle FaceEmotion

  1. MRI ANALYZER
    Helping medical professionals identify and diagnose cancer with AI.
    • Technologies: Python, PyTorch, Docker, Amazon AWS, W&B
    • Challenge: End-to-end optimization of the AI pipeline for handling large medical image datasets
    • Highlights: Implemented a fully automated pipeline for MRI analysis
    • Try it on Kaggle

  1. Attractiveness Bias
    Exploring whether attractiveness affects LinkedIn connection acceptance.
    • Technologies Used: R
    • Impact: A/B test conducted across LinkedIn profiles, providing insights into human behavioral bias in online connections
    • Challenge: Designing a randomized controlled trial to eliminate confounding factors.
    • Try it on Kaggle
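
One common way to analyze an A/B test like this is a two-proportion z-test on the acceptance rates of the two profile variants. The sketch below is only an illustration in Python (the project itself used R, and its actual analysis may differ); the function name and the counts in the usage lines are hypothetical.

```python
import math


def two_proportion_z_test(accepted_a, sent_a, accepted_b, sent_b):
    """Return (z, two_sided_p) for the difference in acceptance rates."""
    p_a = accepted_a / sent_a
    p_b = accepted_b / sent_b
    # Pooled acceptance rate under the null hypothesis of no difference
    pooled = (accepted_a + accepted_b) / (sent_a + sent_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / sent_a + 1 / sent_b))
    z = (p_a - p_b) / std_err
    # Two-sided p-value from the standard normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts, not results from the study
z, p_value = two_proportion_z_test(accepted_a=60, sent_a=100,
                                   accepted_b=40, sent_b=100)
print(f"z = {z:.2f}, p = {p_value:.4f}")
```

Randomizing which profile sends each request is what lets a simple test like this attribute a difference to the treatment rather than to confounders.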

  1. Coming Soon - Human Pose Estimation
    • Description: Tracking human pose for rock climbing style comparison
    • Technologies Used: Python, OpenCV (cv2)

  2. Counterfeit: Sentiment Improved Text Summarization
    Improving NLP text summarization by incorporating sentiment analysis.
    • Technologies Used: PyTorch

Forecasting

  1. Forecasting using Prophet
    • Description: Forecasting the Canadian Consumer Price Index (CPI) using Prophet
    • Try it on Kaggle

  2. Coming Soon - Spark - Time Series Prediction
    • Description: Building a large windowed feature set for predicting future values
    • Technologies Used: Spark / Python

Machine Learning

Here are some examples of machine learning using Python:

  1. Clustering
    • Description: Clustering using K-means, with PCA used to reduce the data to two dimensions for graphing
    • Try it on Kaggle
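
As a rough illustration of the idea, the sketch below clusters points with a plain K-means loop and then projects them onto the first two principal components so the clusters can be plotted. It is written from scratch with NumPy on synthetic data and is not the notebook's code; the function names, the farthest-first initialization, and the data are all assumptions made for the example.

```python
import numpy as np


def kmeans(X, k, n_iter=50):
    """Plain Lloyd's algorithm; returns (labels, centroids)."""
    # Farthest-first initialization (an assumption for this sketch): start at
    # the first row, then repeatedly pick the point farthest from the
    # centroids chosen so far.
    chosen = [X[0]]
    for _ in range(1, k):
        nearest = np.min([np.linalg.norm(X - c, axis=1) for c in chosen], axis=0)
        chosen.append(X[nearest.argmax()])
    centroids = np.array(chosen)
    for _ in range(n_iter):
        # Assign each point to its nearest centroid
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute centroids; keep the old one if a cluster is empty
        updated = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(updated, centroids):
            break
        centroids = updated
    return labels, centroids


def pca_2d(X):
    """Project X onto its first two principal components via SVD."""
    centered = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:2].T


# Synthetic 4-dimensional data: two well-separated blobs
rng = np.random.default_rng(42)
X = np.vstack([
    rng.normal(loc=0.0, scale=0.5, size=(50, 4)),
    rng.normal(loc=5.0, scale=0.5, size=(50, 4)),
])
labels, centroids = kmeans(X, k=2)
coords = pca_2d(X)  # 2D coordinates, ready to scatter-plot colored by label
```

Clustering in the original feature space and using PCA only for display keeps the plot from influencing the cluster assignments.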

  1. Classification
    • Description: Classification, with the final result implemented in SQL
    • Try it on Kaggle

  1. Model Performance Tracking
    • Description: Weights & Biases helps AI developers build better models faster. Quickly track experiments, version and iterate on datasets, evaluate model performance, reproduce models, and manage your ML workflows end-to-end.

  1. LDA vs. PCA
    • Description: Comparison of LDA and PCA 2D projections of the Iris dataset
    • Try it on Kaggle

  1. Regression
    • Description: Using regression to predict the next-day stock price from historical prices
    • Try it on Kaggle

  1. Classification - Random Forest
    • Description: Random Forest Classification with Cross-Validation / Hyperparameter Tuning / Stratified Fold & SHAP Analysis
    • Try it on Kaggle
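
To make the "stratified fold" part concrete, here is a small standard-library sketch of how stratified folds can be built so that every fold keeps roughly the same class ratio as the full dataset. It only illustrates the splitting step, not the Random Forest, tuning, or SHAP parts, and it is not taken from the notebook; the function names and labels are hypothetical.

```python
import random
from collections import defaultdict


def stratified_folds(labels, n_folds=5, seed=0):
    """Assign each sample index to a fold, keeping class ratios similar."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(n_folds)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal indices round-robin so each fold gets an even share of the class
        for position, idx in enumerate(indices):
            folds[position % n_folds].append(idx)
    return folds


def train_test_splits(folds):
    """Yield (train_indices, test_indices) with each fold held out once."""
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# Hypothetical imbalanced labels: 80 negatives and 20 positives
example_labels = [0] * 80 + [1] * 20
for train, test in train_test_splits(stratified_folds(example_labels)):
    positives = sum(example_labels[i] for i in test)
    print(f"test size = {len(test)}, positives in test = {positives}")
```

With imbalanced classes, stratifying keeps each validation fold representative, so cross-validated scores used for hyperparameter tuning are less noisy.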

Computer Vision

  1. Object Detection in Video
    • Description: Object detection in videos using YOLO (You Only Look Once)
    • Try it on Kaggle

Experiment Tracking

  1. Cumulative Conversion
    • Description: Graphing performance of different groups over time
    • Try it on Kaggle
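
The quantity behind such a graph can be sketched as a running conversion rate per group: cumulative conversions divided by cumulative visitors up to each day. The example below is a standard-library illustration under that assumption; the tuple layout, function name, and numbers are hypothetical and not taken from the notebook.

```python
from collections import defaultdict
from itertools import accumulate


def cumulative_conversion(events):
    """Compute the running conversion rate per group.

    events: iterable of (day, group, visitors, conversions) tuples.
    Returns {group: [(day, cumulative_rate), ...]} ordered by day.
    """
    per_group = defaultdict(list)
    for day, group, visitors, conversions in sorted(events):
        per_group[group].append((day, visitors, conversions))
    result = {}
    for group, rows in per_group.items():
        days = [row[0] for row in rows]
        cum_visitors = list(accumulate(row[1] for row in rows))
        cum_conversions = list(accumulate(row[2] for row in rows))
        result[group] = [
            (day, conv / vis if vis else 0.0)
            for day, vis, conv in zip(days, cum_visitors, cum_conversions)
        ]
    return result


# Hypothetical daily counts for a control and a variant group
events = [
    (1, "control", 100, 10), (2, "control", 100, 12),
    (1, "variant", 100, 14), (2, "variant", 100, 18),
]
for group, series in cumulative_conversion(events).items():
    print(group, series)
```

Plotting each group's series against the day shows whether the gap between groups stabilizes as data accumulates.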

Data Engineering Examples

  1. Centralized Repository of Medical Data
    • Description: Compiling large, up-to-date, secure and centralized repositories of health records is a major challenge facing medical data science. Achieving this will allow advanced ML/DL techniques to be applied to sufficiently large datasets for optimal performance and impact
    • Technologies Used: Spark, Flask, Kafka, Python, Cloudera, Presto

  2. ML Execution - Windows Server
    • Description: How to execute your Python notebook and log the execution
    • Technologies Used: Python, papermill

Running Parameterized Jupyter Notebooks

Installation

pip install papermill

Execute_ScoreModel.bat

@ECHO OFF

REM Build a filename-safe timestamp from the locale-dependent date and time variables
set _my_datetime=%date%_%time%
REM Replace spaces, slashes and dots with underscores, and drop colons
set _my_datetime=%_my_datetime: =_%
set _my_datetime=%_my_datetime::=%
set _my_datetime=%_my_datetime:/=_%
set _my_datetime=%_my_datetime:.=_%

ECHO ======================================================================================================================
ECHO %_my_datetime%
ECHO ======================================================================================================================

REM Run the notebook with parameter par1=0 and save the executed copy as a timestamped log notebook
C:\Users\piotr\AppData\Roaming\Python\Python311\Scripts\papermill.exe C:\Models\Score.ipynb -p par1 0 C:\Models\Log\Score_Log_%_my_datetime%.ipynb

When scheduling with Task Scheduler, append a redirect to the arguments to capture the shell output in a log file, for example: > C:\Models\log.txt 2>&1
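
The same idea can be sketched in Python for anyone who prefers a script over a batch file. This is a hedged equivalent, not part of the original setup: it assumes `papermill` is on the PATH, the function names are hypothetical, and the timestamp format is a choice made for the example (the batch version depends on the machine's locale).

```python
import subprocess
from datetime import datetime
from pathlib import PureWindowsPath


def build_papermill_command(notebook, log_dir, params, now=None):
    """Return the papermill argument list with a timestamped output notebook."""
    now = now or datetime.now()
    stamp = now.strftime("%Y_%m_%d_%H%M%S")
    notebook = PureWindowsPath(notebook)
    output = PureWindowsPath(log_dir) / f"{notebook.stem}_Log_{stamp}.ipynb"
    command = ["papermill", str(notebook), str(output)]
    # Each notebook parameter is passed as: -p name value
    for name, value in params.items():
        command += ["-p", name, str(value)]
    return command


def run_notebook(notebook, log_dir, params):
    """Execute the notebook; raises CalledProcessError if papermill fails."""
    subprocess.run(build_papermill_command(notebook, log_dir, params), check=True)


# Only builds and prints the command; call run_notebook(...) to execute it
print(build_papermill_command(r"C:\Models\Score.ipynb", r"C:\Models\Log", {"par1": 0}))
```

Saving each run as a separate timestamped notebook keeps a full record of the cell outputs, which makes failed scheduled runs easier to diagnose.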

  1. MLflow
    • Description: Build better models and generative AI apps on a unified, end-to-end, open source MLOps platform
    • Technologies Used: Python, MLflow

Charts / Graphs

  1. Line Graph
    • Seaborn Line Graph with annotations
    • Try it on Kaggle

  1. Line Graph - Dual Y Axis
    • Seaborn Line Graph with annotations and two y-axes
    • Try it on Kaggle

  1. HeatMap
    • Seaborn Heat Map with annotations
    • Try it on Kaggle

  1. BarGraph
    • Seaborn Bar Graph
    • Try it on Kaggle

  1. WordCloud
    • WordCloud using text from the Canada Wikipedia page
    • Try it on Kaggle

  1. Correlation
    • Correlation Matrix using Seaborn
    • Try it on Kaggle

  1. Association
    • Association Matrix using Seaborn (Includes Categorical Features)
    • Try it on Kaggle

  1. Important Features
    • Identifying the features in a dataset that have the most impact on the target
    • Try it on Kaggle

  1. Timeline of Events
    • Simple Timeline of events graph
    • Try it on Kaggle

  1. Canada Inflation & CPI
    • Graphing and Forecasting CPI and Inflation
    • Try it on Kaggle

Other

  1. GPU Support Setup - Local Computer
    • Setting up GPU support on Ubuntu for model training
  2. DataSets - OpenML
    • A great source if you are looking for a dataset to work with
  3. AWS - Commands
    • General AWS Commands
  4. Spark
    • General Spark Python
  5. Jupyterlab
    • jupyter lab --port 3939 --allow-root --no-browser --NotebookApp.token='pass' --ip="0.0.0.0"
  6. MLflow
    • mlflow server --host 127.0.0.1 --port 8080

Kaggle

  1. Kaggle - Free GPU Training
    • GPU Training for free on Kaggle
  2. Kaggle - Create a DataSet
    • Store Data on Kaggle for Free