Hi, I'm Kristina Halbig.

An endlessly curious and adaptable data scientist who picks up new tools and techniques with ease, driven to uncover insights from complex data that lead to positive change and meaningful impact.

About

Hello! I'm a data scientist with a unique blend of eight years of analytical chemistry experience and a passion for uncovering the stories hidden within data. As a lifelong learner, I'm always eager to expand my skillset and implement new techniques to tackle challenging problems.
My toolbox is well-stocked with Python, Pandas, Scikit-Learn, SQL, and JupyterLab, backed by a solid understanding of supervised/unsupervised learning and natural language processing (NLP). I'm a tenacious problem-solver, driven to deliver high-quality work that makes a positive impact on others.
So, if you're looking for a data scientist who combines technical expertise with a touch of chemistry and a whole lot of determination, you've come to the right place. Let's dive into the data together and unravel its secrets!

  • Coding: Python, Pandas, NumPy, Scikit-Learn, Data Cleaning, Data Wrangling, API, spaCy, SQL, Keras, TensorFlow
  • Visualization: Matplotlib, Seaborn, Tableau
  • Libraries: NumPy, Pandas, OpenCV
  • Software: GitHub, JupyterLab, Slack, Streamlit
  • Machine Learning: Classification, Clustering, Decision Trees, Linear/Logistic Regression, Neural Networks, NLP, Predictive Modeling, Supervised/Unsupervised Learning

I'm seeking opportunities to apply my skills at a dynamic organization that values learning, embraces challenges, and supports career development and growth.

Experience

Data Science Immersive Fellow
  • Successfully completed a rigorous 12-week data science bootcamp program, totaling over 400 hours of instruction and hands-on learning.
  • Developed various open-ended data science projects to explore different issues and improve data skills.
  • Built, evaluated, and optimized machine learning and statistical models, including classifiers.
  • Developed strong problem-solving skills through hands-on practice with real-world datasets and case studies.
  • Tools: Python, Pandas, JupyterLab, GitHub, Scikit-Learn, spaCy, Google Colab
March 2024 - June 2024 | Remote
Senior Analytical Chemist
  • Led data wrangling of over 10,000 data points per day, supporting 270 clients and generating $600k in monthly revenue.
  • Developed and implemented Tableau dashboards to visualize 500 geographic data points, resulting in improved understanding of business performance and strategic decision-making.
  • Implemented a QBench LIMS to automate the upload and approval of data, ensuring proper handling of more than 5k data points per day and reducing the error rate from 15% to under 5%.
  • Trained analysts and lab supervisors on the operation of analytical chemistry equipment and data analysis, ensuring all trainees received scores of at least 4/5 on all KPIs and increasing productivity by 14% within the department.
  • Tools: Tableau, Microsoft Excel, QBench, Sciex OS (Data Analysis Software)
July 2018 - Nov 2024 | Pasadena, CA
Analytical Chemist
  • Directed analysis and data wrangling of diverse ingredient datasets across 100 product lines, processing data packets for about 20 samples daily. Led training, oversaw analytical instrument maintenance, and documented lab procedures.
  • Maintained compliance with industry-standard regulations by overseeing the audit of 75 standard operating procedures and assisting with the acquisition of an ISO 17025 certification.
  • Performed cost analysis, resulting in $400 daily savings through efficient purchasing.
  • Tools: Microsoft Excel, Agilent MassHunter (Data Analysis Software)
June 2015 - July 2018 | Burbank, CA

Projects

Hack for LA Volunteer

Changing LA for the better through data science.

Accomplishments
  • Tools: Data Exploration, Dataset Evaluation
  • Review datasets from the LA Controller's website, organize them into an Excel sheet with key details, and identify high-value datasets for future initiatives.
Knitting Pattern Recommender

A knitting recommender based on the Ravelry API.

Accomplishments
  • Tools: Python, Scikit-learn, spaCy, Streamlit, API Scripting
  • Developed a knitting pattern recommender system using Ravelry API data, implemented in a Streamlit website that suggests similar patterns.
Subreddit Post Classifier

A model that categorizes pen-related posts into pens or fountainpens subreddits.

Accomplishments
  • Tools: spaCy, API Scripting, Logistic Regression, Naive Bayes
  • Scraped posts (titles and content) from Reddit's pens and fountain pens subreddits. Trained a Naive Bayes classifier to identify the subreddit of each post based on text.
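
A minimal sketch of this kind of text classifier, assuming scikit-learn's `CountVectorizer` and `MultinomialNB`. The example posts, the labels, and the `build_classifier` helper are hypothetical stand-ins for illustration, not the scraped Reddit data or the project's actual code.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline


def build_classifier():
    """Bag-of-words counts feeding a multinomial Naive Bayes model."""
    return make_pipeline(CountVectorizer(stop_words="english"), MultinomialNB())


# Hypothetical toy posts standing in for scraped titles and content.
posts = [
    "Which gel pen writes smoothest on glossy paper?",
    "Best ballpoint refill for everyday note taking",
    "My rollerball keeps skipping, any refill suggestions?",
    "Flushing a piston filler before changing ink",
    "Gold nib versus steel nib for a first fountain pen",
    "Shimmer ink clogged my converter and feed",
]
labels = ["pens", "pens", "pens", "fountainpens", "fountainpens", "fountainpens"]

model = build_classifier()
model.fit(posts, labels)

# Predict the subreddit for an unseen post.
prediction = model.predict(["Which ink is safe for a vintage piston filler nib?"])[0]
print(prediction)
```

On real data the posts would be split into training and test sets so that accuracy is measured on posts the model has not seen.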
Multiple PDF Reader & Chatbot

Chat with multiple PDFs using the ChatGPT API.

Accomplishments
  • Tools: LangChain, OpenAI API, Streamlit, VS Code
  • Built an app with LangChain, the OpenAI API, and Streamlit that lets users upload PDFs and chat about or query their content.
Budget Airline Model

Logistic regression model predicting budget airlines from NTSB data.

Accomplishments
  • Tools: HTML, CSS, Bootstrap, Flask, SQLAlchemy, PostgreSQL, Python
  • Cleaned and preprocessed NTSB aviation accident dataset with missing and inconsistent data.
  • Used grid search to optimize model parameters and predict budget airlines.
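
A minimal sketch of grid search over a logistic regression model, assuming scikit-learn's `GridSearchCV`. The synthetic data from `make_classification` stands in for the cleaned NTSB features, and the parameter grid is a hypothetical choice for illustration, not the grid used in the project.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic binary data standing in for the cleaned NTSB features (assumption).
X, y = make_classification(
    n_samples=300, n_features=8, n_informative=4, random_state=42
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)

# Scale features, then fit logistic regression.
pipeline = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))

# Hypothetical grid: only the regularization strength C is tuned here.
param_grid = {"logisticregression__C": [0.01, 0.1, 1.0, 10.0]}

# 5-fold cross-validation on the training split picks the best C.
search = GridSearchCV(pipeline, param_grid, cv=5, scoring="accuracy")
search.fit(X_train, y_train)

best_c = search.best_params_["logisticregression__C"]
test_accuracy = search.score(X_test, y_test)
print(f"best C: {best_c}, held-out accuracy: {test_accuracy:.2f}")
```

Keeping the scaler inside the pipeline means it is refit on each cross-validation fold, so no information from the validation fold leaks into training.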
Home Price Predictions

Predictive model for home prices based on various home features.

Accomplishments
  • Engineered features from a large dataset of home sales, including bathrooms, bedrooms, garage, and square feet.
  • Tested and optimized model performance to avoid overfitting and improve price predictions.

Skills

Coding & Libraries

Python
Pandas
scikit-learn
NumPy
PostgreSQL
spaCy

Visualizations

Matplotlib
Seaborn
Tableau

Software

Git
JupyterLab
Slack
Streamlit

Education

General Assembly

Remote | March 2024 - June 2024

Program: Data Science Immersive Certification

    Relevant Coursework:

    • Exploratory Data Analysis
    • Machine Learning
    • Natural Language Processing
    • Deep Learning
    • Feature Engineering
    • Final Capstone Project & Presentation

California State University, Northridge

Northridge, CA

Degree: Bachelor of Science in Chemistry
GPA: 3.51

    Relevant Coursework:

    • Data Driven Experimentation
    • Analytical Techniques
    • Advanced Chemistry Studies
    • Hands-on Laboratory Techniques
    • Biochemistry Fundamentals

Contact