Data analysis & machine learning

A collection of data analysis projects performed on publicly available datasets.

Tech stack:

  • Python
  • Jupyter Notebook

Public transit passenger capacity

Built several forecasting models (SARIMAX, ARIMA, and fbProphet) for time series analysis, predicting the number of transit passengers over the course of several months for a fictional unicorn startup building a new transit system.
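When comparing forecasting models like these, it often helps to have a simple baseline to beat. The sketch below is not the project's code: it is a seasonal-naive forecast evaluated on a holdout window, and the ridership series, the weekly season length, and the function names are all made up for illustration.

```python
def seasonal_naive_forecast(history, horizon, season_length):
    """Repeat the last full season of observations for `horizon` steps."""
    if len(history) < season_length:
        raise ValueError("history must cover at least one full season")
    last_season = history[-season_length:]
    return [last_season[step % season_length] for step in range(horizon)]


def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Synthetic daily ridership (invented numbers): linear trend plus a weekly pattern.
weekly_pattern = [120, 100, 95, 90, 110, 60, 40]
series = [1000 + 2 * day + weekly_pattern[day % 7] for day in range(120)]

# Hold out the last four weeks so the forecast is scored on unseen data.
train, test = series[:-28], series[-28:]
forecast = seasonal_naive_forecast(train, horizon=len(test), season_length=7)
baseline_mae = mean_absolute_error(test, forecast)
print(f"Seasonal-naive MAE on holdout: {baseline_mae:.1f}")
```

A model such as SARIMAX would be scored on the same holdout window; it is only worth keeping if its error is clearly below this baseline.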

Source code


Wine quality

Analyzed wine properties and built a visualization of how they correlate with a wine’s ranking. Wines with volatile acidity between 0.1 and 0.5 and alcohol content of 10-15% generally had the highest ranking scores.
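One way to check a range-based observation like this is to split wines by whether they fall inside the band and compare average scores. This is a minimal sketch with invented rows and hypothetical field names (`volatile_acidity`, `alcohol`, `quality`); it shows the comparison only and says nothing about the real dataset.

```python
from statistics import mean

# Invented sample rows; the real dataset has many more wines and properties.
wines = [
    {"volatile_acidity": 0.30, "alcohol": 12.5, "quality": 7},
    {"volatile_acidity": 0.25, "alcohol": 11.0, "quality": 6},
    {"volatile_acidity": 0.45, "alcohol": 10.5, "quality": 6},
    {"volatile_acidity": 0.70, "alcohol": 9.5, "quality": 5},
    {"volatile_acidity": 0.60, "alcohol": 9.8, "quality": 5},
    {"volatile_acidity": 0.90, "alcohol": 10.2, "quality": 4},
]


def in_band(wine):
    """True when both properties fall inside the ranges named in the text."""
    return 0.1 <= wine["volatile_acidity"] <= 0.5 and 10 <= wine["alcohol"] <= 15


inside = [w["quality"] for w in wines if in_band(w)]
outside = [w["quality"] for w in wines if not in_band(w)]

print(f"mean quality inside band:  {mean(inside):.2f}")
print(f"mean quality outside band: {mean(outside):.2f}")
```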

Source code


Boston housing

Built a model to predict the mean value of a house in Boston: evaluated the dataset, cleaned and visualized the data, analyzed feature correlation, and finally trained the model. Analyzed ethnicity-related bias with respect to house value and location.
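As a rough illustration of the modeling step, the sketch below fits a one-feature least-squares line. The project's actual model and features are not described here, so the feature (number of rooms), the sample values, and the function names are assumptions chosen only to keep the example small.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("xs must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Invented data: rooms per house and house value in thousands of dollars.
rooms = [4, 5, 6, 7, 8]
values = [15, 20, 25, 30, 35]

slope, intercept = fit_line(rooms, values)


def predict(room_count):
    """Predicted value for a house with the given number of rooms."""
    return slope * room_count + intercept


print(f"value ~ {slope:.2f} * rooms + {intercept:.2f}")
```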

Source code


Loan predictor

Built a model predicting whether a loan would be approved based on personal data. Analyzed bias with respect to ethnicity and sex. Achieved 80% accuracy on the test dataset.
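A simple way to look for this kind of bias is to report accuracy alongside the predicted approval rate per group. The sketch below is illustrative only: the records, the `sex` field, and the helper names are invented, and the numbers are unrelated to the project's results.

```python
from collections import defaultdict

# Invented test-set records: 1 = approved, 0 = rejected.
records = [
    {"sex": "M", "actual": 1, "predicted": 1},
    {"sex": "M", "actual": 1, "predicted": 1},
    {"sex": "M", "actual": 0, "predicted": 1},
    {"sex": "M", "actual": 0, "predicted": 0},
    {"sex": "F", "actual": 1, "predicted": 1},
    {"sex": "F", "actual": 1, "predicted": 0},
    {"sex": "F", "actual": 0, "predicted": 0},
    {"sex": "F", "actual": 0, "predicted": 0},
]


def accuracy(rows):
    """Fraction of rows where the prediction matches the actual outcome."""
    return sum(r["actual"] == r["predicted"] for r in rows) / len(rows)


def approval_rate_by_group(rows, key):
    """Share of predicted approvals within each value of `key`."""
    totals = defaultdict(int)
    approved = defaultdict(int)
    for r in rows:
        totals[r[key]] += 1
        approved[r[key]] += r["predicted"]
    return {group: approved[group] / totals[group] for group in totals}


overall_accuracy = accuracy(records)
rates = approval_rate_by_group(records, "sex")
print(f"accuracy: {overall_accuracy:.2f}, approval rates: {rates}")
```

A large gap between group approval rates does not prove bias on its own, but it flags where the model and the training data deserve a closer look.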

Source code


Bigmart sales

Predicted sales for the Bigmart store chain. Tasks performed: data cleaning, data visualization and analysis, feature extraction, model selection, training, validation, and evaluation of the final model on a test dataset.
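Training, validating, and testing on separate data requires splitting the dataset up front. The following is a minimal sketch of such a split; the fractions, the seed, and the function name are assumptions, not the project's settings.

```python
import random


def train_val_test_split(rows, val_fraction=0.2, test_fraction=0.2, seed=0):
    """Shuffle rows reproducibly and cut them into three disjoint parts."""
    if not 0 < val_fraction + test_fraction < 1:
        raise ValueError("validation and test fractions must leave room for training")
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    n_val = round(len(shuffled) * val_fraction)
    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test


# Stand-in for dataset rows; real rows would be feature records.
rows = list(range(100))
train, val, test = train_val_test_split(rows)
print(len(train), len(val), len(test))
```

Model selection uses only the training and validation parts; the test part is touched once, for the final evaluation.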

Source code


NY Taxi

Analyzed the correlation between the weather forecast and the number of daily taxi rides in NYC.
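A correlation like this is commonly measured with the Pearson coefficient. The sketch below computes it by hand; the choice of precipitation as the weather variable and all the daily values are invented for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (spread_x * spread_y)


# Invented daily values: forecast precipitation (mm) and taxi rides.
precipitation_mm = [0.0, 2.5, 10.0, 0.0, 5.0, 20.0, 1.0]
daily_rides = [410_000, 425_000, 450_000, 400_000, 430_000, 470_000, 415_000]

r = pearson(precipitation_mm, daily_rides)
print(f"Pearson r = {r:.2f}")
```

Values near 1 or -1 indicate a strong linear relationship, and values near 0 indicate little linear relationship; correlation alone does not show that weather causes the change in rides.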

Source code
