Tirtha's projects and code

Selected projects and code repos

Some of my selected open-source projects and code repos are listed here. Clicking a heading takes you straight to the corresponding GitHub repo. All of them carry permissive licenses such as MIT or BSD-2. Please feel free to fork them and leave a star if you like them!

Python-Machine-Learning

Practice and tutorial-style notebooks covering a wide variety of machine learning techniques.

Regression

Classification

Clustering

Synthetic data generation for machine learning

Learning and complexity curve generation

How to mix object-oriented programming into machine learning

Deployment of ML models using microservice web framework

Here is the detailed documentation.

Deep-learning-with-Python

A collection of Deep Learning (DL) code examples, tutorial-style Jupyter notebooks, and projects. Quite a few of the Jupyter notebooks are built on Google Colab and may use functions exclusive to Google Colab (for example, uploading data or pulling data directly from a remote repo using standard Linux commands).

Deep learning vs. linear model

Demo of a general-purpose regression module

Simple Conv Net

Using Keras `ImageDataGenerator` and other utilities

Transfer learning

Activation maps

Keras Callbacks using ResNet

Simple RNN

Text generation using LSTM

Bi-directional LSTM for sentiment classification

Generative adversarial network (GAN) using simple 1-D algebraic function

Scikit-learn wrapper for Keras

Here is the detailed documentation.

Pydbgen

This is a lightweight Python library for generating random database tables. It is useful for data science beginners who want to create SQL database tables with synthetic data for practicing machine learning and data extraction algorithms. It can generate Pandas DataFrames, MySQL and SQLite tables, and Excel files with random but contextual data such as name, address, city, zip code, telephone number, birthday, license plate, organization, job title, etc.

pip install pydbgen

Read the docs here.
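To illustrate the idea of a random but contextual table, here is a minimal standard-library sketch that fills an in-memory SQLite table with fake records. It does not use pydbgen's own API; the helper names (`random_row`, `build_table`), the `people` table, and the tiny sample pools are all hypothetical and chosen only for this example.

```python
import random
import sqlite3

# Hypothetical sample pools; a real generator would draw from much larger lists.
FIRST_NAMES = ["Ava", "Liam", "Noah", "Mia", "Zoe"]
LAST_NAMES = ["Smith", "Patel", "Garcia", "Chen", "Khan"]
CITIES = ["Austin", "Boston", "Denver", "Seattle"]


def random_row(rng):
    """Build one fake record: (name, city, zip code, phone)."""
    name = f"{rng.choice(FIRST_NAMES)} {rng.choice(LAST_NAMES)}"
    city = rng.choice(CITIES)
    zip_code = f"{rng.randint(0, 99999):05d}"
    phone = (
        f"{rng.randint(200, 999)}-{rng.randint(200, 999)}-{rng.randint(0, 9999):04d}"
    )
    return name, city, zip_code, phone


def build_table(conn, n_rows, seed=0):
    """Create a 'people' table and fill it with n_rows random records."""
    rng = random.Random(seed)  # seeded so the table is reproducible
    conn.execute("CREATE TABLE people (name TEXT, city TEXT, zip TEXT, phone TEXT)")
    conn.executemany(
        "INSERT INTO people VALUES (?, ?, ?, ?)",
        [random_row(rng) for _ in range(n_rows)],
    )
    conn.commit()


conn = sqlite3.connect(":memory:")  # in-memory database, nothing is written to disk
build_table(conn, n_rows=10)
row_count = conn.execute("SELECT COUNT(*) FROM people").fetchone()[0]
sample_rows = conn.execute("SELECT * FROM people LIMIT 3").fetchall()
```

Once the table exists, ordinary SQL queries (filters, joins, aggregates) can be practiced against it like any other database.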

'MLR' - Linear Regression Library with Statistical Modeling

This is a lightweight, easy-to-use Python package that combines a simple scikit-learn-like API with the statistical inference tests, visual residual analysis, outlier visualization, and multicollinearity tests found in packages like statsmodels and the R language.

pip install mlr

Read the docs here.
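The combination described above, a `fit`/`predict` interface plus statistical diagnostics, can be sketched in a few lines of plain Python. This is not the `mlr` package's API: the class `SimpleLinearModel` and its attribute names are hypothetical, and the sketch handles only a single feature.

```python
import math


class SimpleLinearModel:
    """One-feature ordinary least squares with a fit/predict interface.

    Hypothetical illustration only; not the API of the mlr package.
    """

    def fit(self, x, y):
        n = len(x)
        mean_x = sum(x) / n
        mean_y = sum(y) / n
        sxx = sum((xi - mean_x) ** 2 for xi in x)
        sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

        # Closed-form least-squares estimates
        self.slope_ = sxy / sxx
        self.intercept_ = mean_y - self.slope_ * mean_x

        # Diagnostics: residuals, R^2, and the standard error of the slope
        self.residuals_ = [yi - pi for yi, pi in zip(y, self.predict(x))]
        ss_res = sum(r ** 2 for r in self.residuals_)
        ss_tot = sum((yi - mean_y) ** 2 for yi in y)
        self.r_squared_ = 1.0 - ss_res / ss_tot
        self.slope_stderr_ = math.sqrt(ss_res / (n - 2) / sxx)  # needs n > 2
        return self

    def predict(self, x):
        return [self.intercept_ + self.slope_ * xi for xi in x]


x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
model = SimpleLinearModel().fit(x, y)
```

After fitting, `model.residuals_` can be plotted against `x` for a visual residual check, and `model.slope_ / model.slope_stderr_` gives the t-statistic for the slope.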

UCI-ML API

This is a simple and intuitive API written in Python to interface with the famous UC Irvine Machine Learning repository. It helps a user search for and download relevant datasets, or select a dataset by its size or machine learning task category (regression, classification, clustering, etc.).

Here is the detailed documentation.

Design-of-Experiment-Python

Design of Experiment (DOE) is an important activity for any scientist, engineer, or statistician planning to conduct experimental analysis, especially now that data science and the associated statistical modeling and machine learning are expanding so rapidly. This repo is a collection of functions that wrap the core packages (pyDOE and DiversiPy), generate DOE matrices from an arbitrary range of input variables, and save them to the local disk as CSV or Excel files. It covers factorial designs, response-surface methods (RSM), and Latin Hypercube sampling.

Read the detailed documentation here.
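As a rough illustration of the simplest case, a full factorial design, the sketch below builds the design matrix with `itertools.product` and serializes it as CSV text. It does not call pyDOE or DiversiPy; the function names and the example factors (`Pressure`, `Temperature`, `Flow`) are assumptions made for this example.

```python
import csv
import io
import itertools


def full_factorial(factors):
    """Return every combination of factor levels as a list of dicts.

    `factors` maps a factor name to the list of levels to test.
    """
    names = list(factors)
    level_lists = [factors[name] for name in names]
    return [dict(zip(names, combo)) for combo in itertools.product(*level_lists)]


def to_csv(design):
    """Serialize the design matrix as CSV text, one experimental run per row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(design[0]))
    writer.writeheader()
    writer.writerows(design)
    return buffer.getvalue()


# Hypothetical factors: 3 x 3 x 2 levels give 18 runs.
factors = {
    "Pressure": [40, 50, 70],
    "Temperature": [290, 320, 350],
    "Flow": [0.2, 0.4],
}
design = full_factorial(factors)
csv_text = to_csv(design)
```

The number of runs is the product of the level counts, which is why fractional or space-filling designs such as Latin Hypercube sampling become attractive as factors are added.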

doepy

This is the formal release of the above-mentioned design-of-experiment project on PyPI for easy installation.

pip install doepy

Read the docs here.

Synthetic data generation for machine learning

Various methods for generating synthetic data for data science and ML.

Scikit-learn data generation (regression/classification/clustering) methods

Random regression and classification problem generation from symbolic expressions

Synthesizing time series

Generating Gaussian mixture model data
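As one small example of the last item, here is a minimal standard-library sketch that samples one-dimensional data from a Gaussian mixture. The function name `sample_gaussian_mixture` and the two-component parameters are hypothetical; the notebooks themselves may use different tools and settings.

```python
import random


def sample_gaussian_mixture(n_samples, weights, means, std_devs, seed=0):
    """Draw 1-D samples from a Gaussian mixture.

    Each sample first picks a component according to `weights`, then draws
    from that component's normal distribution. Returns (values, labels).
    """
    rng = random.Random(seed)  # seeded for reproducibility
    labels = rng.choices(range(len(weights)), weights=weights, k=n_samples)
    values = [rng.gauss(means[c], std_devs[c]) for c in labels]
    return values, labels


# Hypothetical two-component mixture with well-separated means.
values, labels = sample_gaussian_mixture(
    1000, weights=[0.3, 0.7], means=[-5.0, 5.0], std_devs=[1.0, 1.0]
)
```

Because the component labels are returned alongside the values, the same data can serve as a ground truth for testing clustering algorithms.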

Statistics and mathematical computing with Python

General statistics, mathematical programming, and numerical/scientific computing scripts and notebooks in Python.

Set algebra basics

Permutations and combinations

Discrete probability distributions

How to do linear regression in 8 ways

R-style statistical functions using Python

Statistical diagnostics on a linear regression model

Recognizing the nature of a statistical distribution from its histogram using deep learning

Read the description here.
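A flavor of the combinatorics and discrete-probability topics listed above, sketched with the standard library only. The helper `binomial_pmf` and the example values are assumptions for illustration and are not taken from the notebooks.

```python
import itertools
import math


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


letters = "ABCD"

# Permutations: ordered selections of 2 letters out of 4
ordered_pairs = list(itertools.permutations(letters, 2))

# Combinations: unordered selections of 2 letters out of 4
unordered_pairs = list(itertools.combinations(letters, 2))

# Full binomial distribution for 10 fair coin flips (k = 0..10)
pmf = [binomial_pmf(k, 10, 0.5) for k in range(11)]
```

The list lengths match the counting formulas `math.perm(4, 2)` and `math.comb(4, 2)`, and the probabilities in `pmf` sum to 1 up to floating-point rounding.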

Apache Spark with Python

Notebooks on Apache Spark fundamentals using PySpark (RDD and DataFrame), and machine learning with Spark (MLlib).

Read the description here.