PyCaret: An open-source, low-code machine learning library in Python

PyCaret is an open-source, low-code machine learning library in Python that aims to shorten the hypothesis-to-insight cycle in an ML experiment. It enables data scientists to run end-to-end experiments quickly and efficiently, so you spend less time coding and more time on analysis. Compared with other open-source machine learning libraries, PyCaret is a low-code alternative that can perform complex machine learning tasks in only a few lines of code, which keeps it simple to use.
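As a rough illustration of the "few lines of code" claim, the sketch below wraps a typical classification experiment in one function. It assumes PyCaret 3.x and its `pycaret.classification` module; the function name `run_quick_experiment` and the seed value are illustrative choices, not part of PyCaret.

```python
def run_quick_experiment(data, target, seed=123):
    """Train, compare, and score models on a pandas DataFrame.

    `data` is a DataFrame and `target` is the name of the label column.
    This helper is a hypothetical wrapper, not a PyCaret function.
    """
    # Imported lazily so this module still loads where PyCaret is absent.
    from pycaret.classification import setup, compare_models, predict_model

    # Initialise the experiment: splits the data and builds preprocessing.
    # session_id fixes the random seed so runs are repeatable.
    setup(data=data, target=target, session_id=seed, verbose=False)

    # Cross-validate the available estimators and return the top-ranked one.
    best_model = compare_models()

    # With no data argument, predictions are made on the hold-out split.
    holdout_predictions = predict_model(best_model)
    return best_model, holdout_predictions
```

One possible call, assuming the sample dataset loader in `pycaret.datasets` is reachable, is `run_quick_experiment(get_data("juice"), target="Purchase")`.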

Link: https://pycaret.org/
YouTube Link: https://www.youtube.com/channel/UCxA1YTYJ9BEeo50lxyI_B3g


RECIPE

RECIPE (REsilient ClassifIcation Pipeline Evolution) is an AutoML framework that uses grammar-based genetic programming to build customized classification pipelines. The framework is flexible enough to accept different grammars and can be easily extended to other machine learning tasks. It overcomes drawbacks of earlier evolutionary frameworks, such as generating invalid individuals, and organizes a large number of suitable data pre-processing and classification methods into a grammar.
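To make the grammar idea concrete, here is a minimal sketch of how deriving pipelines from a grammar avoids invalid individuals: every pipeline is produced by expanding grammar rules, so each one ends in exactly one classifier by construction. The toy grammar, the step names, and the helper functions are all hypothetical and are not taken from RECIPE itself.

```python
import random

# Hypothetical toy grammar. Keys are non-terminals; each value lists the
# alternative productions. Anything that is not a key is a terminal step.
GRAMMAR = {
    "<pipeline>": [["<preprocess>", "<classifier>"], ["<classifier>"]],
    "<preprocess>": [["<imputer>"], ["<scaler>"], ["<imputer>", "<scaler>"]],
    "<imputer>": [["mean_imputer"], ["median_imputer"]],
    "<scaler>": [["standard_scaler"], ["minmax_scaler"]],
    "<classifier>": [["decision_tree"], ["naive_bayes"], ["logistic_regression"]],
}

CLASSIFIERS = {production[0] for production in GRAMMAR["<classifier>"]}


def derive(symbol, rng):
    """Expand a symbol into a flat list of terminal pipeline steps."""
    if symbol not in GRAMMAR:
        return [symbol]  # terminal: an actual pipeline step
    production = rng.choice(GRAMMAR[symbol])
    steps = []
    for part in production:
        steps.extend(derive(part, rng))
    return steps


def random_population(size, seed=0):
    """Create an initial population of grammar-derived pipelines."""
    rng = random.Random(seed)
    return [derive("<pipeline>", rng) for _ in range(size)]


def is_valid(pipeline):
    """A pipeline is valid if only its final step is a classifier."""
    return pipeline[-1] in CLASSIFIERS and not any(
        step in CLASSIFIERS for step in pipeline[:-1]
    )


population = random_population(50)
```

A full genetic programming loop would also apply crossover and mutation at the level of grammar rules so that offspring stay valid; that part is omitted here.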

Link: https://laic-ufmg.github.io/Recipe/docs/


Simple, Transparent, End-to-end Automated Machine Learning Pipeline (STREAMLINE)

STREAMLINE is an end-to-end automated machine learning (AutoML) pipeline that empowers anyone to easily run, interpret, and apply a rigorous and customizable analysis for data mining or predictive modeling. Notably, the tool is currently limited to supervised learning on tabular, binary classification data, but it will be expanded as development continues. Development of the pipeline focused on (1) overall automation, (2) avoiding and detecting sources of bias, (3) optimizing modeling performance, (4) ensuring complete reproducibility (under certain STREAMLINE parameter settings), (5) capturing complex associations in data (e.g., feature interactions), and (6) enhancing interpretability of output. Overall, the goal of the pipeline is to provide a transparent framework for learning from data and for identifying the strengths and weaknesses of ML modeling algorithms or other AutoML algorithms.

 


TransmogrifAI

TransmogrifAI is an end-to-end AutoML library for structured data, written in Scala, that runs on top of Apache Spark, an open-source unified analytics engine for large-scale data processing. It was developed with a focus on accelerating machine learning developer productivity through machine learning automation and an API that enforces compile-time type safety, modularity, and reuse.

For automation, TransmogrifAI has numerous Transformers and Estimators that make use of Feature abstractions to automate feature engineering, feature validation, and model selection.

For modularity and reuse, TransmogrifAI enforces a strict separation between ML workflow definitions and data manipulation, ensuring that code written using TransmogrifAI is inherently modular and reusable.

For compile-time type-safety, machine learning workflows built using TransmogrifAI are strongly typed. This means developers get to enjoy the many benefits of compile-time type safety, including code completion during development and fewer runtime errors.

For transparency, model insights leverage stored feature metadata and lineage to help debug models while providing insights to the end user, making machine learning models less of a black box.

Link: https://transmogrif.ai/


Tree-based Pipeline Optimization Tool (TPOT)

Consider TPOT your Data Science Assistant. TPOT is a Python Automated Machine Learning tool that optimizes machine learning pipelines using genetic programming. TPOT will automate the most tedious part of machine learning by intelligently exploring thousands of possible pipelines to find the best one for your data. Once TPOT is finished searching (or you get tired of waiting), it provides you with the Python code for the best pipeline it found so you can tinker with the pipeline from there. TPOT is built on top of scikit-learn, so all of the code it generates should look familiar… if you’re familiar with scikit-learn, anyway.
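The search-then-export workflow described above might look like the sketch below. It assumes the classic TPOT 0.x interface (`TPOTClassifier` with `fit`, `score`, and `export`); newer TPOT releases changed parts of this API, so treat the parameter names as assumptions to check against your installed version. The wrapper name `search_and_export` and the default output path are hypothetical.

```python
def search_and_export(X_train, y_train, X_test, y_test,
                      output_path="best_pipeline.py"):
    """Run a small TPOT search and write the winning pipeline to a file.

    Hypothetical wrapper, assuming the classic TPOT 0.x API.
    """
    # Imported lazily so this module still loads where TPOT is absent.
    from tpot import TPOTClassifier

    # Small budget for illustration; real searches usually use larger values.
    tpot = TPOTClassifier(
        generations=5,        # number of evolutionary iterations
        population_size=20,   # pipelines kept in each generation
        random_state=42,      # fixed seed for repeatable searches
        verbosity=2,          # print progress while searching
    )
    tpot.fit(X_train, y_train)

    # Score the best pipeline found on held-out data.
    test_score = tpot.score(X_test, y_test)

    # Write scikit-learn code for the best pipeline to output_path.
    tpot.export(output_path)
    return test_score
```

The exported file is ordinary scikit-learn code, which is what makes it practical to keep editing the pipeline by hand after the search ends.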

 


Xcessiv

Xcessiv is an open-source, web-based application developed using Python and JavaScript for automating and visualizing model selection, hyperparameter tuning, and feature extraction in machine learning. It provides a user-friendly interface for managing and executing experiments across multiple algorithms and datasets. Xcessiv employs models from the scikit-learn package, supports parallel hyperparameter searches using Bayesian optimization, and makes it easy to manage and compare hundreds of model-hyperparameter combinations, create stacked ensembles, and construct ensembles automatically. It can also export a stacked ensemble as a standalone Python file to support multiple levels of stacking.

Link: https://xcessiv.readthedocs.io/en/stable/
