Aliro: AI-Driven Data Science

Aliro is an easy-to-use data science assistant. It allows researchers without machine learning or coding expertise to run supervised machine learning analyses through a clean web interface. It provides results visualization and reproducible scripts so that an analysis can be taken anywhere. It also has an AI assistant that can choose the analysis to run for you. Dataset profiles are generated and added to a knowledgebase as experiments are run, and the AI assistant learns from this knowledgebase to give more informed recommendations the more it is used. Aliro comes with an initial knowledgebase generated from the PMLB benchmark suite.

 

Auto_ML

Auto_ML is a Python library designed to automate the whole machine learning process. It simplifies model selection, feature engineering, hyperparameter tuning, data formatting, robust scaling, and analytics. It supports binary and multiclass classification, regression, linear-model-esque interpretation of non-linear models, feature learning, and categorical ensembling. The package includes traditional models as well as deep learning, gradient boosting, and CatBoost models.

Link: https://pypi.org/project/auto_ml/

AutoGluon: AutoML for Image, Text, Time Series, and Tabular Data

AutoGluon automates machine learning tasks, enabling you to achieve strong predictive performance in your applications. With just a few lines of code, you can train and deploy high-accuracy machine learning and deep learning models on image, text, time series, and tabular data.

(1) AutoGluon-Tabular is an AutoML framework for tabular data. It succeeds by ensembling multiple models and stacking them in multiple layers.
(2) AutoGluon-MultiModal is a deep learning model zoo of model zoos that can automatically build state-of-the-art deep learning models for inputs including images, text, and tabular data.
(3) AutoGluon-TimeSeries is designed for probabilistic time series forecasting. It combines conventional statistical models, machine-learning-based forecasting approaches, and ensembling techniques.
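
The stacking idea mentioned in (1) can be illustrated with a small plain-Python sketch. This is not AutoGluon's implementation or API: the two toy base models, the one-parameter blend used as the second layer, and all function names are assumptions made up for illustration. It shows a single stacking layer, where a second-level model is fit on out-of-fold predictions of the base models.

```python
from statistics import mean

def fit_mean(xs, ys):
    """Base model 1 (hypothetical): always predict the training mean."""
    m = mean(ys)
    return lambda x: m

def fit_line(xs, ys):
    """Base model 2 (hypothetical): one-feature least-squares line."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    return lambda x: my + slope * (x - mx)

def out_of_fold(fit, xs, ys, k=3):
    """Predict every row with a model that never saw that row during fitting."""
    preds = [0.0] * len(xs)
    for fold in range(k):
        train = [i for i in range(len(xs)) if i % k != fold]
        held = [i for i in range(len(xs)) if i % k == fold]
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        for i in held:
            preds[i] = model(xs[i])
    return preds

def fit_stack(xs, ys, k=3):
    """Layer 1: two base models. Layer 2: a blend weight chosen on out-of-fold predictions."""
    oof_a = out_of_fold(fit_mean, xs, ys, k)
    oof_b = out_of_fold(fit_line, xs, ys, k)

    def mse(w):
        return mean((w * a + (1 - w) * b - y) ** 2 for a, b, y in zip(oof_a, oof_b, ys))

    # Simplified second-layer model: grid search over one blend weight in [0, 1].
    weight = min((i / 10 for i in range(11)), key=mse)
    model_a, model_b = fit_mean(xs, ys), fit_line(xs, ys)  # refit base models on all data
    return (lambda x: weight * model_a(x) + (1 - weight) * model_b(x)), weight

xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0, 11.0]  # toy data lying exactly on y = 2x + 1
stacked, weight = fit_stack(xs, ys)
print(weight, stacked(6.0))
```

Because the blend weight is chosen on predictions for rows the base models did not see, the second layer rewards base models that generalize rather than ones that merely fit the training rows.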

Link: https://auto.gluon.ai/stable/index.html
YouTube Link: https://www.youtube.com/watch?v=5tvp_Ihgnuk

AutoKeras: An AutoML system based on Keras

AutoKeras is developed by DATA Lab at Texas A&M University. Its goal is to make machine learning accessible to everyone. AutoKeras uses building blocks to quickly construct personalized models. With these blocks, users only need to specify the high-level architecture of the model. AutoKeras then searches for the best detailed configuration, or users can override the base classes to create their own blocks.

Link: https://autokeras.com/

Auto-PyTorch: Automated deep learning (AutoDL) for tabular and time series data

Auto-PyTorch is able to jointly and robustly optimize the network architecture and the training hyperparameters to enable fully automated deep learning (AutoDL). Auto-PyTorch is mainly developed to support tabular data (classification, regression) and time series data (forecasting).

Link: https://automl.github.io/Auto-PyTorch/master/

Auto-sklearn: An automated machine learning toolkit and a drop-in replacement for a scikit-learn estimator

Auto-sklearn provides out-of-the-box supervised machine learning. Built around the scikit-learn machine learning library, auto-sklearn automatically searches for the right learning algorithm for a new machine learning dataset and optimizes its hyperparameters. It thus frees machine learning practitioners from these tedious tasks and lets them focus on the real problem.

Link: https://www.automl.org/automl-for-x/tabular-data/auto-sklearn/

Auto-WEKA

Auto-WEKA is a machine learning automation tool written in Java. It performs combined algorithm selection and hyperparameter optimization over the classification and regression algorithms implemented in WEKA, an open-source software package that includes a comprehensive collection of machine learning models. It applies techniques including meta-learning and Bayesian optimization to search for good hyperparameter settings. By automating this process, Auto-WEKA saves the time otherwise spent on manual model selection.
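
Combined algorithm selection and hyperparameter optimization treats the choice of algorithm and the choice of its hyperparameters as one joint search. The following plain-Python sketch illustrates that idea only. It is not Auto-WEKA's code or API: it swaps in random search for Bayesian optimization, and the two toy classifiers, the data, and every name in it are made up for illustration.

```python
import random

# Toy 1-D binary classification data: the label is 1 when x > 5.
train = [(x, int(x > 5)) for x in [0.5, 1.5, 2.5, 3.5, 4.5, 5.5, 6.5, 7.5, 8.5, 9.5]]
valid = [(x, int(x > 5)) for x in [1.0, 3.0, 4.8, 5.2, 7.0, 9.0]]

def threshold_classifier(cut):
    """Algorithm 1 (hypothetical): predict 1 when x exceeds a tunable cut-off."""
    return lambda x: int(x > cut)

def knn_classifier(k):
    """Algorithm 2 (hypothetical): majority vote among the k nearest training points."""
    def predict(x):
        nearest = sorted(train, key=lambda row: abs(row[0] - x))[:k]
        return int(2 * sum(label for _, label in nearest) > k)
    return predict

# Joint search space: each algorithm comes with a sampler for its own hyperparameters.
SEARCH_SPACE = {
    "threshold": lambda rng: {"cut": rng.uniform(0.0, 10.0)},
    "knn": lambda rng: {"k": rng.choice([1, 3, 5, 7])},
}
BUILDERS = {"threshold": threshold_classifier, "knn": knn_classifier}

def accuracy(model):
    """Fraction of held-out validation points classified correctly."""
    return sum(model(x) == y for x, y in valid) / len(valid)

def cash_random_search(n_trials=30, seed=0):
    """Sample (algorithm, hyperparameters) pairs jointly and keep the best one."""
    rng = random.Random(seed)
    best = (-1.0, None, None)
    for _ in range(n_trials):
        algo = rng.choice(sorted(SEARCH_SPACE))   # pick an algorithm
        params = SEARCH_SPACE[algo](rng)          # then its hyperparameters
        score = accuracy(BUILDERS[algo](**params))
        if score > best[0]:
            best = (score, algo, params)
    return best

score, algo, params = cash_random_search()
print(score, algo, params)
```

A Bayesian optimizer would replace the random sampling step with a model that proposes promising configurations based on the scores seen so far; the joint search space stays the same.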

Link: https://www.cs.ubc.ca/labs/algorithms/Projects/autoweka/#

Deep Learning for Toxicology (DTox)

In drug development, a major reason for attrition is the lack of understanding of cellular mechanisms governing drug toxicity. The black-box nature of conventional classification models has limited their utility in identifying toxicity pathways. Here we developed DTox (Deep learning for Toxicology), an interpretation framework for knowledge-guided neural networks, which can predict compound response to toxicity assays and infer toxicity pathways of individual compounds. We demonstrate that DTox can achieve the same level of predictive performance as conventional models with a significant improvement in interpretability. Using DTox, we were able to rediscover mechanisms of transcription activation by three nuclear receptors, recapitulate cellular activities induced by aromatase inhibitors and PXR agonists, and differentiate distinctive mechanisms leading to HepG2 cytotoxicity. Virtual screening by DTox revealed that compounds with predicted cytotoxicity are at higher risk for clinical hepatic phenotypes. In summary, DTox provides a framework for deciphering cellular mechanisms of toxicity in silico.

 

Extended Supervised Tracking and Classifying System (scikit-ExSTraCS)

The scikit-ExSTraCS package includes a sklearn-compatible Python implementation of ExSTraCS 2.0. ExSTraCS 2.0, or Extended Supervised Tracking and Classifying System, implements the core components of a Michigan-style Learning Classifier System (where the system's genetic algorithm operates at the rule level, evolving a population of rules, each with its own parameters) in an easy-to-understand way, while still being highly functional in solving ML problems. It supports the incorporation of expert knowledge in the form of attribute weights, as well as attribute tracking, rule compaction, and a rule specificity limit, which make it particularly adept at solving highly complex problems. In general, Learning Classifier Systems (LCSs) are a class of rule-based machine learning algorithms that have been shown to perform well on problems involving high amounts of heterogeneity and epistasis. Well-designed LCSs are also highly human-interpretable. LCS variants have been shown to adeptly handle supervised and reinforcement learning, classification and regression, and online and offline learning problems, as well as missing or unbalanced data. This versatility and interpretability give LCSs a wide range of potential applications, notably in biomedicine.
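
To give a feel for what a rule population looks like, here is a minimal plain-Python sketch of rule matching and voting in the spirit of a Michigan-style LCS. It is not the scikit-ExSTraCS API and it omits the genetic algorithm entirely: the rules are written by hand, and the `Rule` class, `predict`, and `within_specificity_limit` are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass

@dataclass
class Rule:
    condition: dict   # attribute index -> required value; unspecified attributes are wildcards
    action: int       # class the rule votes for
    fitness: float    # per-rule parameter used to weight its vote

    def matches(self, instance):
        """A rule matches when every attribute it specifies has the required value."""
        return all(instance[i] == v for i, v in self.condition.items())

    @property
    def specificity(self):
        """Number of attributes the rule specifies."""
        return len(self.condition)

def predict(population, instance, default=0):
    """Fitness-weighted vote among the rules whose conditions match the instance."""
    votes = {}
    for rule in population:
        if rule.matches(instance):
            votes[rule.action] = votes.get(rule.action, 0.0) + rule.fitness
    return max(votes, key=votes.get) if votes else default

def within_specificity_limit(rule, limit):
    """A rule specificity limit caps how many attributes a rule may specify."""
    return rule.specificity <= limit

# Hand-written rules for a toy XOR-like problem: attributes 0 and 1 interact
# (a simple case of epistasis), and attribute 2 is noise.
population = [
    Rule({0: 0, 1: 0}, action=0, fitness=0.9),
    Rule({0: 0, 1: 1}, action=1, fitness=0.9),
    Rule({0: 1, 1: 0}, action=1, fitness=0.9),
    Rule({0: 1, 1: 1}, action=0, fitness=0.9),
    Rule({2: 1}, action=1, fitness=0.2),  # weak, over-general rule
]

print(predict(population, (1, 0, 0)))
print(predict(population, (1, 1, 1)))
```

Each rule can be read directly as an if-then statement, which is where the interpretability of this family of methods comes from. In a real LCS the population and the per-rule parameters are learned from data rather than written by hand.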

 
