AutoML – Page 2 – Penn AI Tech

Hyperopt-Sklearn

July 22, 2024 by Elizabeth

Hyperopt-Sklearn (hyperparameter optimization for scikit-learn) is a Python library for hyperparameter-optimization-based model selection among the machine learning algorithms in the scikit-learn package. Its main goal is to automate and ease hyperparameter tuning for machine learning models. It uses Bayesian optimization techniques to reduce the complexity of tuning and speed up the search, so scikit-learn models can be tuned and improved without manual intervention.
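
To make "hyperparameter-optimization-based model selection" concrete, here is a minimal sketch in plain Python. It treats the choice of algorithm as one more hyperparameter, so a single search covers both the model and its settings. Everything in it is an assumption for illustration: the search space, the `toy_loss` function (a stand-in for a cross-validated score), and the use of plain random search in place of the Bayesian search the library performs. It does not call Hyperopt-Sklearn itself.

```python
import math
import random

# Hypothetical conditional search space: the algorithm is itself a
# hyperparameter, and each algorithm brings its own parameter ranges.
SEARCH_SPACE = {
    "knn": {"n_neighbors": lambda rng: rng.randint(1, 30)},
    "svc": {"C": lambda rng: 10 ** rng.uniform(-3, 3)},
    "random_forest": {
        "n_estimators": lambda rng: rng.randint(10, 500),
        "max_depth": lambda rng: rng.randint(2, 20),
    },
}


def sample_config(rng):
    """Draw one (algorithm, hyperparameters) pair from the space."""
    algo = rng.choice(sorted(SEARCH_SPACE))
    params = {name: draw(rng) for name, draw in SEARCH_SPACE[algo].items()}
    return algo, params


def toy_loss(algo, params):
    """Made-up loss standing in for cross-validation on real data."""
    if algo == "knn":
        return 0.20 + abs(params["n_neighbors"] - 7) / 100
    if algo == "svc":
        return 0.15 + abs(math.log10(params["C"])) / 20
    return 0.10 + abs(params["max_depth"] - 8) / 50


def search(n_trials, seed=0):
    """Random search: keep the lowest-loss configuration seen so far."""
    rng = random.Random(seed)
    best = None
    for _ in range(n_trials):
        algo, params = sample_config(rng)
        loss = toy_loss(algo, params)
        if best is None or loss < best[0]:
            best = (loss, algo, params)
    return best


if __name__ == "__main__":
    loss, algo, params = search(200)
    print(algo, params, round(loss, 4))
```

A Bayesian optimizer would replace the blind `sample_config` draw with one informed by the losses of earlier trials; the conditional structure of the space stays the same.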

Link: https://hyperopt.github.io/hyperopt-sklearn/

Software | AutoML | Technology: Tools, Hardware, and Software

LAMA: LightAutoML

July 22, 2024 by Elizabeth

LightAutoML (LAMA) is an open-source Python library for automated machine learning. It is designed to be lightweight and efficient across a range of tasks on tabular and text data. LightAutoML provides easy-to-use pipeline creation that enables automatic hyperparameter tuning and data processing; automatic typing and feature selection; automatic time utilization; automatic report creation; and an easy-to-use modular scheme for creating your own pipelines.

Link: https://lightautoml.readthedocs.io/en/latest/

Software | AutoML | Technology: Tools, Hardware, and Software

Ludwig: A low-code framework for building custom AI models like LLMs and other deep neural networks

July 22, 2024 by Elizabeth

Ludwig is a low-code framework for building custom AI models such as LLMs and other deep neural networks. A declarative YAML configuration file is all you need to train a state-of-the-art LLM on your data, and Ludwig also supports multi-task and multi-modality learning. You can optimize for scale and efficiency with automatic batch size selection, distributed training (DDP, DeepSpeed), parameter-efficient fine-tuning (PEFT), 4-bit quantization (QLoRA), and larger-than-memory datasets. With support for hyperparameter optimization, explainability, and rich metric visualizations, you retain full control of your models down to the activation functions. Ludwig is modular, extensible, and engineered for production (Docker, HuggingFace).
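
As a rough illustration of what "declarative" means here, the sketch below writes a configuration as a Python dictionary with the same shape as the YAML file: you name the input and output columns and their types, and the framework builds the model from that description. The column names, the file name in the closing comment, and the `check_features` helper are hypothetical and not part of Ludwig; the block itself runs without Ludwig installed.

```python
import json

# Hypothetical dataset columns: "review_text" (free text) and
# "sentiment" (class label). Replace with your own column names.
config = {
    "input_features": [
        {"name": "review_text", "type": "text"},
    ],
    "output_features": [
        {"name": "sentiment", "type": "category"},
    ],
    "trainer": {"epochs": 3},
}


def check_features(cfg):
    """Illustrative sanity check (not part of Ludwig): every feature
    must declare both a name and a type."""
    for section in ("input_features", "output_features"):
        for feature in cfg[section]:
            if not set(feature) >= {"name", "type"}:
                raise ValueError(f"{section} entry missing name/type: {feature}")
    return True


check_features(config)
print(json.dumps(config, indent=2))

# With Ludwig installed, a config of this shape would typically be used as:
#   from ludwig.api import LudwigModel
#   model = LudwigModel(config)
#   model.train(dataset="reviews.csv")  # hypothetical file name
```

Note that the configuration describes what the data looks like rather than how to build the network, which is what keeps the approach low-code.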

Link: https://ludwig.ai/latest/

Software | AutoML | Technology: Tools, Hardware, and Software

ML-Plan

July 22, 2024 by Elizabeth

ML-Plan is a free, Java-based software library for AutoML that optimizes machine learning pipelines in WEKA or scikit-learn. It is part of AILibs, a modular collection of Java libraries related to automated decision making.

Link: https://starlibs.github.io/AILibs/projects/mlplan/

Software | AutoML | Technology: Tools, Hardware, and Software

MLBox

July 22, 2024 by Elizabeth

MLBox is a powerful AutoML Python library that provides fast reading and distributed data preprocessing/cleaning/formatting, highly robust feature selection and leak detection, accurate hyperparameter optimization in high-dimensional space, state-of-the-art predictive models for classification and regression (Deep Learning, Stacking, LightGBM, etc.), and prediction with model interpretation.

Link: https://mlbox.readthedocs.io/en/latest/

Software | AutoML | Technology: Tools, Hardware, and Software

MLJAR-supervised: Automated Machine Learning Python package that works with tabular data

July 8, 2024 by Elizabeth

MLJAR-supervised is an Automated Machine Learning Python package that works with tabular data. It is designed to save time for data scientists. It abstracts the common way to preprocess data, construct machine learning models, and perform hyperparameter tuning to find the best model. It is not a black box: you can see exactly how the ML pipeline is constructed (with a detailed Markdown report for each ML model). MLJAR-supervised will help you with:
(1) explaining and understanding your data,
(2) trying many different machine learning models,
(3) creating Markdown reports from analysis with details about all models,
(4) saving, re-running and loading the analysis and ML models.

Link: https://supervised.mljar.com/

AutoML | Software | Technology: Tools, Hardware, and Software

MLme: Machine Learning Made Easy

July 22, 2024 by Elizabeth

MLme fulfills the diverse requirements of researchers while eliminating the need for extensive coding efforts by integrating four essential functionalities, namely data exploration, AutoML, CustomML, and visualization. MLme serves as a valuable resource that empowers researchers of all technical levels to leverage ML for insightful data analysis and enhance research outcomes. By simplifying and automating various stages of the ML workflow, it enables researchers to allocate more time to their core research tasks, thereby enhancing efficiency and productivity.

doi: 10.1101/2023.07.04.546825

AutoML | Software | Technology: Tools, Hardware, and Software

PyCaret: An open-source, low-code machine learning library in Python

July 8, 2024 by Elizabeth

PyCaret is an open-source, low-code machine learning library in Python that aims to reduce the hypothesis-to-insight cycle time in an ML experiment. It enables data scientists to perform end-to-end experiments quickly and efficiently, spending less time coding and more time on analysis. Compared with other open-source machine learning libraries, PyCaret is a low-code alternative that can perform complex machine learning tasks with only a few lines of code.

Link: https://pycaret.org/
YouTube Link: https://www.youtube.com/channel/UCxA1YTYJ9BEeo50lxyI_B3g

Software | AutoML | Technology: Tools, Hardware, and Software

RECIPE

July 22, 2024 by Elizabeth

RECIPE (REsilient ClassifIcation Pipeline Evolution) is an AutoML framework that uses grammar-based genetic programming to build customized classification pipelines. The framework is flexible enough to accept different grammars and can be easily extended to other machine learning tasks. It overcomes drawbacks of previous evolutionary frameworks, such as generating invalid individuals, and organizes a large number of suitable data pre-processing and classification methods into a grammar.
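
The point about invalid individuals can be illustrated with a small sketch: if candidate pipelines are only ever produced by expanding grammar rules, every candidate is well-formed by construction. The toy grammar, the method names, and the validity rule below are hypothetical and far smaller than anything RECIPE uses, and the sketch covers only random generation of an initial population, not the evolutionary loop.

```python
import random

# Hypothetical toy grammar: keys are non-terminals, each value lists the
# alternative productions for that non-terminal.
GRAMMAR = {
    "<pipeline>": [["<preprocess>", "<classifier>"], ["<classifier>"]],
    "<preprocess>": [["<scaler>"], ["<imputer>", "<scaler>"]],
    "<imputer>": [["mean_imputer"], ["median_imputer"]],
    "<scaler>": [["standard_scaler"], ["minmax_scaler"]],
    "<classifier>": [["decision_tree"], ["naive_bayes"], ["knn"]],
}
CLASSIFIERS = {"decision_tree", "naive_bayes", "knn"}


def derive(symbol, rng):
    """Expand a symbol into a list of terminals by random rule choice."""
    if symbol not in GRAMMAR:
        return [symbol]  # terminal: a concrete pipeline step
    production = rng.choice(GRAMMAR[symbol])
    steps = []
    for part in production:
        steps.extend(derive(part, rng))
    return steps


def random_population(size, seed=0):
    """Generate `size` pipelines, each derived from the start symbol."""
    rng = random.Random(seed)
    return [derive("<pipeline>", rng) for _ in range(size)]


def is_valid(pipeline):
    """Valid here means: exactly one classifier, and it comes last."""
    return (pipeline[-1] in CLASSIFIERS
            and sum(step in CLASSIFIERS for step in pipeline) == 1)


if __name__ == "__main__":
    for pipeline in random_population(5):
        print(" -> ".join(pipeline), "| valid:", is_valid(pipeline))
```

Swapping in a different `GRAMMAR` dictionary changes the space of pipelines without touching the derivation code, which mirrors the flexibility described above.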

Link: https://laic-ufmg.github.io/Recipe/docs/

Software | AutoML | Technology: Tools, Hardware, and Software

TransmogrifAI

July 22, 2024 by Elizabeth

TransmogrifAI is an end-to-end AutoML library for structured data, written in Scala, that runs on top of Apache Spark, an open-source unified analytics engine for large-scale data processing. It was developed with a focus on accelerating machine learning developer productivity through machine learning automation and an API that enforces compile-time type safety, modularity, and reuse.

For automation, TransmogrifAI has numerous Transformers and Estimators that make use of Feature abstractions to automate feature engineering, feature validation, and model selection.

For modularity and reuse, TransmogrifAI enforces a strict separation between ML workflow definitions and data manipulation, ensuring that code written using TransmogrifAI is inherently modular and reusable.

For compile-time type-safety, machine learning workflows built using TransmogrifAI are strongly typed. This means developers get to enjoy the many benefits of compile-time type safety, including code completion during development and fewer runtime errors.

For transparency, model insights leverage stored feature metadata and lineage to help debug models while providing insights to the end user, making machine learning models less of a black box.
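
The automation point above, where the type of a feature determines how it is engineered, can be sketched loosely as follows. This is a Python analogy, not TransmogrifAI's Scala API: the `Real` and `PickList` classes and the `vectorize` function are simplified stand-ins written for this example, and Python checks the types at run time rather than at compile time.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Real:
    """A numeric feature value that may be missing."""
    value: Optional[float]


@dataclass
class PickList:
    """A categorical feature value that may be missing."""
    value: Optional[str]


def vectorize(feature, categories: List[str]) -> List[float]:
    """Choose a default encoding from the feature's type alone."""
    if isinstance(feature, Real):
        # Numeric: fill missing values with 0.0 and add a null indicator.
        if feature.value is None:
            return [0.0, 1.0]
        return [float(feature.value), 0.0]
    if isinstance(feature, PickList):
        # Categorical: one-hot encode against the known categories;
        # missing or unseen values encode as all zeros.
        return [1.0 if feature.value == c else 0.0 for c in categories]
    raise TypeError(f"unsupported feature type: {type(feature).__name__}")


if __name__ == "__main__":
    print(vectorize(Real(3.5), []))
    print(vectorize(Real(None), []))
    print(vectorize(PickList("b"), ["a", "b", "c"]))
```

In a statically typed language, passing an unsupported feature type would be rejected before the program runs, which is the benefit the type-safety paragraph describes.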

Link: https://transmogrif.ai/
