Software – Page 2 – Penn AI Tech

FEDOT

July 22, 2024 by Elizabeth

FEDOT is an open-source framework for automated modeling and machine learning (AutoML) problems. This framework is distributed under the 3-Clause BSD license. It provides automatic generative design of machine learning pipelines for various real-world problems. The core of FEDOT is based on an evolutionary approach and supports classification (binary and multiclass), regression, clustering, and time series prediction problems.

Link: https://fedot.readthedocs.io/en/latest/

Software | AutoML | Technology: Tools, Hardware, and Software

FLAML: A Fast Library for Automated Machine Learning & Tuning

July 22, 2024 by Elizabeth

FLAML is a lightweight Python library for efficient automation of machine learning and AI operations. It automates workflows based on large language models and machine learning models, and it optimizes their performance.

Link: https://microsoft.github.io/FLAML/

Software | AutoML | Technology: Tools, Hardware, and Software

GAMA: (General Automated Machine learning Assistant) An automated machine learning tool based on genetic programming.

July 22, 2024 by Elizabeth

GAMA is an AutoML package for end-users and AutoML researchers. It generates optimized machine learning pipelines given specific input data and resource constraints. A machine learning pipeline contains data preprocessing (e.g., PCA, normalization) as well as a machine learning algorithm (e.g., Logistic Regression, Random Forests), with fine-tuned hyperparameter settings (e.g., number of trees in a Random Forest). To find these pipelines, multiple search procedures have been implemented. GAMA can also combine multiple tuned machine learning pipelines into an ensemble, which on average should help model performance. At the moment, GAMA is restricted to classification and regression problems on tabular data. In addition to its general-use AutoML functionality, GAMA aims to serve AutoML researchers as well. During the optimization process, GAMA keeps an extensive log of the progress made. This log gives insight into the behavior of the search procedure.

Link: https://openml-labs.github.io/gama/master/
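
The pipeline structure described above (a preprocessing step, an algorithm, and hyperparameter settings) can be illustrated with a small sketch. This is not GAMA's API or one of its search procedures: it uses plain random search as a stand-in, and the search space, `sample_pipeline`, and `random_search` are hypothetical names chosen for illustration. The `evaluate` callable is assumed to return a score where higher is better, for example a cross-validated accuracy.

```python
import random

# Hypothetical search space mirroring the examples in the description above.
PREPROCESSING = ["none", "normalization", "pca"]
ALGORITHMS = {
    "logistic_regression": {"C": [0.01, 0.1, 1.0, 10.0]},
    "random_forest": {"n_trees": [10, 50, 100, 500]},
}


def sample_pipeline(rng):
    """Draw one pipeline: a preprocessing step, an algorithm, and its hyperparameters."""
    algorithm = rng.choice(sorted(ALGORITHMS))
    params = {name: rng.choice(values) for name, values in ALGORITHMS[algorithm].items()}
    return {
        "preprocessing": rng.choice(PREPROCESSING),
        "algorithm": algorithm,
        "params": params,
    }


def random_search(evaluate, n_iter=20, seed=0):
    """Evaluate n_iter random pipelines and return the best one with its score."""
    rng = random.Random(seed)
    best, best_score = None, float("-inf")
    for _ in range(n_iter):
        candidate = sample_pipeline(rng)
        score = evaluate(candidate)  # user-supplied; higher is assumed to be better
        if score > best_score:
            best, best_score = candidate, score
    return best, best_score
```

In practice `evaluate` would train and validate the candidate pipeline on the input data, and the iteration count would be replaced by a time or resource budget.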

Software | Data simulation | Technology: Tools, Hardware, and Software | All Resources

Genetic Architecture Model Emulator for Testing and Evaluating Software (GAMETES)

March 5, 2023 by Ray

GAMETES is an algorithm for the generation of complex single nucleotide polymorphism (SNP) models for simulated association studies. GAMETES is designed to generate epistatic models which we refer to as pure and strict. These models constitute the worst case in terms of detecting disease associations, since such associations may only be observed if all n loci are included in the disease model. The user-friendly GAMETES software rapidly and precisely generates epistatic multi-locus models and, using these models, can also generate simulated datasets exhibiting epistasis. Version 2.2 adds the ability to generate heterogeneous datasets by applying multiple independent models to different subsets of the simulated data. Additional features include the ability to create additive datasets by applying multiple independent models to the entire dataset, as well as functionality for the design of continuous endpoints. We have also added a custom model generation feature, so that users may directly specify and examine the properties of any 2- or 3-locus SNP model. Simple Mendelian models may also be generated with this feature.
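
To make the "worst case" concrete, the sketch below checks a hand-written 2-locus penetrance table for main effects. It is an illustration only, not GAMETES output or its algorithm: the table values, the minor allele frequency of 0.5 at both loci, and the function names are assumptions chosen to keep the arithmetic simple. For two loci, having no single-locus effect is what makes the model purely epistatic.

```python
import math

# Assumed genotype frequencies (AA, Aa, aa) under Hardy-Weinberg equilibrium
# with minor allele frequency 0.5; the same frequencies are used for locus B.
GENOTYPE_FREQS = [0.25, 0.5, 0.25]

# PENETRANCE[i][j] = P(disease | genotype i at locus A, genotype j at locus B).
# Hand-written illustrative values, not produced by GAMETES.
PENETRANCE = [
    [0.0, 0.1, 0.0],
    [0.1, 0.0, 0.1],
    [0.0, 0.1, 0.0],
]


def marginal_penetrances(table, freqs):
    """Disease risk per genotype at each locus, averaged over the other locus."""
    locus_a = [sum(table[i][j] * freqs[j] for j in range(3)) for i in range(3)]
    locus_b = [sum(table[i][j] * freqs[i] for i in range(3)) for j in range(3)]
    return locus_a, locus_b


def prevalence(table, freqs):
    """Overall disease risk in the population."""
    return sum(table[i][j] * freqs[i] * freqs[j] for i in range(3) for j in range(3))


def has_no_main_effects(table, freqs):
    """True if every single-locus marginal penetrance equals the prevalence."""
    k = prevalence(table, freqs)
    locus_a, locus_b = marginal_penetrances(table, freqs)
    return all(math.isclose(m, k) for m in locus_a + locus_b)
```

For this table every marginal penetrance equals the prevalence of 0.05, so neither locus alone carries any signal, while the joint table ranges from 0 to 0.1. The association only appears when both loci are modeled together.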

Software | AutoML | Technology: Tools, Hardware, and Software

H2O AutoML

July 22, 2024 by Elizabeth

H2O is an in-memory platform for distributed, scalable machine learning. H2O uses familiar interfaces like R, Python, Scala, Java, JSON and the Flow notebook/web interface, and works seamlessly with big data technologies like Hadoop and Spark. H2O provides implementations of many popular algorithms such as Generalized Linear Models (GLM), Gradient Boosting Machines (including XGBoost), Random Forests, Deep Neural Networks, Stacked Ensembles, Naive Bayes, Generalized Additive Models (GAM), Cox Proportional Hazards, K-Means, PCA, Word2Vec, as well as a fully automatic machine learning algorithm (H2O AutoML).

Link: https://docs.h2o.ai/h2o/latest-stable/h2o-docs/automl.html

Software | AutoML | Technology: Tools, Hardware, and Software

Hyperopt-Sklearn

July 22, 2024 by Elizabeth

Hyperopt-Sklearn (Hyperparameter optimization for Sklearn) is a Python library for hyperparameter-optimization-based model selection among machine learning algorithms in the Scikit-learn package. The main goal of Hyperopt-Sklearn is to automate and ease the process of hyperparameter tuning for machine learning models. It utilizes Bayesian optimization techniques to decrease the complexity of hyperparameter tuning and speed up the optimization process. It is a valuable tool for tuning hyperparameters and improving performance of Scikit-learn models without manual intervention.

Link: https://hyperopt.github.io/hyperopt-sklearn/

Software | AutoML | Technology: Tools, Hardware, and Software

LAMA: LightAutoML

July 22, 2024 by Elizabeth

LightAutoML is an open-source Python library aimed at automated machine learning. It is designed to be lightweight and efficient for various tasks on tabular and text data. LightAutoML provides easy-to-use pipeline creation that enables automatic hyperparameter tuning and data processing; automatic typing and feature selection; automatic time utilization; automatic report creation; and an easy-to-use modular scheme for creating your own pipelines.

Link: https://lightautoml.readthedocs.io/en/latest/

Software | AutoML | Technology: Tools, Hardware, and Software

Ludwig: A low-code framework for building custom AI models like LLMs and other deep neural networks

July 22, 2024 by Elizabeth

Ludwig is a low-code framework for building custom AI models like LLMs and other deep neural networks. Ludwig allows you to build custom models with ease. A declarative YAML configuration file is all you need to train a state-of-the-art LLM on your data, and Ludwig supports multi-task and multi-modality learning. You can also optimize for scale and efficiency, since it provides automatic batch size selection, distributed training (DDP, DeepSpeed), parameter-efficient fine-tuning (PEFT), 4-bit quantization (QLoRA), and support for larger-than-memory datasets. With support for hyperparameter optimization, explainability, and rich metric visualizations, you retain full control of your models down to the activation functions. It is modular and extensible and is engineered for production (Docker, HuggingFace).

Link: https://ludwig.ai/latest/

Software | AutoML | Technology: Tools, Hardware, and Software

ML-Plan

July 22, 2024 by Elizabeth

ML-Plan is a free, Java-based software library for AutoML that provides a tool to optimize machine learning pipelines in WEKA or Sklearn. It is one of the components of AILibs, a modular collection of Java libraries related to automated decision making.

Link: https://starlibs.github.io/AILibs/projects/mlplan/

Software | AutoML | Technology: Tools, Hardware, and Software

MLBox

July 22, 2024 by Elizabeth

MLBox is a powerful AutoML Python library that provides fast reading and distributed data preprocessing/cleaning/formatting, highly robust feature selection and leak detection, accurate hyperparameter optimization in high-dimensional space, state-of-the-art predictive models for classification and regression (Deep Learning, Stacking, LightGBM, etc.), and prediction with model interpretation.

Link: https://mlbox.readthedocs.io/en/latest/
