Accelerate generative AI development with Amazon SageMaker AI and MLflow

Efficiently manage the machine learning and generative AI lifecycle at scale using MLflow 3.0

Why use Amazon SageMaker with MLflow?

Amazon SageMaker offers a managed MLflow capability for machine learning (ML) and generative AI experimentation. This capability makes it easy for data scientists to use MLflow on SageMaker for model training, registration, and deployment. Admins can quickly set up secure and scalable MLflow environments on AWS. Data scientists and ML developers can efficiently track ML experiments and find the right model for a business problem.

Benefits of Amazon SageMaker AI with MLflow 3.0

Data scientists can use MLflow to track all the metrics generated while fine-tuning a foundation model, evaluate the model, test it with sample data, compare the outputs of each model side by side in the MLflow UI, and register the right model for their use case. Once they register the model, ML engineers can deploy it to SageMaker inference.
You do not need to manage any of the infrastructure required to host MLflow. Data scientists can use all of the MLflow open source capabilities without admins worrying about infrastructure overhead, saving time and cost when setting up data science environments. MLflow is integrated with AWS Identity and Access Management (IAM), allowing you to set up role-based access control (RBAC) for MLflow Tracking Servers.
Models registered in MLflow will automatically be registered to the Amazon SageMaker Model Registry with an associated Amazon SageMaker Model Card. This enables data scientists to transition their models to ML engineers for production deployment without switching context. ML Engineers can deploy models from MLflow to SageMaker endpoints without building custom containers or repackaging the MLflow model artifacts.
As the MLflow project evolves, SageMaker AI customers will benefit from the open-source innovation from the MLflow community while enjoying the infrastructure management provided by AWS.
Tracing capabilities in fully managed MLflow 3.0 enable customers to record the inputs, outputs, and metadata at every step of gen AI development to help teams quickly identify the source of bugs or unexpected behaviors. By maintaining records of each model and application version, fully managed MLflow 3.0 offers traceability to connect AI responses to their source components, allowing developers to quickly trace an issue directly to the specific code, data, or parameters that generated it.

Track experiments from anywhere

ML experiments are performed in diverse environments, including local notebooks, IDEs, cloud-based training code, or managed IDEs in Amazon SageMaker Studio. With SageMaker AI and MLflow, you can use your preferred environment to train models, track your experiments in MLflow, and launch the MLflow UI directly or through SageMaker Studio for analysis.
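As a configuration sketch (the ARN below is a placeholder for your own tracking server's ARN, and the snippet assumes the `sagemaker-mlflow` plugin is installed along with valid AWS credentials), pointing any environment at the managed server is a one-line change:

```python
import mlflow

# Placeholder ARN: replace with your MLflow tracking server's ARN.
# Assumes the sagemaker-mlflow plugin (pip install sagemaker-mlflow)
# and AWS credentials with access to the server.
mlflow.set_tracking_uri(
    "arn:aws:sagemaker:us-east-1:123456789012:mlflow-tracking-server/my-server"
)

# From here on, runs started locally, in an IDE, or in SageMaker Studio
# are all recorded on the same managed tracking server.
with mlflow.start_run():
    mlflow.log_metric("accuracy", 0.91)
```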

Accelerate generative AI development with MLflow 3.0

Building foundation models is an iterative process involving hundreds of training iterations to find the best algorithm, architecture, and parameters for optimal model accuracy. Fully managed MLflow 3.0 enables you to track gen AI experiments, evaluate model performance, and gain deeper insights into the behavior of models and AI applications from experimentation to production. With a single interface, you can visualize in-progress training jobs, collaborate with colleagues during experimentation, and maintain version control for each model and application. MLflow 3.0 also offers advanced tracing capabilities that record the inputs, outputs, and metadata at every step of gen AI development, enabling you to quickly identify the source of bugs or unexpected behaviors.

Evaluate experiments

Identifying the best model from multiple iterations requires analysis and comparison of model performance. MLflow offers visualizations such as scatter plots, bar charts, and histograms to compare training iterations. Additionally, MLflow enables the evaluation of models for bias and fairness.

Centrally manage MLflow models

Multiple teams often use MLflow to manage their experiments, with only some models becoming candidates for production. Organizations need an easy way to keep track of all candidate models to make informed decisions about which models proceed to production. MLflow integrates seamlessly with SageMaker Model Registry, allowing organizations to see their models registered in MLflow automatically appear in SageMaker Model Registry, complete with a SageMaker Model Card for governance. This integration enables data scientists and ML engineers to use distinct tools for their respective tasks: MLflow for experimentation and SageMaker Model Registry for managing the production lifecycle with comprehensive model lineage.

Deploy MLflow Models to SageMaker endpoints

Deploying models from MLflow to SageMaker Endpoints is seamless, eliminating the need to build custom containers for model storage. This integration allows customers to leverage SageMaker’s optimized inference containers while retaining the user-friendly experience of MLflow for logging and registering models.
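As a configuration sketch (not runnable as-is: the endpoint name, model URI, region, and execution-role ARN are placeholders, and valid AWS credentials are required), MLflow's built-in SageMaker deployment client can create an endpoint directly from a registered model:

```python
from mlflow.deployments import get_deploy_client

# All names, the region, and the role ARN below are placeholders.
client = get_deploy_client("sagemaker")
client.create_deployment(
    name="my-endpoint",              # SageMaker endpoint name
    model_uri="models:/my-model/1",  # registered MLflow model version
    config={
        "region_name": "us-east-1",
        "execution_role_arn": "arn:aws:iam::123456789012:role/MySageMakerRole",
        "instance_type": "ml.m5.xlarge",
        "instance_count": 1,
    },
)
```

Because MLflow packages the model with its flavor metadata, no custom container build or artifact repackaging is needed for this step.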
