What it is
MLflow is an open-source platform for managing the end-to-end machine learning lifecycle, including experimentation, reproducibility, and deployment. The CLI allows you to interact with MLflow projects, runs, models, and experiments from your terminal.
Installation
Linux/macOS:
pip install mlflow
Windows:
pip install mlflow
Core Concepts
- Experiments: A collection of MLflow Runs. You can group runs by experiment to organize your work.
- Runs: A single execution of your ML code. Each run records parameters, metrics, and artifacts.
- Parameters: Key-value pairs that configure your ML code (e.g., learning rate, batch size).
- Metrics: Key-value pairs that measure the performance of your ML code (e.g., accuracy, loss). These can be updated dynamically during a run.
- Artifacts: Any file or directory produced by a run (e.g., trained models, plots, data files).
- Models: MLflow can log and package models in standard formats, making them easy to deploy.
- Model Registry: A centralized place to manage the lifecycle of MLflow models, including versions, stages (Staging, Production, Archived), and annotations.
Commands / Usage
Managing Experiments
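- Create a new experiment:
  mlflow experiments create --experiment-name "My New Experiment"
  Creates a new experiment named "My New Experiment" and prints its ID.
- List experiments:
  mlflow experiments search
  Lists experiments with their IDs, names, and artifact locations. Releases before MLflow 2.0 use mlflow experiments list instead.
- Get experiment details:
  mlflow experiments search | grep "My New Experiment"
  Filters the experiment list by name. For a direct lookup by name, the Python API provides mlflow.get_experiment_by_name().
- Delete an experiment:
  mlflow experiments delete --experiment-id <experiment_id>
  Marks the experiment with the specified ID as deleted. It can be brought back with mlflow experiments restore --experiment-id <experiment_id>.
- Set the active experiment:
  export MLFLOW_EXPERIMENT_ID=<experiment_id>
  Sets the experiment used by subsequent mlflow run commands in the current shell session. Alternatively, pass --experiment-id or --experiment-name directly to mlflow run.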
Managing Runs
- Run an MLflow project:
  mlflow run my_project/ -P alpha=0.5 -P beta=0.7
  Executes the MLflow project located in the my_project/ directory, passing alpha=0.5 and beta=0.7 as parameters.
- Run an MLflow project from a Git repository:
  mlflow run git@github.com:mlflow/mlflow-example.git -e train_model --version 1.0.0 -P data_path=./data
  Runs the train_model entry point from the specified Git repository at version 1.0.0 (a Git commit, branch, or tag), passing data_path=./data as a parameter.
- List runs within an experiment:
  mlflow runs list --experiment-id <experiment_id>
  Lists all runs belonging to the specified experiment ID.
- Get run details:
  mlflow runs describe --run-id <run_id>
  Prints detailed information about a specific run as JSON, including its parameters, metrics, and tags.
- Download run artifacts:
  mlflow artifacts download --run-id <run_id> --artifact-path models
  Downloads the "models" artifact directory from the specified run and prints the local path.
- Log a parameter (Python API; parameters are logged from code, not from the CLI):
  mlflow.log_param("learning_rate", 0.01)
  Logs the learning_rate parameter with value 0.01 to the active run.
- Log a metric (Python API):
  mlflow.log_metric("accuracy", 0.95)
  Logs the accuracy metric with value 0.95 to the active run.
- Log a metric with a timestamp (Python API, with MlflowClient imported from mlflow.tracking):
  MlflowClient().log_metric("<run_id>", "loss", 0.12, timestamp=1678886400000)
  Logs the loss metric with value 0.12 and a specific Unix timestamp in milliseconds.
- Log an artifact:
  mlflow artifacts log-artifact --local-file my_model.pkl --run-id <run_id>
  Logs the local file my_model.pkl as an artifact for the specified run.
- Log a directory of artifacts:
  mlflow artifacts log-artifacts --local-dir output_dir/ --run-id <run_id>
  Logs all files within the output_dir/ directory as artifacts for the specified run.
- End a run (Python API):
  MlflowClient().set_terminated("<run_id>")
  Marks the specified run as finished. Inside the process that started the run, mlflow.end_run() does the same for the active run.
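Metric logging can also be scripted against a tracking server's REST API. The following is a minimal sketch that only builds the request for the log-metric endpoint and does not send it. The server address, the run ID placeholder, and the helper name build_log_metric_request are assumptions made for illustration.

```python
import json
import time
import urllib.request


def build_log_metric_request(tracking_uri, run_id, key, value, timestamp_ms=None, step=0):
    """Build (but do not send) a POST request for MLflow's REST log-metric endpoint."""
    if timestamp_ms is None:
        # MLflow metric timestamps are Unix times in milliseconds.
        timestamp_ms = int(time.time() * 1000)
    payload = {
        "run_id": run_id,
        "key": key,
        "value": value,
        "timestamp": timestamp_ms,
        "step": step,
    }
    return urllib.request.Request(
        url=tracking_uri.rstrip("/") + "/api/2.0/mlflow/runs/log-metric",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical server address and run ID placeholder, for illustration only.
request = build_log_metric_request(
    "http://localhost:5000", "<run_id>", "loss", 0.12, timestamp_ms=1678886400000
)
print(request.full_url)
print(request.data.decode("utf-8"))
```

With a tracking server running and a real run ID, passing the request to urllib.request.urlopen() would submit it. In most scripts the Python client is the more convenient route.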
Managing Models
- Log a model (Python API; models are logged from code, not from the CLI):
  mlflow.pyfunc.log_model(...)
  Logs a Python function model as a run artifact, here assumed to be named my_mlflow_model. Argument names differ between MLflow releases, so check the API reference for your version.
- Find logged models for a run:
  mlflow artifacts list --run-id <run_id>
  Lists the artifacts of a run, which is where MLflow 1.x and 2.x store logged models. The mlflow models command group covers serving and packaging and does not include a search subcommand.
- Find logged models within a specific experiment:
  mlflow runs list --experiment-id <experiment_id>
  Lists the runs of the experiment. Inspect each run's artifacts with mlflow artifacts list to find its models.
- Serve a model locally:
  mlflow models serve --model-uri runs:/<run_id>/my_mlflow_model --port 5000
  Serves the MLflow model logged at runs:/<run_id>/my_mlflow_model locally on port 5000. Predictions are requested from the /invocations endpoint.
- Build a Docker image for a model:
  mlflow models build-docker --model-uri runs:/<run_id>/my_mlflow_model --name my-model-image
  Builds a Docker image named my-model-image for the MLflow model.
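Once a model is being served locally, it is queried over HTTP. The sketch below builds, without sending, a scoring request for the /invocations endpoint, assuming an MLflow 2.x server that accepts the "dataframe_split" input format. The column names, values, and the helper name build_invocation_request are hypothetical.

```python
import json
import urllib.request


def build_invocation_request(host, port, columns, rows):
    """Build (but do not send) a scoring request for a locally served MLflow model."""
    # Assumption: MLflow 2.x input format, a pandas "split"-style table
    # wrapped under the "dataframe_split" key.
    payload = {"dataframe_split": {"columns": columns, "data": rows}}
    return urllib.request.Request(
        url=f"http://{host}:{port}/invocations",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical feature names and values, for illustration only.
request = build_invocation_request(
    "localhost", 5000, ["feature_a", "feature_b"], [[1.0, 2.0], [3.0, 4.0]]
)
print(request.full_url)
print(request.data.decode("utf-8"))
```

With the server from mlflow models serve running on port 5000, passing the request to urllib.request.urlopen() would return the predictions as JSON.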
Model Registry Management
Model Registry operations are performed through the Python API rather than dedicated CLI subcommands (run mlflow --help to check your release). The calls below assume import mlflow and from mlflow.tracking import MlflowClient.
- Register a model from a run artifact:
  mlflow.register_model("runs:/<run_id>/my_mlflow_model", "My Model")
  Registers the MLflow model from the specified run artifact under the name "My Model" in the Model Registry, creating the registered model if it does not exist yet.
- List registered models:
  MlflowClient().search_registered_models()
  Returns the registered models in the Model Registry.
- Show details of a registered model:
  MlflowClient().get_registered_model("My Model")
  Retrieves details for the registered model named "My Model".
- Create a new version for a registered model:
  mlflow.register_model("runs:/<run_id>/my_mlflow_model", "My Model")
  Registering another artifact under an existing name creates the next version of "My Model". Version numbers are assigned automatically and cannot be chosen.
- Transition a model version's stage:
  MlflowClient().transition_model_version_stage(name="My Model", version="2", stage="Production")
  Transitions version "2" of the registered model "My Model" to the "Production" stage. Stages are deprecated since MLflow 2.9 in favor of model version aliases.
- Delete a registered model:
  MlflowClient().delete_registered_model(name="My Model")
  Deletes the registered model named "My Model" together with its versions.
- Delete a specific version of a registered model:
  MlflowClient().delete_model_version(name="My Model", version="2")
  Deletes version "2" of the registered model "My Model".
MLflow UI
- Start the MLflow UI:
  mlflow ui
  Starts the MLflow UI server, by default at http://localhost:5000. Open that address in your browser.
- Start the MLflow UI with a specific backend store:
  mlflow ui --backend-store-uri postgresql://user:password@host:port/database
  Starts the MLflow UI pointing to a PostgreSQL backend store.
- Start the MLflow UI with a specific artifact root:
  mlflow ui --default-artifact-root gs://my-mlflow-bucket/artifacts
  Starts the MLflow UI, setting the default artifact root to a Google Cloud Storage bucket.
Tracking Server
- Start a tracking server:
  mlflow server --host 0.0.0.0 --port 5001 --backend-store-uri sqlite:///mlruns.db --default-artifact-root ./mlruns/artifacts
  Starts a tracking server accessible from all network interfaces on port 5001, using a local SQLite database for tracking and a local directory for artifacts.
Common Patterns
- Running a project and then viewing results in the UI:
  mlflow run my_project/ -P learning_rate=0.001
  mlflow ui
  First, execute an MLflow project with a specific parameter. Then, start the MLflow UI to inspect the run and its metrics and parameters.
- Downloading all artifacts from the latest run of an experiment:
  LATEST_RUN_ID=$(mlflow runs list --experiment-id <experiment_id> --max-results 1 | awk '{print $1}')
  mlflow artifacts download --run-id "$LATEST_RUN_ID"
  Finds the ID of the latest run in an experiment and downloads all of its artifacts (omitting --artifact-path downloads everything). This assumes that --max-results is available and that the run ID is the first whitespace-separated field of the output. Both vary by version, so check mlflow runs list --help and the actual table layout, and adjust the awk expression to skip header lines if needed.
- Logging a model and then registering it:
  # Assuming the project's train entry point logs a model to 'model_output/'
  mlflow run my_project/ -e train --experiment-id <experiment_id>
  RUN_ID=$(mlflow runs list --experiment-id <experiment_id> --max-results 1 | awk '{print $1}')
  python -c "import mlflow; mlflow.register_model('runs:/$RUN_ID/model_output', 'MyAwesomeModel')"
  Executes the training entry point of a project (mlflow run takes a project directory or Git URI, not a single script), captures the run ID under the same assumptions as above, and registers the logged model artifact in the Model Registry through the Python API.
- Deploying a model from the Model Registry to a local server:
  mlflow models serve --model-uri models:/MyModel/Production --port 12345
  Serves the registered model "MyModel" currently in the "Production" stage locally on port 12345.
- Setting MLflow tracking URI for remote tracking:
  export MLFLOW_TRACKING_URI="http://your-mlflow-server:5000"
  mlflow run my_project/
  Configures the MLflow client to send tracking data to a remote MLflow tracking server.
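When these patterns are driven from a script rather than typed by hand, the command line and environment can be assembled programmatically. The following is a minimal sketch, assuming the mlflow executable is on PATH. The helper name build_mlflow_run is hypothetical, and the project path and server address are the placeholders used above.

```python
import os
import shlex


def build_mlflow_run(project_uri, params, tracking_uri=None, entry_point=None):
    """Assemble the argv list and environment for an `mlflow run` invocation."""
    argv = ["mlflow", "run", project_uri]
    if entry_point:
        argv += ["-e", entry_point]
    for name, value in params.items():
        # Each project parameter is passed as its own -P name=value pair.
        argv += ["-P", f"{name}={value}"]
    env = dict(os.environ)
    if tracking_uri:
        # Equivalent to `export MLFLOW_TRACKING_URI=...` before running the command.
        env["MLFLOW_TRACKING_URI"] = tracking_uri
    return argv, env


argv, env = build_mlflow_run(
    "my_project/",
    {"learning_rate": 0.001},
    tracking_uri="http://your-mlflow-server:5000",
)
print(shlex.join(argv))
```

Passing the result to subprocess.run(argv, env=env, check=True) would execute the project. Building the arguments as a list avoids shell-quoting problems with parameter values.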
Gotchas
- Default Experiment: If you don’t explicitly create or set an experiment, MLflow logs to a default experiment named "Default" (ID 0).
- Artifact Location: By default, artifacts are stored locally in ./mlruns/. For production or collaborative use, configure a remote artifact store (e.g., S3, GCS, Azure Blob Storage) via the --default-artifact-root flag of mlflow server or mlflow ui, or per experiment with mlflow experiments create --artifact-location. MLFLOW_TRACKING_URI only selects the tracking server; it does not set the artifact store.
- mlflow run vs. running a script directly: mlflow run executes MLflow Projects (defined by MLproject files), which can include parameter handling and environment management. There is no mlflow execute command; to run a plain Python script within a run context, invoke it with python and call mlflow.start_run() inside the script.
- Model Registry Stages: Stages like "Staging" and "Production" are purely organizational. MLflow does not automatically deploy models to production when you set a stage. You need separate deployment mechanisms. Since MLflow 2.9, stages are deprecated in favor of model version aliases.
- Overwriting Artifacts: Logging an artifact with the same name and path within the same run will overwrite the previous one.
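- mlflow.set_tracking_uri() vs. MLFLOW_TRACKING_URI environment variable: Both select the tracking server. The environment variable is often preferred for setting it once for a shell session or system-wide, and it is what CLI commands read. Within a Python process, a call to mlflow.set_tracking_uri() takes precedence over the environment variable.
- Metric timestamps: If no timestamp is provided when logging a metric, MLflow uses the current time. Timestamps are Unix times in milliseconds. This matters for understanding metric evolution over time, especially when replaying runs or comparing runs logged at different intervals.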
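Because metric timestamps are Unix times in milliseconds, explicit conversion helps when replaying historical measurements. The sketch below converts between timezone-aware datetimes and millisecond timestamps. The helper names are hypothetical.

```python
from datetime import datetime, timezone


def to_mlflow_timestamp(moment):
    """Convert a timezone-aware datetime to a Unix timestamp in milliseconds."""
    if moment.tzinfo is None:
        # Naive datetimes would be interpreted as local time, which is easy to get wrong.
        raise ValueError("expected a timezone-aware datetime")
    return int(moment.timestamp() * 1000)


def from_mlflow_timestamp(timestamp_ms):
    """Convert a millisecond Unix timestamp back to a UTC datetime."""
    return datetime.fromtimestamp(timestamp_ms / 1000, tz=timezone.utc)


moment = datetime(2023, 3, 15, 13, 20, tzinfo=timezone.utc)
print(to_mlflow_timestamp(moment))           # 1678886400000
print(from_mlflow_timestamp(1678886400000))  # 2023-03-15 13:20:00+00:00
```

Passing a value in seconds instead of milliseconds is a common mistake. It places the metric in January 1970 on the UI's time axis.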