Archived Documentation

You are currently viewing archived SigOpt documentation.

Experiment and Optimization Tutorial

We'll walk through an example of instrumenting a model in order to run a model parameter optimization with SigOpt. In this tutorial, you will learn how to:

  • Install the SigOpt Python client
  • Set your SigOpt API token
  • Set the project
  • Load the extension (Notebook environment only)
  • Instrument your model
  • Configure your optimization experiment
  • Create your first experiment and optimize your model metric with SigOpt
  • Visualize your experiment results

Step 1 - Install SigOpt Python Client

Install the SigOpt Python package and the libraries required to run the model used for this tutorial. This example has been tested with xgboost 1.4.2 and scikit-learn 0.24.2.

! pip install sigopt
! pip install xgboost==1.4.2
! pip install scikit-learn==0.24.2

Step 2 - Set Your API Token

Once you've installed SigOpt, you need to get your API token in order to use the SigOpt API and later explore your runs in the SigOpt app. To find your API token, go directly to the API Token page.

import os
os.environ['SIGOPT_API_TOKEN'] = 'MY_API_TOKEN'  # replace with your API token string

Step 3 - Set Project

Training runs are created within projects. The project allows you to sort and filter through your training runs and view useful charts to gain insights into everything you've tried.

os.environ['SIGOPT_PROJECT'] = "SigOpt_Run_XGB_Classifier"

Step 4 - Load the SigOpt Extension

If you're not in a notebook environment, skip to the next step.

If you're in a notebook environment, load the SigOpt extension to enable magic commands.

import sigopt
%load_ext sigopt

Step 5 - Instrument Your Model

Use SigOpt methods to log and track key model information.

from xgboost import XGBClassifier
from sklearn.multiclass import OneVsRestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn import datasets
import numpy
import sigopt
import time

DATASET_NAME = "Sklearn Wine"
FEATURE_ENG_PIPELINE_NAME = "Sklearn Standard Scaler"
PREDICTION_TYPE = "Multiclass"
DATASET_SRC = "sklearn.datasets"


def get_data():
    """
    Load sklearn wine dataset, and scale features to be zero mean, unit variance.
    One hot encode labels (3 classes), to be used by sklearn OneVsRestClassifier.
    """
    data = datasets.load_wine()
    X = data["data"]
    y = data["target"]
    scaler = StandardScaler()
    X_scaled = scaler.fit_transform(X)
    enc = OneHotEncoder()
    Y = enc.fit_transform(y[:, numpy.newaxis]).toarray()
    return (X_scaled, Y)


MODEL_NAME = "OneVsRestClassifier(XGBoostClassifier)"


def evaluate_xgboost_model(
    X, y, number_of_cross_val_folds=5, max_depth=6, learning_rate=0.3, min_split_loss=0
):
    t0 = time.time()
    classifier = OneVsRestClassifier(
        XGBClassifier(
            objective="binary:logistic",
            max_depth=max_depth,
            learning_rate=learning_rate,
            min_split_loss=min_split_loss,
            use_label_encoder=False,
            verbosity=0,
        )
    )
    cv_accuracies = cross_val_score(classifier, X, y, cv=number_of_cross_val_folds)
    tf = time.time()
    training_and_validation_time = tf - t0
    return numpy.mean(cv_accuracies), training_and_validation_time
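Conceptually, cross_val_score splits the data into k folds, trains on k-1 folds, scores on the held-out fold, and repeats; the function above then averages the k scores. A minimal pure-Python sketch of the fold logic (illustrative only; scikit-learn additionally handles shuffling, stratification, and estimator cloning):

```python
def k_fold_indices(n_samples, k):
    """Yield (train, test) index lists for k contiguous folds."""
    fold_size = n_samples // k
    for i in range(k):
        start = i * fold_size
        stop = (i + 1) * fold_size if i < k - 1 else n_samples
        test = list(range(start, stop))
        train = [j for j in range(n_samples) if j < start or j >= stop]
        yield train, test

# With 10 samples and 5 folds, each fold holds out 2 samples.
folds = list(k_fold_indices(10, 5))
```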


def run_and_track_in_sigopt():

    (features, labels) = get_data()

    sigopt.log_dataset(DATASET_NAME)
    sigopt.log_metadata(key="Dataset Source", value=DATASET_SRC)
    sigopt.log_metadata(
        key="Feature Eng Pipeline Name", value=FEATURE_ENG_PIPELINE_NAME
    )
    sigopt.log_metadata(
        key="Dataset Rows", value=features.shape[0]
    )  # assumes features is a numpy array-like with a shape attribute
    sigopt.log_metadata(key="Dataset Columns", value=features.shape[1])
    sigopt.log_metadata(key="Execution Environment", value="Colab Notebook")
    sigopt.log_model(MODEL_NAME)

    args = dict(
        X=features,
        y=labels,
        max_depth=sigopt.get_parameter(
            "max_depth", default=numpy.random.randint(low=3, high=15, dtype=int)
        ),
        learning_rate=sigopt.get_parameter(
            "learning_rate", default=numpy.random.random(size=1)[0]
        ),
        min_split_loss=sigopt.get_parameter(
            "min_split_loss", default=numpy.random.random(size=1)[0] * 10
        ),
    )

    mean_accuracy, training_and_validation_time = evaluate_xgboost_model(**args)

    sigopt.log_metric(name="accuracy", value=mean_accuracy)
    sigopt.log_metric(
        name="training and validation time (s)", value=training_and_validation_time
    )


Step 6 - Define Your Experiment Configuration

The experiment configuration includes the name, project, parameters, metrics, and other options for your experiment.

The parameter names should match those passed to sigopt.get_parameter in your code; similarly, the metric names should match those used in your sigopt.log_metric calls.

%%experiment
{
    "name": "XGBoost Optimization",
    "metrics": [
        {
            "name": "accuracy",
            "strategy": "optimize",
            "objective": "maximize",
        }
    ],
    "parameters": [
        {"name": "max_depth", "type": "int", "bounds": {"min": 3, "max": 12}},
        {"name": "learning_rate", "type": "double", "bounds": {"min": 0.0, "max": 1.0}},
        {
            "name": "min_split_loss",
            "type": "double",
            "bounds": {"min": 0.0, "max": 10.0},
        },
    ],
    "observation_budget": 20,
}
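The names in the configuration must line up with the instrumented code. A quick sanity check is a minimal sketch like the following, where `experiment_config` hand-copies the relevant names from the cell above:

```python
# Hand-copied from the %%experiment configuration above.
experiment_config = {
    "metrics": [{"name": "accuracy"}],
    "parameters": [
        {"name": "max_depth"},
        {"name": "learning_rate"},
        {"name": "min_split_loss"},
    ],
}

# Names used by sigopt.get_parameter and the optimized sigopt.log_metric call.
code_parameter_names = {"max_depth", "learning_rate", "min_split_loss"}
code_metric_names = {"accuracy"}

config_parameter_names = {p["name"] for p in experiment_config["parameters"]}
config_metric_names = {m["name"] for m in experiment_config["metrics"]}
assert config_parameter_names == code_parameter_names
assert config_metric_names == code_metric_names
```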

Step 7 - Run the Code

Run your model training run code, with SigOpt methods integrated.

%%optimize My_First_Optimization
run_and_track_in_sigopt()

Once you've run the code above, SigOpt will conveniently output links to the runs and experiment pages on our web application.
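Under the hood, an optimization loop repeats a simple cycle: for each run in the observation budget, suggest parameters within the configured bounds, evaluate the model, and track the best metric seen. The sketch below illustrates that cycle with random search and a stand-in objective; it is not SigOpt's API, and SigOpt's optimizer chooses suggestions far more intelligently than random sampling.

```python
import random

# Bounds copied from the experiment configuration.
BOUNDS = {
    "max_depth": (3, 12),
    "learning_rate": (0.0, 1.0),
    "min_split_loss": (0.0, 10.0),
}

def suggest():
    """Draw one parameter assignment uniformly from the bounds."""
    return {
        "max_depth": random.randint(3, 12),
        "learning_rate": random.uniform(*BOUNDS["learning_rate"]),
        "min_split_loss": random.uniform(*BOUNDS["min_split_loss"]),
    }

def evaluate(params):
    # Toy stand-in for run_and_track_in_sigopt's accuracy metric.
    return 1.0 - abs(params["learning_rate"] - 0.3)

best_score, best_params = float("-inf"), None
for _ in range(20):  # observation_budget
    params = suggest()
    score = evaluate(params)
    if score > best_score:
        best_score, best_params = score, params
```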

Step 8 - Visualize Your Experiment Results

Click on an individual run link to view your completed run in our web application.

From the run page, click on the Experiment Name to open the corresponding Experiment page. Or click one of the printed experiment page links from the program output.

Conclusion

In this tutorial, we covered the recommended way to instrument and optimize your model, and to visualize your results with SigOpt. You learned that experiments are collections of runs that search a defined parameter space to optimize your chosen metrics.

Check out our tutorial, Training Runs Tutorial, for a closer look at a single training run, and see how to track one-off training runs without creating an experiment.