Documentation

Welcome to the developer documentation for SigOpt. If you have a question you can’t answer, feel free to contact us!

Runs Tutorial

We'll walk through an example of instrumenting and executing run code with SigOpt. In this tutorial, you will learn how to:

  • Install the SigOpt Python client
  • Set your SigOpt API token
  • Set the project
  • Instrument your model
  • Create your first Run and log your model metric and parameters to SigOpt
  • View your Run in the SigOpt web application

For notebook instructions and tutorials, check out our GitHub notebook tutorials repo, or open the Run notebook tutorial directly in Google Colab.

Step 1 - Install SigOpt Python Client

Install the SigOpt Python package and the libraries required to run the model used for this tutorial. This example has been tested with xgboost 1.4.2 and scikit-learn 0.24.2.

$ pip install sigopt xgboost==1.4.2 scikit-learn==0.24.2

# to confirm that sigopt is installed
$ sigopt --help
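
If you prefer to confirm the installation from Python rather than the CLI, a quick import check also works. This is a minimal sketch that assumes the package exposes a __version__ attribute, which recent releases do; pip show sigopt reports the version as well.

# Optional: confirm the package imports and report its version
# (assumes sigopt exposes __version__; otherwise use `pip show sigopt`)
import sigopt
print(sigopt.__version__)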

Step 2 - Set Your API Token

Once you've installed SigOpt, you need to get your API token in order to use the SigOpt API and later explore your Runs and Experiments in the SigOpt app. To find your API token, go directly to the API Token page.

$ sigopt config
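
Running sigopt config prompts you for the token and stores it for later use. If an interactive prompt is inconvenient (for example, in CI), the client can also read the token from an environment variable. The sketch below assumes the variable name SIGOPT_API_TOKEN; confirm it against the client reference for your installed version.

# Hedged alternative to `sigopt config`: provide the token via the environment.
# Assumes the client reads SIGOPT_API_TOKEN; set it before any SigOpt call runs.
import os

os.environ["SIGOPT_API_TOKEN"] = "YOUR_API_TOKEN"  # placeholder value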

Step 3 - Set Project

Runs are created within projects. The project allows you to sort and filter through your Runs and Experiments and view useful charts to gain insights into everything you've tried.

$ export SIGOPT_PROJECT=sigopt_run_xgb_classifier
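
The export above works in a shell session. In a notebook, or anywhere setting shell variables is awkward, you can set the same variable from Python before the Run is created, for example:

# Set the project from Python instead of the shell; do this before the Run is
# created so the SigOpt client picks it up.
import os

os.environ["SIGOPT_PROJECT"] = "sigopt_run_xgb_classifier"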

Step 4 - Instrument Your Model

Use SigOpt methods to log and track key model information.

from xgboost import XGBClassifier
from sklearn.multiclass import OneVsRestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn import datasets
import numpy
import sigopt
import time

DATASET_NAME = "Sklearn Wine"
FEATURE_ENG_PIPELINE_NAME = "Sklearn Standard Scaler"
PREDICTION_TYPE = "Multiclass"
DATASET_SRC = "sklearn.datasets"


def get_data():
  """
    Load sklearn wine dataset, and scale features to be zero mean, unit variance.
    One hot encode labels (3 classes), to be used by sklearn OneVsRestClassifier.
    """
  data = datasets.load_wine()
  X = data["data"]
  y = data["target"]
  scaler = StandardScaler()
  X_scaled = scaler.fit_transform(X)
  enc = OneHotEncoder()
  Y = enc.fit_transform(y[:, numpy.newaxis]).toarray()
  return (X_scaled, Y)


MODEL_NAME = "OneVsRestClassifier(XGBoostClassifier)"


def evaluate_xgboost_model(X, y, number_of_cross_val_folds=5, max_depth=6, learning_rate=0.3, min_split_loss=0):
  t0 = time.time()
  classifier = OneVsRestClassifier(
    XGBClassifier(
      objective="binary:logistic",
      max_depth=max_depth,
      learning_rate=learning_rate,
      min_split_loss=min_split_loss,
      use_label_encoder=False,
      verbosity=0,
    )
  )
  cv_accuracies = cross_val_score(classifier, X, y, cv=number_of_cross_val_folds)
  tf = time.time()
  training_and_validation_time = tf - t0
  return numpy.mean(cv_accuracies), training_and_validation_time


def run_and_track_in_sigopt():
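  """
  Load the data, log dataset, metadata, model name, and parameter defaults to
  SigOpt, evaluate the model with those parameters, and log the resulting metrics.
  """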

  (features, labels) = get_data()

  sigopt.log_dataset(DATASET_NAME)
  sigopt.log_metadata(key="Dataset Source", value=DATASET_SRC)
  sigopt.log_metadata(key="Feature Eng Pipeline Name", value=FEATURE_ENG_PIPELINE_NAME)
  sigopt.log_metadata(
    key="Dataset Rows", value=features.shape[0]
  )  # assumes features is a numpy array, so shape[0] is the row count
  sigopt.log_metadata(key="Dataset Columns", value=features.shape[1])
  sigopt.log_metadata(key="Execution Environment", value="Colab Notebook")
  sigopt.log_model(MODEL_NAME)
  sigopt.params.setdefaults(
    max_depth=numpy.random.randint(low=3, high=15),
    learning_rate=numpy.random.random(size=1)[0],
    min_split_loss=numpy.random.random(size=1)[0] * 10,
  )

  args = dict(
    X=features,
    y=labels,
    max_depth=sigopt.params.max_depth,
    learning_rate=sigopt.params.learning_rate,
    min_split_loss=sigopt.params.min_split_loss,
  )

  mean_accuracy, training_and_validation_time = evaluate_xgboost_model(**args)

  sigopt.log_metric(name="accuracy", value=mean_accuracy)
  sigopt.log_metric(name="training and validation time (s)", value=training_and_validation_time)


run_and_track_in_sigopt()

Step 5 - Run the Code

Run your instrumented model code with the SigOpt CLI.

$ sigopt run python run_and_track_in_sigopt.py
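
The sigopt run command wraps your script so that the module-level sigopt.log_* calls have an active Run to attach to. If you would rather create the Run from Python yourself, the client also exposes an explicit run context. The sketch below assumes sigopt.create_run() and run-level logging methods that mirror the module-level ones used above; verify the names against the client reference for your installed version.

# Hedged sketch: create the Run explicitly instead of relying on `sigopt run`.
# Assumes sigopt.create_run() and run-level log_* methods mirroring the
# module-level API; uses the helpers defined in the script above.
import sigopt

features, labels = get_data()
with sigopt.create_run(name="xgb-onevsrest-manual-run") as run:
  run.log_dataset(DATASET_NAME)
  run.log_model(MODEL_NAME)
  # illustrative fixed parameter values (the defaults of evaluate_xgboost_model)
  accuracy, elapsed = evaluate_xgboost_model(features, labels, max_depth=6, learning_rate=0.3)
  run.log_metric("accuracy", accuracy)
  run.log_metric("training and validation time (s)", elapsed)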

Step 6 - View Your Run

Click on the Run link to view your completed Run in our web application. Here's a view of a Run page:

Conclusion

In this tutorial, we've covered the recommended way to instrument your Run with SigOpt. After your model has been instrumented, it is easy to take advantage of SigOpt's optimization features. Optimization helps find the parameters for your model that give you the best metric (e.g. maximizing an accuracy metric).

Check out the Experiment and Optimization Tutorial to see how to create an Experiment.