pystruct.learners.SubgradientSSVM

class pystruct.learners.SubgradientSSVM(model, max_iter=100, C=1.0, verbose=0, momentum=0.0, learning_rate='auto', n_jobs=1, show_loss_every=0, decay_exponent=1, break_on_no_constraints=True, logger=None, batch_size=None, decay_t0=10, averaging=None, shuffle=False)[source]

Structured SVM solver using subgradient descent.

Implements margin rescaling with an l1 slack penalty. By default, a constant learning rate is used. It is also possible to use the adaptive learning rate computed by AdaGrad.

This class implements online subgradient descent. If n_jobs != 1, small batches of size n_jobs are used to exploit parallel inference. If inference is fast, use n_jobs=1.
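A minimal usage sketch follows. The multi-class model, data shapes and hyperparameter values are illustrative assumptions, not prescriptions from this reference:

    import numpy as np
    from pystruct.models import MultiClassClf
    from pystruct.learners import SubgradientSSVM

    # Toy multi-class problem: 20 samples, 5 features, 3 classes (illustrative only).
    rng = np.random.RandomState(0)
    X = rng.randn(20, 5)
    Y = rng.randint(3, size=20)

    # Any StructuredModel implementing loss, inference and
    # loss_augmented_inference can be plugged in here.
    model = MultiClassClf(n_features=5, n_classes=3)
    svm = SubgradientSSVM(model, C=0.1, max_iter=50, learning_rate='auto')
    svm.fit(X, Y)

    Y_pred = svm.predict(X)   # list of predicted labels
    print(svm.score(X, Y))    # 1 - average loss on (X, Y)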

Parameters:

model : StructuredModel

Object containing model structure. Has to implement loss, inference and loss_augmented_inference.

max_iter : int, default=100

Maximum number of passes over dataset to find constraints and perform updates.

C : float, default=1.

Regularization parameter.

verbose : int, default=0

Verbosity.

learning_rate : float or ‘auto’, default=’auto’

Learning rate used in subgradient descent. If ‘auto’, the Pegasos schedule is used, which starts with learning_rate = n_samples * C.

momentum : float, default=0.0

Momentum used in subgradient descent.

n_jobs : int, default=1

Number of parallel jobs for inference. -1 means use as many jobs as there are CPUs.

batch_size : int, default=None

Ignored if n_jobs > 1. If n_jobs=1, inference will be done in mini-batches of size batch_size. If n_jobs=-1, batch learning will be performed, that is, the whole dataset will be used to compute each subgradient.

show_loss_every : int, default=0

Controls how often the Hamming loss is computed (for monitoring purposes). Zero means never; otherwise it is computed every show_loss_every'th epoch.

decay_exponent : float, default=1

Exponent for the decaying learning rate. The effective learning rate is learning_rate / (decay_t0 + t) ** decay_exponent. Zero means no decay; see the sketch after this parameter list.

decay_t0 : float, default=10

Offset for the decaying learning rate. The effective learning rate is learning_rate / (decay_t0 + t) ** decay_exponent.

break_on_no_constraints : bool, default=True

Break when there are no new constraints found.

logger : logger object, default=None

averaging : string, default=None

Whether and how to average weights. Possible options are ‘linear’, ‘squared’ and None. The string reflects the weighting of the averaging:

  • linear: w_avg ~ w_1 + 2 * w_2 + ... + t * w_t
  • squared: w_avg ~ w_1 + 4 * w_2 + ... + t**2 * w_t

Uniform averaging is not implemented, as it performs worse than linearly weighted averaging or no averaging. A weighted-averaging sketch is given after this parameter list.

shuffle : bool, default=False

Whether to shuffle the dataset in each iteration.
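The decaying learning rate and the weighted averaging described above can be written out directly. The following sketch mirrors those formulas with NumPy; the function names and the explicit weight history are illustrative, not the library's internals:

    import numpy as np

    def effective_learning_rate(learning_rate, t, decay_exponent=1, decay_t0=10):
        # Step size at iteration t: learning_rate / (decay_t0 + t) ** decay_exponent.
        if decay_exponent == 0:
            return learning_rate  # a zero exponent means no decay
        return learning_rate / (decay_t0 + t) ** decay_exponent

    def averaged_weights(w_history, averaging=None):
        # Weighted average of the iterates w_1 .. w_t:
        #   'linear'  weights w_i by i, 'squared' weights w_i by i ** 2,
        #   None returns the last iterate unchanged.
        w_history = np.asarray(w_history, dtype=float)
        if averaging is None:
            return w_history[-1]
        weights = np.arange(1, len(w_history) + 1, dtype=float)
        if averaging == 'squared':
            weights = weights ** 2
        return (weights[:, np.newaxis] * w_history).sum(axis=0) / weights.sum()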

Attributes:

w : nd-array, shape=(model.size_joint_feature,)

The learned weights of the SVM.

loss_curve_ : list of float

List of loss values if show_loss_every > 0.

objective_curve_ : list of float

Primal objective after each pass through the dataset.

timestamps_ : list of int

Total training time stored before each iteration.
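A short sketch of inspecting these attributes after training (svm, model, X and Y as in the earlier usage sketch; show_loss_every=1 is an assumption so that loss_curve_ is filled every epoch):

    svm = SubgradientSSVM(model, max_iter=50, show_loss_every=1)
    svm.fit(X, Y)
    print(svm.w.shape)               # (model.size_joint_feature,)
    print(svm.objective_curve_[-1])  # primal objective after the last pass
    print(svm.loss_curve_)           # monitored loss, one entry per epoch here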

References

  • Nathan Ratliff, J. Andrew Bagnell and Martin Zinkevich: (Online) Subgradient Methods for Structured Prediction, AISTATS 2007.
  • Shai Shalev-Shwartz, Yoram Singer, Nathan Srebro and Andrew Cotter: Pegasos: Primal estimated sub-gradient solver for SVM, Mathematical Programming 2011.

Methods

fit(X, Y[, constraints, warm_start, initialize]) Learn parameters using subgradient descent.
get_params([deep]) Get parameters for this estimator.
predict(X) Predict output on examples in X.
score(X, Y) Compute score as 1 - loss over whole data set.
set_params(**params) Set the parameters of this estimator.
__init__(model, max_iter=100, C=1.0, verbose=0, momentum=0.0, learning_rate='auto', n_jobs=1, show_loss_every=0, decay_exponent=1, break_on_no_constraints=True, logger=None, batch_size=None, decay_t0=10, averaging=None, shuffle=False)[source]
fit(X, Y, constraints=None, warm_start=False, initialize=True)[source]

Learn parameters using subgradient descent.

Parameters:

X : iterable

Training instances. Contains the structured input objects. No requirement is made on the particular form of the entries of X.

Y : iterable

Training labels. Contains the structured labels for the inputs in X. Needs to have the same length as X.

constraints : None

Discarded. Only for API compatibility currently.

warm_start : boolean, default=False

Whether to restart a previous fit.

initialize : boolean, default=True

Whether to initialize the model for the data. Leave this set to True unless you really know what you are doing.
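A hedged sketch of continuing a fit with warm_start, assuming the estimator keeps its current weights when warm_start=True (svm, model, X and Y as in the earlier usage sketch; the split of max_iter is only an illustration):

    svm = SubgradientSSVM(model, max_iter=20)
    svm.fit(X, Y)                    # first 20 passes over the data
    svm.max_iter = 50
    svm.fit(X, Y, warm_start=True)   # continue from the current weights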

get_params(deep=True)

Get parameters for this estimator.

Parameters:

deep : boolean, optional

If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:

params : mapping of string to any

Parameter names mapped to their values.

predict(X)

Predict output on examples in X.

Parameters:

X : iterable

Training instances. Contains the structured input objects.

Returns:

Y_pred : list

List of inference results for X using the learned parameters.

score(X, Y)

Compute score as 1 - loss over whole data set.

Returns the average accuracy (in terms of model.loss) over X and Y.

Parameters:

X : iterable

Evaluation data.

Y : iterable

True labels.

Returns:

score : float

Average of 1 - loss over training examples.
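A usage sketch, where X_test and Y_test are assumed to be held-out data in the same format as the training set:

    accuracy = svm.score(X_test, Y_test)  # 1 - average loss under model.loss
    print("held-out score: %.3f" % accuracy)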

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Returns:

self
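For a flat estimator such as SubgradientSSVM, parameters are set by their plain names; the nested <component>__<parameter> form only applies when an estimator contains sub-estimators. A small sketch (the values are illustrative):

    svm.set_params(C=0.5, max_iter=200)  # returns the estimator itself
    svm.fit(X, Y)                        # refit with the new parameters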