Multi-label Data Sets

This sub-module provides functions for loading multi-label data sets and for down-sampling the label space.

skml.datasets.load_dataset(name)

Loads a multi-label classification dataset.

Parameters:
    name : string
        Name of the dataset. Currently only ‘yeast’ is available.

skml.datasets.sample_down_label_space(y, k, method='most-frequent')

Samples down the label space to k labels. The returned label space retains the order of the original labels, but labels that do not meet the selection criterion (see method) are removed.

Parameters:
    y : (sparse) array-like, shape = [n_samples, ], [n_samples, n_classes]
        Multi-label targets.
    k : number
        Number of returned labels. Has to be smaller than the number of distinct labels in y.
    method : string, default = ‘most-frequent’
        Method used to sample the label space down. Currently the only supported method keeps the top k most frequent labels.
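
To illustrate what keeping the top k most frequent labels while retaining label order could look like, here is a minimal NumPy sketch. It is not skml's implementation: the helper name `top_k_frequent_labels` is hypothetical, and the sketch assumes y is a dense binary indicator matrix of shape [n_samples, n_classes], with ties between equally frequent labels broken by column position.

```python
import numpy as np


def top_k_frequent_labels(y, k):
    """Hypothetical helper (not part of skml): keep the k most frequent labels.

    Assumes y is a dense binary indicator matrix of shape (n_samples, n_labels).
    """
    y = np.asarray(y)
    if not 0 < k < y.shape[1]:
        raise ValueError("k must be positive and smaller than the number of labels")
    # Count how often each label occurs across all samples.
    counts = y.sum(axis=0)
    # Indices of the k most frequent labels; the stable sort breaks ties
    # in favour of the label that appears first.
    top = np.argsort(-counts, kind="stable")[:k]
    # Sort the selected indices so the kept columns retain their original order.
    keep = np.sort(top)
    return y[:, keep]


# Four samples, four labels; labels 0 and 2 each occur three times.
y = np.array([
    [1, 0, 1, 0],
    [1, 0, 1, 1],
    [1, 1, 0, 0],
    [0, 0, 1, 0],
])
y_small = top_k_frequent_labels(y, k=2)
print(y_small.shape)  # (4, 2)
```

Here the columns for labels 0 and 2 are kept, in that order, and the two rarer labels are dropped. A sparse y would need to be densified or counted with sparse-aware operations first, which this sketch does not cover.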