Outlier detection (classification)

One Class Support Vector Machines

class Orange.classification.OneClassSVMLearner(kernel='rbf', degree=3, gamma='auto', coef0=0.0, tol=0.001, nu=0.5, shrinking=True, cache_size=200, max_iter=-1, preprocessors=None)[source]

A wrapper for sklearn.svm._classes.OneClassSVM. The following is its documentation:

Unsupervised Outlier Detection.

Estimate the support of a high-dimensional distribution.

The implementation is based on libsvm.

Read more in the User Guide.

preprocessors = [HasClass(), Continuize(), RemoveNaNColumns(), SklImpute(), AdaptiveNormalize(zero_based=..., norm_type=..., transform_class=..., normalize_datetime=..., center=..., scale=...)]

A sequence of data preprocessors to apply to the data prior to fitting the model.
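
The learner follows Orange's usual convention: calling the learner on a data table fits the model, and calling the fitted model on data produces predictions. A minimal sketch of that workflow; the exact representation of the returned predictions (e.g. an inlier/outlier value column or an annotated table) depends on the Orange version:

    import Orange

    data = Orange.data.Table("iris")

    # nu upper-bounds the fraction of training errors (points left outside
    # the estimated support) and lower-bounds the fraction of support vectors.
    learner = Orange.classification.OneClassSVMLearner(nu=0.1)
    model = learner(data)      # fit on the table
    predictions = model(data)  # flag each instance as inlier or outlier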

Elliptic Envelope

class Orange.classification.EllipticEnvelopeLearner(store_precision=True, assume_centered=False, support_fraction=None, contamination=0.1, random_state=None, preprocessors=None)[source]

A wrapper for sklearn.covariance._elliptic_envelope.EllipticEnvelope. The following is its documentation:

An object for detecting outliers in a Gaussian distributed dataset.

Read more in the User Guide.
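
Because the envelope assumes the inliers are Gaussian, it works best on roughly elliptically distributed data. A minimal usage sketch under the same learner(data) → model(data) convention:

    import Orange

    data = Orange.data.Table("iris")

    # contamination: the assumed proportion of outliers in the data; it
    # determines where the cut-off on the Mahalanobis distances is placed.
    learner = Orange.classification.EllipticEnvelopeLearner(contamination=0.1)
    model = learner(data)
    predictions = model(data)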

Local Outlier Factor

class Orange.classification.LocalOutlierFactorLearner(n_neighbors=20, algorithm='auto', leaf_size=30, metric='minkowski', p=2, metric_params=None, contamination='auto', novelty=True, n_jobs=None, preprocessors=None)[source]

A wrapper for sklearn.neighbors._lof.LocalOutlierFactor. The following is its documentation:

Unsupervised Outlier Detection using the Local Outlier Factor (LOF).

The anomaly score of each sample is called the Local Outlier Factor. It measures the local deviation of the density of a given sample with respect to its neighbors. It is local in that the anomaly score depends on how isolated the object is with respect to the surrounding neighborhood. More precisely, locality is given by k-nearest neighbors, whose distance is used to estimate the local density. By comparing the local density of a sample to the local densities of its neighbors, one can identify samples that have a substantially lower density than their neighbors. These are considered outliers.

New in version 0.19.
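
Note that the wrapper defaults to novelty=True, so the fitted model can score data that was not seen during fitting. The behaviour of the underlying estimator can be sketched with sklearn.neighbors.LocalOutlierFactor directly, on made-up toy data:

    import numpy as np
    from sklearn.neighbors import LocalOutlierFactor

    rng = np.random.RandomState(0)
    X_train = rng.normal(size=(100, 2))          # dense Gaussian blob
    X_new = np.array([[0.0, 0.0], [6.0, 6.0]])   # one inlier, one outlier

    # novelty=True (the wrapper's default) enables predict() on unseen data;
    # with novelty=False only fit_predict() on the training set is available.
    lof = LocalOutlierFactor(n_neighbors=20, novelty=True).fit(X_train)
    print(lof.predict(X_new))  # [ 1 -1]: 1 = inlier, -1 = outlier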

Isolation Forest

class Orange.classification.IsolationForestLearner(n_estimators=100, max_samples='auto', contamination='auto', max_features=1.0, bootstrap=False, n_jobs=None, behaviour='deprecated', random_state=None, verbose=0, warm_start=False, preprocessors=None)[source]

A wrapper for sklearn.ensemble._iforest.IsolationForest. The following is its documentation:

Isolation Forest Algorithm.

Return the anomaly score of each sample using the IsolationForest algorithm.

The IsolationForest 'isolates' observations by randomly selecting a feature and then randomly selecting a split value between the maximum and minimum values of the selected feature.

Since recursive partitioning can be represented by a tree structure, the number of splittings required to isolate a sample is equivalent to the path length from the root node to the terminating node.

This path length, averaged over a forest of such random trees, is a measure of normality and our decision function.

Random partitioning produces noticeably shorter paths for anomalies. Hence, when a forest of random trees collectively produces shorter path lengths for particular samples, those samples are highly likely to be anomalies.

Read more in the User Guide.

New in version 0.18.
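
The averaged path length is exposed through the fitted estimator's decision function: shorter average paths give lower (negative) scores. A small sketch with the wrapped sklearn.ensemble.IsolationForest on made-up toy data:

    import numpy as np
    from sklearn.ensemble import IsolationForest

    rng = np.random.RandomState(0)
    X = np.vstack([rng.normal(size=(100, 2)),
                   [[6.0, 6.0]]])                # append one clear anomaly

    forest = IsolationForest(n_estimators=100, random_state=0).fit(X)

    # decision_function() is derived from the average path length over the
    # forest; negative scores mark anomalies, positive scores inliers.
    scores = forest.decision_function(X)
    print(scores[-1] < 0)  # True: the appended point is isolated quickly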