Distance Matrix (distmatrix)

class Orange.misc.distmatrix.DistMatrix(data, row_items=None, col_items=None, axis=1)

Distance matrix. Extends numpy.ndarray.

row_items

Items corresponding to matrix rows.

col_items

Items corresponding to matrix columns.

axis

If axis=1, distances are calculated between rows; if axis=0, distances are calculated between columns.
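The difference between the two axis values can be illustrated with plain NumPy. This is a minimal sketch, not Orange's distance computation: the helper `pairwise_euclidean` is a hypothetical name introduced here, and Euclidean distance is only an assumed choice of metric.

```python
import numpy as np

def pairwise_euclidean(data, axis=1):
    """Hypothetical helper: Euclidean distances between rows (axis=1)
    or between columns (axis=0) of a 2-D array."""
    points = data if axis == 1 else data.T
    # Broadcast to get all pairwise differences, then reduce over features.
    diff = points[:, None, :] - points[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

data = np.array([[0.0, 0.0],
                 [3.0, 4.0],
                 [6.0, 8.0]])

row_dist = pairwise_euclidean(data, axis=1)  # one entry per pair of rows
col_dist = pairwise_euclidean(data, axis=0)  # one entry per pair of columns

print(row_dist.shape)  # (3, 3)
print(col_dist.shape)  # (2, 2)
```

With three rows and two columns, the row-wise matrix is 3 x 3 and the column-wise matrix is 2 x 2, which is why the axis has to be recorded alongside the distances.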

property dim

Returns the single dimension of the symmetric square matrix.

property flat

A 1-D iterator over the array.

This is a numpy.flatiter instance, which acts similarly to, but is not a subclass of, Python's built-in iterator object.

See also

flatten

Return a copy of the array collapsed into one dimension.

flatiter

Examples

>>> x = np.arange(1, 7).reshape(2, 3)
>>> x
array([[1, 2, 3],
       [4, 5, 6]])
>>> x.flat[3]
4
>>> x.T
array([[1, 4],
       [2, 5],
       [3, 6]])
>>> x.T.flat[3]
5
>>> type(x.flat)
<class 'numpy.flatiter'>

An assignment example:

>>> x.flat = 3; x
array([[3, 3, 3],
       [3, 3, 3]])
>>> x.flat[[1,4]] = 1; x
array([[3, 1, 3],
       [3, 1, 3]])

submatrix(row_items, col_items=None)

Return the submatrix for the given rows and columns.

Parameters:
  • row_items -- indices of rows

  • col_items -- indices of columns
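Selecting a submatrix by row and column indices can be sketched with `numpy.ix_` on a plain array. This illustrates the indexing idea only, not Orange's implementation. Reusing the row indices for the columns when `col_items` is omitted is an assumption made here, not something the description above states.

```python
import numpy as np

dist = np.array([[0.0, 1.0, 2.0, 3.0],
                 [1.0, 0.0, 4.0, 5.0],
                 [2.0, 4.0, 0.0, 6.0],
                 [3.0, 5.0, 6.0, 0.0]])

rows = [0, 2]
cols = [1, 3]

# np.ix_ builds an open mesh, so the result has shape (len(rows), len(cols)).
sub = dist[np.ix_(rows, cols)]

# Assumed behaviour for col_items=None: use the same indices on both axes.
square = dist[np.ix_(rows, rows)]

print(sub)     # rows 0 and 2, columns 1 and 3
print(square)  # rows 0 and 2, columns 0 and 2
```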

classmethod from_file(filename, sheet=None)

Load a distance matrix from a file.

The file should preferably be encoded in ASCII or UTF-8. White space at the beginning and end of lines is ignored.

The first line of the file starts with the matrix dimension. It can be followed by a list of flags:

  • axis=<number>: the axis number

  • symmetric: the matrix is symmetric; when reading the element (i, j), its value is also assigned to (j, i)

  • asymmetric: the matrix is asymmetric

  • row_labels: the file contains row labels

  • col_labels: the file contains column labels

By default, matrices are symmetric, have axis 1, and have no labels. The flags labeled and labelled are obsolete aliases for row_labels.

If the file has column labels, they follow in the second line. Row labels appear at the beginning of each row. Labels are arbitrary strings that cannot contain newlines or tabs. Labels are stored as instances of Table with a single meta attribute named "label".

The remaining lines contain tab-separated numbers, preceded by the row label if row labels are present. Lines are padded with zeros if necessary. If the matrix is symmetric, the file contains the lower triangle; any data above the diagonal is ignored.

Parameters:
  • filename -- file name
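To make the file layout concrete, the sketch below parses a small labeled, symmetric matrix from a string. It is a simplified reader written for illustration, not Orange's parser: `parse_distance_text` is a hypothetical name, the sample content is invented, and the sketch assumes there are no blank lines between rows.

```python
import numpy as np

def parse_distance_text(text):
    """Hypothetical minimal reader for the format described above."""
    # Leading and trailing white space is ignored; blank lines are dropped.
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    header = lines[0].split()
    n, flags = int(header[0]), header[1:]

    symmetric = "asymmetric" not in flags  # symmetric is the default
    has_rows = any(f in flags for f in ("row_labels", "labeled", "labelled"))
    has_cols = "col_labels" in flags
    axis = 1                               # default axis
    for flag in flags:
        if flag.startswith("axis="):
            axis = int(flag[len("axis="):])

    body = lines[1:]
    col_labels = body.pop(0).split("\t") if has_cols else None
    row_labels = [] if has_rows else None
    matrix = np.zeros((n, n))              # short lines stay zero-padded

    for i, line in enumerate(body[:n]):
        fields = line.split("\t")
        if has_rows:
            row_labels.append(fields.pop(0))
        for j, value in enumerate(fields[:n]):
            if symmetric and j > i:
                break                      # data above the diagonal is ignored
            matrix[i, j] = float(value)
            if symmetric:
                matrix[j, i] = float(value)
    return matrix, row_labels, col_labels, axis

# Invented sample: 3 x 3 symmetric matrix, axis 0, row labels, lower triangle.
sample = "3 axis=0 row_labels\nann\nbob\t2.5\ncid\t1\t4\n"
matrix, row_labels, col_labels, axis = parse_distance_text(sample)
print(matrix)
print(row_labels, col_labels, axis)
```

The first data line holds only the label `ann`, so its row is filled with zeros, and every value read below the diagonal is mirrored above it.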

has_row_labels()

Returns True if row labels can be automatically determined from the data.

For this, row_items must be an instance of Orange.data.Table whose domain contains exactly one meta attribute, which must be a string variable. The domain may contain other variables, but no other meta attributes.

has_col_labels()

Returns True if column labels can be automatically determined from the data.

For this, col_items must be an instance of Orange.data.Table whose domain contains exactly one meta attribute, which must be a string variable. The domain may contain other variables, but no other meta attributes.