sklearn.metrics.fowlkes_mallows_score(labels_true, labels_pred, sparse=False)
[source]
Measure the similarity of two clusterings of a set of points.
The Fowlkes-Mallows index (FMI) is defined as the geometric mean between of the precision and recall:
FMI = TP / sqrt((TP + FP) * (TP + FN))
Where TP
is the number of True Positive (i.e. the number of pair of points that belongs in the same clusters in both labels_true
and labels_pred
), FP
is the number of False Positive (i.e. the number of pair of points that belongs in the same clusters in labels_true
and not in labels_pred
) and FN
is the number of False Negative (i.e the number of pair of points that belongs in the same clusters in labels_pred
and not in labels_True
).
The score ranges from 0 to 1. A high value indicates a good similarity between two clusters.
Read more in the User Guide.
Parameters: |
labels_true : int array, shape = ( A clustering of the data into disjoint subsets. labels_pred : array, shape = ( A clustering of the data into disjoint subsets. |
---|---|
Returns: |
score : float The resulting Fowlkes-Mallows score. |
[R208] | E. B. Fowkles and C. L. Mallows, 1983. “A method for comparing two hierarchical clusterings”. Journal of the American Statistical Association |
[R209] | Wikipedia entry for the Fowlkes-Mallows Index |
Perfect labelings are both homogeneous and complete, hence have score 1.0:
>>> from sklearn.metrics.cluster import fowlkes_mallows_score >>> fowlkes_mallows_score([0, 0, 1, 1], [0, 0, 1, 1]) 1.0 >>> fowlkes_mallows_score([0, 0, 1, 1], [1, 1, 0, 0]) 1.0
If classes members are completely split across different clusters, the assignment is totally random, hence the FMI is null:
>>> fowlkes_mallows_score([0, 0, 0, 0], [0, 1, 2, 3]) 0.0
© 2007–2016 The scikit-learn developers
Licensed under the 3-clause BSD License.
http://scikit-learn.org/stable/modules/generated/sklearn.metrics.fowlkes_mallows_score.html