class sklearn.mixture.BayesianGaussianMixture(n_components=1, covariance_type='full', tol=0.001, reg_covar=1e-06, max_iter=100, n_init=1, init_params='kmeans', weight_concentration_prior_type='dirichlet_process', weight_concentration_prior=None, mean_precision_prior=None, mean_prior=None, degrees_of_freedom_prior=None, covariance_prior=None, random_state=None, warm_start=False, verbose=0, verbose_interval=10)
Variational Bayesian estimation of a Gaussian mixture.
This class allows inference of an approximate posterior distribution over the parameters of a Gaussian mixture distribution. The effective number of components can be inferred from the data.
This class implements two types of prior for the weights distribution: a finite mixture model with a Dirichlet distribution prior and an infinite mixture model with a Dirichlet Process prior. In practice the Dirichlet Process inference algorithm is approximated and uses a truncated distribution with a fixed maximum number of components (called the stick-breaking representation). The number of components actually used almost always depends on the data.
New in version 0.18: BayesianGaussianMixture.
Read more in the User Guide.
See also
GaussianMixture
References

[R231] Bishop, Christopher M. (2006). “Pattern recognition and machine learning”. Vol. 4 No. 4. New York: Springer.
[R232] Hagai Attias. (2000). “A Variational Bayesian Framework for Graphical Models”. In Advances in Neural Information Processing Systems 12.
[R233] Blei, David M. and Michael I. Jordan. (2006). “Variational inference for Dirichlet process mixtures”. Bayesian Analysis 1.1. http://www.cs.princeton.edu/courses/archive/fall11/cos597C/reading/BleiJordan2005.pdf
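A minimal usage sketch, not part of the original page (the synthetic data and variable names are illustrative): over-specify n_components on two-cluster data and inspect weights_ to see how many components the Dirichlet-process prior actually uses.

```python
# Hypothetical sketch: fit a Dirichlet-process mixture with more
# components than the data needs and inspect the inferred weights.
import numpy as np
from sklearn.mixture import BayesianGaussianMixture

rng = np.random.RandomState(0)
# Synthetic data: two well-separated 2-D Gaussian blobs.
X = np.vstack([rng.randn(200, 2), rng.randn(200, 2) + 6])

bgm = BayesianGaussianMixture(
    n_components=10,  # deliberately more than the two true clusters
    weight_concentration_prior_type='dirichlet_process',
    random_state=0,
).fit(X)

# Most components receive near-zero weight; roughly two carry almost
# all of the mass, matching the two true clusters.
print(np.round(bgm.weights_, 3))
```

Components whose weights_ end up near zero are effectively pruned, which is the sense in which the effective number of components is inferred from the data.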
Method | Description
---|---
fit(X[, y]) | Estimate model parameters with the EM algorithm.
get_params([deep]) | Get parameters for this estimator.
predict(X[, y]) | Predict the labels for the data samples in X using the trained model.
predict_proba(X) | Predict the posterior probability of each component given the data.
sample([n_samples]) | Generate random samples from the fitted Gaussian distribution.
score(X[, y]) | Compute the per-sample average log-likelihood of the given data X.
score_samples(X) | Compute the weighted log probabilities for each sample.
set_params(**params) | Set the parameters of this estimator.
__init__(n_components=1, covariance_type='full', tol=0.001, reg_covar=1e-06, max_iter=100, n_init=1, init_params='kmeans', weight_concentration_prior_type='dirichlet_process', weight_concentration_prior=None, mean_precision_prior=None, mean_prior=None, degrees_of_freedom_prior=None, covariance_prior=None, random_state=None, warm_start=False, verbose=0, verbose_interval=10)
fit(X, y=None)
Estimate model parameters with the EM algorithm.
The method fits the model n_init times and sets the parameters with which the model has the largest likelihood or lower bound. Within each trial, the method iterates between E-step and M-step for at most max_iter times until the change of likelihood or lower bound is less than tol; otherwise, a ConvergenceWarning is raised.
Parameters:
    X : array-like, shape (n_samples, n_features)
        List of n_features-dimensional data points. Each row corresponds to a single data point.
Returns:
    self
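A hedged illustration of the n_init/max_iter/tol behaviour described above (the data and names are my own; converged_ and n_iter_ are public attributes set by fit in scikit-learn):

```python
# Sketch: fit() runs n_init restarts and keeps the best run; the
# converged_ and n_iter_ attributes report the outcome of that run.
import numpy as np
from sklearn.mixture import BayesianGaussianMixture

rng = np.random.RandomState(1)
X = np.vstack([rng.randn(150, 2), rng.randn(150, 2) + 5])

bgm = BayesianGaussianMixture(n_components=3, n_init=3, max_iter=200,
                              tol=1e-3, random_state=1).fit(X)
print(bgm.converged_)  # True if the lower bound changed by less than tol
print(bgm.n_iter_)     # iterations used by the best run
```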
get_params(deep=True)
Get parameters for this estimator.
Parameters:
    deep : boolean, optional
        If True, will return the parameters for this estimator and contained subobjects that are estimators.
Returns:
    params : mapping of string to any
        Parameter names mapped to their values.
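A small sketch of get_params in use (the estimator settings are arbitrary):

```python
# Sketch: get_params returns the constructor arguments as a dict,
# which is how meta-estimators such as GridSearchCV introspect models.
from sklearn.mixture import BayesianGaussianMixture

bgm = BayesianGaussianMixture(n_components=5, tol=1e-4)
params = bgm.get_params()
print(params['n_components'], params['tol'])  # 5 0.0001
```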
predict(X, y=None)
Predict the labels for the data samples in X using the trained model.
Parameters:
    X : array-like, shape (n_samples, n_features)
        List of n_features-dimensional data points. Each row corresponds to a single data point.
Returns:
    labels : array, shape (n_samples,)
        Component labels.
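A hedged sketch of predict on synthetic data (data and names are illustrative):

```python
# Sketch: predict assigns each sample to the component with the
# highest posterior responsibility, i.e. a hard clustering of X.
import numpy as np
from sklearn.mixture import BayesianGaussianMixture

rng = np.random.RandomState(2)
X = np.vstack([rng.randn(100, 2), rng.randn(100, 2) + 6])

bgm = BayesianGaussianMixture(n_components=4, random_state=2).fit(X)
labels = bgm.predict(X)
print(labels.shape)       # (200,)
print(np.unique(labels))  # the component indices actually assigned
```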
predict_proba(X)
Predict the posterior probability of each component given the data.
Parameters:
    X : array-like, shape (n_samples, n_features)
        List of n_features-dimensional data points. Each row corresponds to a single data point.
Returns:
    resp : array, shape (n_samples, n_components)
        Probability of each sample for each Gaussian (state) in the model.
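A sketch showing that the returned responsibilities are normalized per sample (synthetic data, illustrative names):

```python
# Sketch: predict_proba returns the responsibility matrix; each row
# is a distribution over components and sums to 1.
import numpy as np
from sklearn.mixture import BayesianGaussianMixture

rng = np.random.RandomState(3)
X = np.vstack([rng.randn(100, 2), rng.randn(100, 2) + 6])

bgm = BayesianGaussianMixture(n_components=3, random_state=3).fit(X)
resp = bgm.predict_proba(X)
print(resp.shape)                          # (200, 3)
print(np.allclose(resp.sum(axis=1), 1.0))  # True
```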
sample(n_samples=1)
Generate random samples from the fitted Gaussian distribution.
Parameters:
    n_samples : int, optional
        Number of samples to generate. Defaults to 1.
Returns:
    X : array, shape (n_samples, n_features)
        Randomly generated sample.
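A sketch of sample. Note an assumption here: in scikit-learn releases after the one this page documents, sample() returns a pair of the generated samples and their component labels, so the tuple unpacking below is version-dependent.

```python
# Sketch: draw new points from the fitted mixture. In current
# scikit-learn versions sample() returns (X, y), where y holds the
# component each generated sample was drawn from.
import numpy as np
from sklearn.mixture import BayesianGaussianMixture

rng = np.random.RandomState(4)
X = np.vstack([rng.randn(100, 2), rng.randn(100, 2) + 6])

bgm = BayesianGaussianMixture(n_components=2, random_state=4).fit(X)
X_new, y_new = bgm.sample(n_samples=5)
print(X_new.shape)  # (5, 2)
print(y_new)        # component index per generated sample
```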
score(X, y=None)
Compute the per-sample average log-likelihood of the given data X.
Parameters:
    X : array-like, shape (n_samples, n_features)
        List of n_features-dimensional data points. Each row corresponds to a single data point.
Returns:
    log_likelihood : float
        Log likelihood of the Gaussian mixture given X.
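A sketch relating score to score_samples (in scikit-learn, score is the mean of the per-sample values returned by score_samples):

```python
# Sketch: score averages the per-sample log-likelihoods, so it
# equals score_samples(X).mean() on the same data.
import numpy as np
from sklearn.mixture import BayesianGaussianMixture

rng = np.random.RandomState(5)
X = np.vstack([rng.randn(100, 2), rng.randn(100, 2) + 6])

bgm = BayesianGaussianMixture(n_components=2, random_state=5).fit(X)
print(np.isclose(bgm.score(X), bgm.score_samples(X).mean()))  # True
```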
score_samples(X)
Compute the weighted log probabilities for each sample.
Parameters:
    X : array-like, shape (n_samples, n_features)
        List of n_features-dimensional data points. Each row corresponds to a single data point.
Returns:
    log_prob : array, shape (n_samples,)
        Log probabilities of each data point in X.
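A hedged sketch of one common use of these per-sample log-densities (the data and the outlier threshold idea are my own, not from the original page):

```python
# Sketch: per-sample log-densities can serve as a simple outlier
# score under the fitted mixture; the injected far-away point gets
# the lowest value.
import numpy as np
from sklearn.mixture import BayesianGaussianMixture

rng = np.random.RandomState(6)
X = np.vstack([rng.randn(200, 2), [[10.0, 10.0]]])  # one outlier at the end

bgm = BayesianGaussianMixture(n_components=1, random_state=6).fit(X)
log_prob = bgm.score_samples(X)
print(int(np.argmin(log_prob)))  # 200 -> index of the injected outlier
```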
set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Returns:
    self
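A sketch of the nested-parameter syntax (the pipeline step names are arbitrary):

```python
# Sketch: inside a Pipeline, the step name and the parameter name
# are joined by a double underscore.
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.mixture import BayesianGaussianMixture

pipe = Pipeline([('scale', StandardScaler()),
                 ('bgm', BayesianGaussianMixture())])
pipe.set_params(bgm__n_components=8)           # updates the nested estimator
print(pipe.get_params()['bgm__n_components'])  # 8
```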
© 2007–2016 The scikit-learn developers
Licensed under the 3-clause BSD License.
http://scikit-learn.org/stable/modules/generated/sklearn.mixture.BayesianGaussianMixture.html