特征选择--联合方法的特征提取

"""
=================================================
Concatenating multiple feature extraction methods
=================================================

In many real-world examples, there are many ways to extract features from a
dataset. Often it is beneficial to combine several methods to obtain good
performance. This example shows how to use ``FeatureUnion`` to combine
features obtained by PCA and univariate selection.

Combining features using this transformer has the benefit that it allows
cross validation and grid searches over the whole process.

The combination used in this example is not particularly helpful on this
dataset and is only used to illustrate the usage of FeatureUnion.
"""

# Author: Andreas Mueller <[email protected]>
#
# License: BSD 3 clause

from sklearn.pipeline import Pipeline, FeatureUnion
from sklearn.grid_search import GridSearchCV
from sklearn.svm import SVC
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.feature_selection import SelectKBest

iris = load_iris()

X, y = iris.data, iris.target

# This dataset is way to high-dimensional. Better do PCA:
pca = PCA(n_components=2)

# Maybe some original features where good, too?
selection = SelectKBest(k=1)

# Build estimator from PCA and Univariate selection:

combined_features = FeatureUnion([("pca", pca), ("univ_select", selection)])

# Use combined features to transform dataset:
X_features = combined_features.fit(X, y).transform(X)

# Classify:
svm = SVC(kernel="linear")
svm.fit(X_features, y)

# Do grid search over k, n_components and C:

pipeline = Pipeline([("features", combined_features), ("svm", svm)])

param_grid = dict(features__pca__n_components=[1, 2, 3],
                  features__univ_select__k=[1, 2],
                  svm__C=[0.1, 1, 10])

grid_search = GridSearchCV(pipeline, param_grid=param_grid, verbose=10)
grid_search.fit(X, y)
print(grid_search.best_estimator_)

时间： 2024-11-05 21:43:40

特征选择--联合方法的特征提取

特征选择--联合方法的特征提取的相关文章

结合Scikit-learn介绍几种常用的特征选择方法

干货：结合Scikit-learn介绍几种常用的特征选择方法

机器学习之（四）特征工程以及特征选择的工程方法

机器学习之特征选择方法

CountVectorizer方法对中文进行特征提取

特征选择常用算法综述

【转】[特征选择] An Introduction to Feature Selection 翻译

如何进行特征选择？

特征选择和特征理解（转）