Reference: On Discriminative vs. Generative classifiers: A comparison of logistic regression and naive Bayes (Ng & Jordan, NIPS 2001)
Generative model: model p(x,y) = p(x|y) * p(y), then predict via Bayes' rule: p(y|x) = p(x,y) / p(x). Representative model: Naive Bayes.
Discriminative model: model p(y|x) directly. Representative model: Logistic Regression.
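To make the contrast concrete, here is a minimal sketch (my own illustration, not code from the post) that fits both routes on the same synthetic data with scikit-learn; the dataset and hyperparameters are arbitrary choices:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# Generative: fit p(x|y) (one Gaussian per class) and p(y) (class frequencies),
# then predict through Bayes' rule p(y|x) ∝ p(x|y) p(y).
gen = GaussianNB().fit(X_tr, y_tr)

# Discriminative: fit p(y|x) directly by maximizing the conditional likelihood.
disc = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)

print("NB accuracy:", gen.score(X_te, y_te))
print("LR accuracy:", disc.score(X_te, y_te))
```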
Conclusions from the paper:
The discriminative model has a lower asymptotic error [the generative model does indeed have a higher asymptotic error - as the number of training examples becomes large - than the discriminative model];
the generative model approaches its asymptotic error faster, provided the data satisfy its assumptions, i.e. conditional independence and a particular class-conditional distribution such as the Gaussian [but the generative model may also approach its asymptotic error much faster than the discriminative model - possibly with a number of training examples that is only logarithmic, rather than linear, in the number of parameters].
In practice, data rarely satisfy these assumptions exactly, so the discriminative model usually comes out ahead.
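A hedged way to see the asymptotic story (my own sketch, not an experiment from the paper): train both models on growing subsets of one synthetic dataset and watch test error against training-set size. The exact curves depend on the data, but the typical pattern is that NB flattens out early while LR keeps improving and ends lower:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

X, y = make_classification(n_samples=20000, n_features=50,
                           n_informative=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=0)

# Learning curves: test error of both models as the training set grows.
for n in [50, 100, 500, 1000, 5000, 10000]:
    nb = GaussianNB().fit(X_tr[:n], y_tr[:n])
    lr = LogisticRegression(max_iter=2000).fit(X_tr[:n], y_tr[:n])
    print(f"n={n:5d}  NB err={1 - nb.score(X_te, y_te):.3f}  "
          f"LR err={1 - lr.score(X_te, y_te):.3f}")
```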
Views from other sources:
- Easy to fit?
G: easy; fitting is simple counting and averaging (NB, LDA) (see the first sketch after this list)
D: much slower; requires solving a convex optimization problem (LogR)
- Fit classes separately?
G: yes; adding a new class does not require retraining the existing ones (also shown in the first sketch below)
D: no; all parameters interact, so the whole model must be retrained
- Handle missing features easily?
G: yes; simply marginalize them out (NB) (also shown in the first sketch below)
D: no principled solution; the model assumes x is always given
- Can handle feature preprocessing?
G: hard to define a generative model on preprocessed inputs
D: yes; the input can be preprocessed freely, e.g. replacing x with kernel(x)
- Can handle unlabeled training data (semi-supervised learning)?
G: easy (see the EM sketch after this list)
D: much harder
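A hand-rolled Bernoulli Naive Bayes sketch (hypothetical code, not from the post) that illustrates three of the generative-side points at once: fitting is one counting pass, classes are fit independently, and missing features are handled by marginalizing them out:

```python
import numpy as np

def fit_class(X_c, alpha=1.0):
    """Per-class fit is just averaging: Laplace-smoothed p(x_j = 1 | y = c)."""
    return (X_c.sum(axis=0) + alpha) / (len(X_c) + 2 * alpha)

def log_joint(x, theta, log_prior):
    """log p(x, y=c). Entries of x equal to -1 are treated as missing and
    skipped, which marginalizes them out (the sum over x_j of p(x_j|y) is 1)."""
    obs = x != -1
    xo, to = x[obs], theta[obs]
    return log_prior + np.sum(xo * np.log(to) + (1 - xo) * np.log(1 - to))

def predict(x, params, counts):
    total = sum(counts.values())
    scores = {c: log_joint(x, params[c], np.log(counts[c] / total))
              for c in params}
    return max(scores, key=scores.get)

rng = np.random.default_rng(0)
X0 = (rng.random((100, 5)) < 0.2).astype(int)  # class 0: features rarely on
X1 = (rng.random((100, 5)) < 0.8).astype(int)  # class 1: features mostly on

# "Training" is one counting pass per class -- no optimization loop.
params = {0: fit_class(X0), 1: fit_class(X1)}
counts = {0: len(X0), 1: len(X1)}

x = np.array([1, 1, -1, 0, 1])  # third feature missing (-1)
print("predicted class:", predict(x, params, counts))

# Adding a new class later touches only its own parameters; the existing
# classes' conditionals stay as they are (only the prior renormalizes).
X2 = (rng.random((100, 5)) < 0.5).astype(int)
params[2], counts[2] = fit_class(X2), len(X2)
print("with class 2:", predict(x, params, counts))
```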
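For the unlabeled-data point, a rough EM-style sketch (an assumption of mine, not the post's method), using scikit-learn's GaussianNB, whose fit accepts a sample_weight argument: the E-step scores unlabeled points under the current model, and the M-step refits by weighted counting:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.naive_bayes import GaussianNB

X, y = make_classification(n_samples=2000, n_features=10, random_state=0)
labeled = np.zeros(len(y), dtype=bool)
labeled[:100] = True                      # keep only 100 labels
X_l, y_l = X[labeled], y[labeled]
X_u = X[~labeled]

model = GaussianNB().fit(X_l, y_l)        # initialize from labeled data only
classes = model.classes_

for _ in range(10):                       # a few EM iterations
    # E-step: responsibilities p(y|x) for the unlabeled points.
    R = model.predict_proba(X_u)
    # M-step: refit on labeled points (weight 1) plus each unlabeled point
    # replicated once per class, weighted by its responsibility.
    X_all = np.vstack([X_l] + [X_u] * len(classes))
    y_all = np.concatenate([y_l] + [np.full(len(X_u), c) for c in classes])
    w_all = np.concatenate([np.ones(len(X_l))] +
                           [R[:, k] for k in range(len(classes))])
    model = GaussianNB().fit(X_all, y_all, sample_weight=w_all)

print("accuracy on all data:", model.score(X, y))
```

Nothing comparable falls out of logistic regression: it models only p(y|x) and says nothing about the marginal p(x), so unlabeled points carry no likelihood for it to exploit.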
Original post: https://www.cnblogs.com/yaoyaohust/p/10007481.html