1. Structured prediction methods are essentially a combination of classification and graphical modeling.
2. They combine the ability of graphical models to compactly model multivariate data with the ability of classification methods to perform prediction using large sets of input features.
3. The input x is divided into feature vectors {x_0, x_1, ..., x_T}. Each x_s contains various information about the word at position s, such as its identity, orthographic features such as prefixes and suffixes, membership in domain-specific lexicons, and information in semantic databases such as WordNet.
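As a concrete illustration of point 3, here is a minimal sketch of per-position feature extraction. The function name, the specific features, and the toy lexicon are illustrative assumptions, not part of the original text.

```python
# A minimal sketch of per-position feature extraction for a word sequence.
# Feature names, the lexicon, and the example sentence are illustrative.
def token_features(words, s, lexicon=frozenset()):
    """Return the feature vector x_s (binary indicators) for position s."""
    w = words[s]
    return {
        "identity=" + w.lower(): 1.0,               # word identity
        "prefix3=" + w[:3]: 1.0,                    # orthographic: prefix
        "suffix3=" + w[-3:]: 1.0,                   # orthographic: suffix
        "is_capitalized": float(w[:1].isupper()),   # orthographic: capitalization
        "in_lexicon": float(w.lower() in lexicon),  # domain-specific lexicon membership
    }

# Example: feature vectors x_0 ... x_3 for a four-word sentence.
words = "John lives in London".split()
x = [token_features(words, s, lexicon={"london"}) for s in range(len(words))]
```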
4. CRFs are essentially a way of combining the advantages of discriminative classification and graphical modeling: they unite the ability to compactly model multivariate outputs y with the ability to leverage a large number of input features x for prediction.
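The sketch below shows what point 4 means in the linear-chain case: the model scores a whole label sequence y conditioned on x and normalizes only over label sequences, never over the inputs. The brute-force partition function and the toy feature function are illustrative assumptions for demonstration; a real implementation would use forward-backward dynamic programming.

```python
import itertools
import math

def simple_features(y_prev, y_t, x_t):
    """Illustrative features: one transition indicator and one observation indicator."""
    return {f"trans:{y_prev}->{y_t}": 1.0, f"obs:{y_t}:{x_t}": 1.0}

def score(y, x, theta, features=simple_features):
    """Unnormalized log-score sum_t sum_k theta_k * f_k(y_{t-1}, y_t, x_t)."""
    total, prev = 0.0, None
    for y_t, x_t in zip(y, x):
        for name, value in features(prev, y_t, x_t).items():
            total += theta.get(name, 0.0) * value
        prev = y_t
    return total

def prob(y, x, theta, labels, features=simple_features):
    """p(y | x): normalize over all |labels|^T sequences (exponential; demo only)."""
    z = sum(math.exp(score(y2, x, theta, features))
            for y2 in itertools.product(labels, repeat=len(x)))
    return math.exp(score(y, x, theta, features)) / z
```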
5. The difference between generative models and CRFs is thus exactly analogous to the difference between the naive Bayes and logistic regression classifiers. Indeed, the multinomial logistic regression model can be seen as the simplest kind of CRF, in which there is only one output variable.
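To make the special case in point 5 concrete, here is a minimal sketch of multinomial logistic regression written as a conditional model over a single output variable, p(y | x) = exp(theta_y . x) / sum_y' exp(theta_y' . x). The function name and data layout are assumptions for illustration.

```python
import math

def predict_proba(x, theta):
    """Multinomial logistic regression as a one-output CRF.

    x: feature vector as {feature_name: value}
    theta: per-label weights as {label: {feature_name: weight}}
    Returns the conditional distribution p(y | x) over labels.
    """
    scores = {y: sum(w * x.get(f, 0.0) for f, w in wts.items())
              for y, wts in theta.items()}
    m = max(scores.values())                       # subtract max for numerical stability
    exp_scores = {y: math.exp(s - m) for y, s in scores.items()}
    z = sum(exp_scores.values())                   # partition function over labels only
    return {y: e / z for y, e in exp_scores.items()}
```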