CCJ PRML Study Note - Chapter 1 Summary: MLE (Maximum-Likelihood Estimate) and Bayesian Approach

Christopher M. Bishop, PRML, Chapter 1 Introduction

1. Notation and Logical Relations

  • Training data: input values $\mathbf{x} = (x_1, \ldots, x_N)^T$ and their corresponding target values $\mathbf{t} = (t_1, \ldots, t_N)^T$. For simplicity, written as $\mathbf{x}$ and $\mathbf{t}$.
  • Goal of making predictions: to be able to make predictions for the target variable $t$ given some new value of the input variable $x$.
  • Assumption of the predictive distribution over $t$: we shall assume that, given the value of $x$, the corresponding value of $t$ has a Gaussian distribution with a mean equal to the value $y(x, \mathbf{w})$ of the polynomial curve given by (1.1). Thus we have
    $$p(t \mid x, \mathbf{w}, \beta) = \mathcal{N}\left(t \mid y(x, \mathbf{w}), \beta^{-1}\right) \tag{1.60}$$
    where the precision $\beta$ corresponds to the inverse variance of the noise.
  • Likelihood function of the i.i.d. training data $\{\mathbf{x}, \mathbf{t}\}$:
    $$p(\mathbf{t} \mid \mathbf{x}, \mathbf{w}, \beta) = \prod_{n=1}^{N} \mathcal{N}\left(t_n \mid y(x_n, \mathbf{w}), \beta^{-1}\right) \tag{1.61}$$
  • MLE of the parameters $\mathbf{w}$ and $\beta$ (a NumPy sketch of these fits follows after this list):
    • $\mathbf{w}_{ML}$ for linear regression: maximizing the log likelihood with respect to $\mathbf{w}$ is equivalent to minimizing the sum-of-squares error function
      $$E(\mathbf{w}) = \frac{1}{2}\sum_{n=1}^{N}\left\{y(x_n, \mathbf{w}) - t_n\right\}^2$$
    • $\beta_{ML}$: maximizing the log likelihood with respect to $\beta$ gives
      $$\frac{1}{\beta_{ML}} = \frac{1}{N}\sum_{n=1}^{N}\left\{y(x_n, \mathbf{w}_{ML}) - t_n\right\}^2 \tag{1.63}$$
  • ML plug-in prediction for new values of $x$: substituting the maximum likelihood parameters into (1.60) gives
    $$p(t \mid x, \mathbf{w}_{ML}, \beta_{ML}) = \mathcal{N}\left(t \mid y(x, \mathbf{w}_{ML}), \beta_{ML}^{-1}\right) \tag{1.64}$$

  • Prior distribution over $\mathbf{w}$: for simplicity, let us consider a Gaussian distribution of the form
    $$p(\mathbf{w} \mid \alpha) = \mathcal{N}\left(\mathbf{w} \mid \mathbf{0}, \alpha^{-1}\mathbf{I}\right) = \left(\frac{\alpha}{2\pi}\right)^{(M+1)/2} \exp\left\{-\frac{\alpha}{2}\,\mathbf{w}^T\mathbf{w}\right\} \tag{1.65}$$
    where
    • the hyperparameter $\alpha$ is the precision of the distribution,
    • $M+1$ is the total number of elements in the vector $\mathbf{w}$ for an $M$th-order polynomial.
  • Posterior distribution for $\mathbf{w}$: using Bayes' theorem,
    $$p(\mathbf{w} \mid \mathbf{x}, \mathbf{t}, \alpha, \beta) \propto p(\mathbf{t} \mid \mathbf{x}, \mathbf{w}, \beta)\, p(\mathbf{w} \mid \alpha) \tag{1.66}$$
  • MAP: a step towards a more Bayesian approach; note that MAP is still a point estimate. We find that the maximum of the posterior is given by the minimum of
    $$\frac{\beta}{2}\sum_{n=1}^{N}\left\{y(x_n, \mathbf{w}) - t_n\right\}^2 + \frac{\alpha}{2}\,\mathbf{w}^T\mathbf{w} \tag{1.67}$$
    i.e. maximizing the posterior is equivalent to minimizing the regularized sum-of-squares error with regularization parameter $\lambda = \alpha/\beta$.

Although we have included a prior distribution $p(\mathbf{w} \mid \alpha)$, we are so far still making a point estimate of $\mathbf{w}$, and so this does not yet amount to a Bayesian treatment. In a fully Bayesian approach, we should consistently apply the sum and product rules of probability, which requires, as we shall see shortly, that we integrate over all values of $\mathbf{w}$. Such marginalizations lie at the heart of Bayesian methods for pattern recognition.
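
As a concrete illustration of the MLE and MAP estimates above, here is a minimal NumPy sketch, assuming the polynomial curve-fitting setup of this chapter with basis functions $\phi_i(x) = x^i$. The function names, the noise scale, and the particular values of $\alpha$ and $\beta$ are illustrative assumptions, not taken from the book.

```python
# Minimal sketch of w_ML, beta_ML (1.63) and w_MAP (minimizer of 1.67)
# for polynomial curve fitting; names and hyperparameter values are illustrative.
import numpy as np

def design(x, M):
    """Design matrix with elements x_n^i for i = 0..M (an M-th order polynomial)."""
    return np.vander(x, M + 1, increasing=True)   # shape (N, M+1)

def fit_mle(x, t, M):
    """w_ML minimizes the sum-of-squares error; beta_ML follows (1.63)."""
    Phi = design(x, M)
    w_ml, *_ = np.linalg.lstsq(Phi, t, rcond=None)
    beta_ml = 1.0 / np.mean((Phi @ w_ml - t) ** 2)
    return w_ml, beta_ml

def fit_map(x, t, M, alpha, beta):
    """w_MAP minimizes (beta/2)*sum-of-squares + (alpha/2)*w^T w, i.e. ridge regression."""
    Phi = design(x, M)
    A = beta * Phi.T @ Phi + alpha * np.eye(M + 1)
    return np.linalg.solve(A, beta * Phi.T @ t)

# Toy data: noisy samples of sin(2*pi*x), as in the chapter's running example.
rng = np.random.default_rng(0)
x = np.linspace(0, 1, 10)
t = np.sin(2 * np.pi * x) + rng.normal(scale=0.3, size=x.shape)

w_ml, beta_ml = fit_mle(x, t, M=9)
w_map = fit_map(x, t, M=9, alpha=5e-3, beta=11.1)
```

With $\lambda = \alpha/\beta$, the MAP solution coincides with the regularized least-squares (ridge) solution, which is exactly the point made by (1.67).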

  • Fully Bayesian approach:

    • Here we shall assume that the parameters $\alpha$ and $\beta$ are fixed and known in advance (in later chapters we shall discuss how such parameters can be inferred from data in a Bayesian setting).
    • A Bayesian treatment simply corresponds to a consistent application of the sum and product rules of probability, which allows the predictive distribution to be written in the form
      $$p(t \mid x, \mathbf{x}, \mathbf{t}) = \int p(t \mid x, \mathbf{w})\, p(\mathbf{w} \mid \mathbf{x}, \mathbf{t})\, d\mathbf{w} \tag{1.68}$$

  • Result of the integration in (1.68) (see the sketch below):
    • (1.66): this posterior distribution is a Gaussian and can be evaluated analytically.
    • The integration in (1.68) can also be performed analytically, with the result that the predictive distribution is given by a Gaussian of the form
      $$p(t \mid x, \mathbf{x}, \mathbf{t}) = \mathcal{N}\left(t \mid m(x), s^2(x)\right) \tag{1.69}$$
      where the mean and variance are given by
      $$m(x) = \beta\, \boldsymbol{\phi}(x)^T \mathbf{S} \sum_{n=1}^{N} \boldsymbol{\phi}(x_n)\, t_n \tag{1.70}$$
      $$s^2(x) = \beta^{-1} + \boldsymbol{\phi}(x)^T \mathbf{S}\, \boldsymbol{\phi}(x) \tag{1.71}$$
      Here the matrix $\mathbf{S}$ is given by
      $$\mathbf{S}^{-1} = \alpha \mathbf{I} + \beta \sum_{n=1}^{N} \boldsymbol{\phi}(x_n)\, \boldsymbol{\phi}(x_n)^T \tag{1.72}$$
      where $\mathbf{I}$ is the unit matrix, and we have defined the vector $\boldsymbol{\phi}(x)$ with elements $\phi_i(x) = x^i$ for $i = 0, \ldots, M$.
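
The analytic predictive distribution (1.69)-(1.72) can likewise be sketched in a few lines of NumPy. This is a minimal illustration assuming the same $\phi_i(x) = x^i$ basis and the illustrative $\alpha$, $\beta$ values from the previous sketch, not a reference implementation.

```python
# Minimal sketch of the Bayesian predictive distribution (1.69)-(1.72);
# basis, data, and hyperparameter values are illustrative assumptions.
import numpy as np

def phi(x, M):
    """Basis vector with elements phi_i(x) = x^i for i = 0..M."""
    return np.power(x, np.arange(M + 1))

def posterior_S(x_train, M, alpha, beta):
    """S from (1.72): S^{-1} = alpha*I + beta * sum_n phi(x_n) phi(x_n)^T."""
    S_inv = alpha * np.eye(M + 1)
    for xn in x_train:
        p = phi(xn, M)
        S_inv += beta * np.outer(p, p)
    return np.linalg.inv(S_inv)

def predictive(x_new, x_train, t_train, M, alpha, beta):
    """Mean m(x) from (1.70) and variance s^2(x) from (1.71)."""
    S = posterior_S(x_train, M, alpha, beta)
    p = phi(x_new, M)
    m = beta * p @ S @ sum(phi(xn, M) * tn for xn, tn in zip(x_train, t_train))
    s2 = 1.0 / beta + p @ S @ p
    return m, s2

rng = np.random.default_rng(0)
x_train = np.linspace(0, 1, 10)
t_train = np.sin(2 * np.pi * x_train) + rng.normal(scale=0.3, size=x_train.shape)
m, s2 = predictive(0.5, x_train, t_train, M=9, alpha=5e-3, beta=11.1)
```

Evaluating $m(x)$ and $s^2(x)$ over a grid of $x$ values gives the predictive mean curve and its uncertainty band, as plotted for this example in Section 1.2.6 of PRML.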

2. Flowchart

The relationship between all of the equations and notions above:
