metropolis是一种采样方法,一般用于获取某些拥有某些比较复杂的概率分布的样本。
1.采样最基本的是随机数的生成,一般是生成具有均匀分布的随机数,比如C++里面的rand函数,可以直接采用。
2.采样复杂一点是转化发,用于生成普通常见的分布,如高斯分布,它的一般是通过对均匀分布或者现有分布的转化形成,如高斯分布就可以通过如下方法生成:
令:U,V为均匀分布的随即变量;
那么:Gauss=sqrt(-2lnU)cos(2*pi*V)。
(详细参照Box-Muller)
像这种高斯分布matelab可以直接用randN生成。
3. 最后一种方法是类MCMC方法,其实MCMC是一种架构,这里我们说一种最简单的Metropolis方法,它的核心思想是构建马尔克夫链,该链上每一个节点都成为一个样本,利用马尔克夫链的转移和接受概率,以一定的方式对样本进行生成,这个方法简单有效,具体可以参考《LDA数学八卦》的,里面都已经进行了详细的介绍。本文给一个小demo,是利用C++写的。例外代码的matlab版本也有,但不是我写的,一并贴出来,matlab的画图功能可以视觉化的验证采样有效性,(好像matlab有个小bug),
待生成样本的概率分布为:pow(t,-3.5)*exp(-1/(2*t)) (csdn有没有想latex那种编辑方式啊??),这里在metropolis中假设马尔克夫链是symmetric,就是p(x|y)=p(y|x),整个代码如下:
#include <iostream> #include <time.h> #include <stdlib.h> #include <cmath> using namespace std; #define debug(A) std::cout<<#A<<": "<<A<<std::endl; class MCMCSimulation{ public: int N; int nn; float* sample; MCMCSimulation(){ this->N=1000; this->nn=0.1*N; this->sample=new float[N]; this->sample[0]=0.3; } void run(){ for (int i=1;i<this->N;i++){ float p_sample=this->PDF_proposal(3); float alpha=PDF(p_sample)/PDF(this->sample[i-1]); if(PDF_proposal(1) < this->min(alpha,1)){ this->sample[i]=p_sample; }else{ this->sample[i]=this->sample[i-1]; } //debug(p_sample); //debug(PDF_proposal(1)); //debug(sample[i]); cout<<sample[i]<<endl; } } private: //the probability desity function float PDF(float t){ return pow(t,-3.5)*exp(-1/(2*t)); } //generate a rand number from 0-MAX, with a specified precision float PDF_proposal(int MAX){ //srand return (float)(MAX)*(float)(rand())/(float)(RAND_MAX); } float min(float x, float y){ if(x<y){ return x; }else{ return y; } } }; int main(){ MCMCSimulation* sim =new MCMCSimulation(); sim->run(); delete sim; return 1; }
matlab版本
%% Simple Metropolis algorithm example %{ --------------------------------------------------------------------------- *Created by: Date: Comment: Felipe Uribe-Castillo July 2011 Final project algorithm *Mail: [email protected] *University: National university of Colombia at Manizales. Civil Engineering Dept. --------------------------------------------------------------------------- The Metropolis algorithm: First MCMC to obtain samples from some complex probability distribution and to integrate very complex functions by random sampling. It considers only symmetric proposals (proposal pdf). Original contribution: -"The Monte Carlo method" Metropolis & Ulam (1949). -"Equations of State Calculations by Fast Computing Machines" Metropolis, Rosenbluth, Rosenbluth, Teller & Teller (1953). --------------------------------------------------------------------------- Based on: 1."Markov chain Monte Carlo and Gibbs sampling" B.Walsh ----- Lecture notes for EEB 581, version april 2004. http://web.mit.edu/~wingated/www/introductions/mcmc-gibbs-intro.pdf %} clear; close all; home; format long g; %% Initial data % Distributions nu = 5; % Target PDF parameter p = @(t) (t.^(-nu/2-1)).*(exp(-1./(2*t))); % Target "PDF", Ref. 1 - Ex. 2 proposal_PDF = @(mu) unifrnd(0,3); % Proposal PDF % Parameters N = 1000; % Number of samples (iterations) nn = 0.1*(N); % Number of samples for examine the AutoCorr theta = zeros(1,N); % Samples drawn form the target "PDF" theta(1) = 0.3; % Initial state of the chain %% Procedure for i = 1:N theta_ast = proposal_PDF([]); % Sampling from the proposal PDF alpha = p(theta_ast)/p(theta(i)); % Ratio of the density at theta_ast and theta points if rand <= min(alpha,1) % Accept the sample with prob = min(alpha,1) theta(i+1) = theta_ast; else % Reject the sample with prob = 1 - min(alpha,1) theta(i+1) = theta(i); end end % Autocorrelation (AC) pp = theta(1:nn); pp2 = theta(end-nn:end); % First ans Last nn samples [r lags] = xcorr(pp-mean(pp), 'coeff'); [r2 lags2] = xcorr(pp2-mean(pp2), 'coeff'); %% Plots % Autocorrelation figure; subplot(2,1,1); stem(lags,r); title('Autocorrelation', 'FontSize', 14); ylabel('AC (first samples)', 'FontSize', 12); subplot(2,1,2); stem(lags2,r2); ylabel('AC (last samples)', 'FontSize', 12); % Samples and distributions xx = eps:0.01:10; % x-axis (Graphs) figure; % Histogram and target dist subplot(2,1,1); [n1 x1] = hist(theta, ceil(sqrt(N))); bar(x1,n1/(N*(x1(2)-x1(1)))); colormap summer; hold on; % Normalized histogram plot(xx, p(xx)/trapz(xx,p(xx)), 'r-', 'LineWidth', 3); % Normalized function grid on; xlim([0 max(theta)]); title('Distribution of samples', 'FontSize', 14); ylabel('Probability density function', 'FontSize', 12); % Samples subplot(2,1,2); plot(theta, 1:N+1, 'b-'); xlim([0 max(theta)]); ylim([0 N]); grid on; xlabel('Location', 'FontSize', 12); ylabel('Iterations, N', 'FontSize', 12); %%End
下面是样本的分布图
时间: 2024-10-28 12:11:08