Machine Learning from Scratch: Linear Regression in Practice

Some time ago I took Andrew Ng's Machine Learning course on Coursera. I found it quite good, and it got me started in the field.

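This time I plan to use the course's programming assignments as the thread for summarizing the basics of machine learning. My knowledge is limited, so if you spot any mistakes, please point them out.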

The original problem statement:

you will implement linear regression with one
variable to predict profits for a food truck. Suppose you are the CEO of a
restaurant franchise and are considering different cities for opening a new
outlet. The chain already has trucks in various cities and you have data for
profits and populations from the cities.

Put simply, the task is to predict the profit a food truck can make from the population of the city it operates in.

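The data set is a plain text file, ex1data1.txt. Each line is one sample: the first column is the population and the second column is the profit.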

First, visualize the data.

%% ======================= Part 2: Plotting =======================
fprintf('Plotting Data ...\n')
data = load('ex1data1.txt');
X = data(:, 1); y = data(:, 2);
m = length(y); % number of training examples

% Plot Data
% Note: You have to complete the code in plotData.m
plotData(X, y);

fprintf('Program paused. Press enter to continue.\n');
pause;

function plotData(x, y)
%PLOTDATA Plots the data points x and y into a new figure
%   PLOTDATA(x,y) plots the data points and gives the figure axes labels of
%   population and profit.

% ====================== YOUR CODE HERE ======================
% Instructions: Plot the training data into a figure using the
%               "figure" and "plot" commands. Set the axes labels using
%               the "xlabel" and "ylabel" commands. Assume the
%               population and revenue data have been passed in
%               as the x and y arguments of this function.
%
% Hint: You can use the 'rx' option with plot to have the markers
%       appear as red crosses. Furthermore, you can make the
%       markers larger by using plot(..., 'rx', 'MarkerSize', 10);

figure; % open a new figure window

plot(x, y, 'rx', 'MarkerSize', 10); % Plot the data
ylabel('Profit in $10,000s'); % Set the y label
xlabel('Population of City in 10,000s'); % Set the x label

% ============================================================

end

Computing the cost function

function J = computeCost(X, y, theta)
%COMPUTECOST Compute cost for linear regression
%   J = COMPUTECOST(X, y, theta) computes the cost of using theta as the
%   parameter for linear regression to fit the data points in X and y

% Initialize some useful values
m = length(y); % number of training examples

% You need to return the following variables correctly
% ====================== YOUR CODE HERE ======================
% Instructions: Compute the cost of a particular choice of theta
%               You should set J to the cost.
H = X*theta;
diff = H - y;
%J = sum(diff.^2)/(2*m);
J = sum(diff.*diff)/(2*m);

% =========================================================================

end
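
For reference, the quantity computed by computeCost can be written out as follows. This is the squared-error cost used in the course; the symbols h_theta (the hypothesis) and x^(i), y^(i) (the i-th sample) are the course's notation rather than names from the code above, and theta_0, theta_1 correspond to theta(1), theta(2) in Octave's 1-based indexing.

```latex
h_\theta(x) = \theta_0 + \theta_1 x

J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2
          = \frac{1}{2m} (X\theta - y)^{T} (X\theta - y)
```

The last form is what the code evaluates: H = X*theta gives all predictions at once, and summing the squared entries of H - y is the same as the inner product of that vector with itself.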

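To make the code above easier to follow, note the shapes of the variables involved. X is an m-by-2 matrix (the gradient descent code below indexes X(:,1) and X(:,2); in the course's ex1.m script a column of ones is prepended to the population column before these functions are called), theta is a 2-by-1 vector, H = X*theta is the m-by-1 vector of predictions, and J is a scalar.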
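
A minimal sketch of these shapes, using a made-up three-sample data set (the numbers are illustrative and are not taken from ex1data1.txt). It assumes that a column of ones is prepended to the population column, so that theta(1) acts as the intercept; the variable name err is my own.

```octave
% Toy data: hypothetical values, not from ex1data1.txt
data = [1 2; 2 4; 3 6];          % column 1: population, column 2: profit
m = size(data, 1);               % number of samples (3)
X = [ones(m, 1), data(:, 1)];    % m-by-2 design matrix: [1, population]
y = data(:, 2);                  % m-by-1 vector of profits
theta = [0; 1];                  % 2-by-1 parameters: intercept, slope

H = X * theta;                   % m-by-1 predictions: [1; 2; 3]
err = H - y;                     % m-by-1 errors: [-1; -2; -3]
J = sum(err .^ 2) / (2 * m);     % scalar cost: (1 + 4 + 9) / 6

disp(size(X));                   % 3 2
disp(size(theta));               % 2 1
disp(size(H));                   % 3 1
fprintf('J = %.4f\n', J);        % J = 2.3333
```

Because X is m-by-2 and theta is 2-by-1, the product X*theta is defined and produces one prediction per sample, which is why no loop over samples is needed.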

Computing the parameters theta with gradient descent

function [theta, J_history] = gradientDescent(X, y, theta, alpha, num_iters)
%GRADIENTDESCENT Performs gradient descent to learn theta
%   theta = GRADIENTDESCENT(X, y, theta, alpha, num_iters) updates theta by
%   taking num_iters gradient steps with learning rate alpha

% Initialize some useful values
m = length(y); % number of training examples
J_history = zeros(num_iters, 1);

for iter = 1:num_iters

    % ====================== YOUR CODE HERE ======================
    % Instructions: Perform a single gradient step on the parameter vector
    %               theta.
    %
    % Hint: While debugging, it can be useful to print out the values
    %       of the cost function (computeCost) and gradient here.
    %

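    H = X*theta-y; % prediction errors, computed once so both updates use the old theta
    theta(1) = theta(1) - sum(H.* X(:,1))*alpha/m; % writing it element by element feels clumsy
    theta(2) = theta(2) - sum(H.* X(:,2))*alpha/m;
    %theta = theta - alpha * (X' * (X * theta - y)) / m;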

    % ============================================================

    % Save the cost J in every iteration
    J_history(iter) = computeCost(X, y, theta);

end

end

The part that is harder to understand is the vectorized form, theta = theta - alpha * (X' * (X * theta - y)) / m;

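First, look at how theta is actually computed: each component theta(j) is reduced by alpha/m times the sum, over all samples, of the prediction error multiplied by feature j of that sample.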
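
Written out, the per-component rule and its vectorized form look like this. The notation follows the course (theta_0 and theta_1 correspond to theta(1) and theta(2) in the code, and x_0 is the constant feature 1); it is a sketch of the standard batch gradient descent update, not something taken from the original figures of this post.

```latex
\theta_j := \theta_j - \frac{\alpha}{m} \sum_{i=1}^{m}
            \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)}
            \qquad (j = 0, 1,\ \text{updated simultaneously})

\sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)}
            = \left[ X^{T} (X\theta - y) \right]_j

\theta := \theta - \frac{\alpha}{m} X^{T} (X\theta - y)
```

The middle line is the key step: row j of X^T holds feature j of every sample, so multiplying it by the error vector produces exactly the sum that the element-wise code computes with sum(H .* X(:,j)).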

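Then look at the shapes of the variables: X is m-by-2 and X*theta - y is m-by-1, so X' * (X*theta - y) is 2-by-1, the same shape as theta.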
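
The equivalence of the two forms can be checked on a small example. This is a sketch on made-up data (three hypothetical samples, with the column of ones already in X); the names err, theta_loop and theta_vec are mine.

```octave
% Toy data: hypothetical values, with the column of ones already in X
X = [1 1; 1 2; 1 3];
y = [2; 4; 6];
m = length(y);
alpha = 0.1;
theta = [0; 0];

err = X * theta - y;                 % m-by-1 errors: [-2; -4; -6]

% Element-wise update, one component at a time (both use the old theta)
theta_loop = theta;
theta_loop(1) = theta(1) - alpha / m * sum(err .* X(:, 1));
theta_loop(2) = theta(2) - alpha / m * sum(err .* X(:, 2));

% Vectorized update: X' is 2-by-m and err is m-by-1, so X' * err is 2-by-1
theta_vec = theta - alpha * (X' * err) / m;

disp(size(X' * err));                            % 2 1
disp(max(abs(theta_loop - theta_vec)) < 1e-12);  % 1 (the two updates agree)
```

The vectorized form also generalizes without change to more than one feature, whereas the element-wise version needs one extra line per column of X.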

Once theta has been computed, the fitted line can be plotted.
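
A minimal sketch of that last step, again on made-up data. The value of theta here is a hypothetical stand-in for the output of gradientDescent, and the prediction at the end assumes, as in the axis labels above, that population is measured in units of 10,000.

```octave
% Toy data and a hypothetical theta standing in for the gradientDescent result
X = [1 1; 1 2; 1 3];                       % column of ones, then population
y = [2; 4; 6];                             % profit
theta = [0; 2];                            % intercept 0, slope 2 (assumed)

figure;
plot(X(:, 2), y, 'rx', 'MarkerSize', 10);  % training data as red crosses
hold on;                                   % keep the data points on the figure
plot(X(:, 2), X * theta, '-');             % fitted line: predictions X*theta
legend('Training data', 'Linear regression');
xlabel('Population of City in 10,000s');
ylabel('Profit in $10,000s');
hold off;

% Predicted profit for a city of 35,000 people (3.5 in units of 10,000)
predict = [1, 3.5] * theta;
fprintf('Predicted profit: %.2f (in $10,000s)\n', predict);
```

The line is drawn from X(:,2), not from X itself, because the first column of X is the constant 1 and only the second column holds the population.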

Note: the author of this post is linger. If you repost it, please credit http://blog.csdn.net/lingerlanlan.

Link to this post: http://blog.csdn.net/lingerlanlan/article/details/32162559

Time: 2025-01-10 23:06:19
