Linear Regression with Multiple Variables: A Worked Example Using Gradient Descent (gradientDescentMulti) and the Normal Equation

% Column 1: size of the house (ft^2); column 2: number of bedrooms; column 3: price of the house
2104,3,399900
1600,3,329900
2400,3,369000
1416,2,232000
3000,4,539900
1985,4,299900
1534,3,314900
1427,3,198999
1380,3,212000
1494,3,242500
1940,4,239999
2000,3,347000
1890,3,329999
4478,5,699900
1268,3,259900
2300,4,449900
1320,2,299900
1236,3,199900
2609,4,499998
3031,4,599000
1767,3,252900
1888,2,255000
1604,3,242900
1962,4,259900
3890,3,573900
1100,3,249900
1458,3,464500
2526,3,469000
2200,3,475000
2637,3,299900
1839,2,349900
1000,1,169900
2040,4,314900
3137,3,579900
1811,4,285900
1437,3,249900
1239,3,229900
2132,4,345000
4215,4,549000
2162,4,287000
1664,2,368500
2238,3,329900
2567,4,314000
1200,3,299000
852,2,179900
1852,4,299900
1203,3,239500

%  Exercise 1: Linear regression with multiple variables

%% Initialization

%% ================ Part 1: Feature Normalization ================

%% Clear and Close Figures
clear ; close all; clc

fprintf('Loading data ...\n');

%% Load Data
data = load('ex1data2.txt');
X = data(:, 1:2);
y = data(:, 3);
m = length(y);

% Print out some data points
fprintf('First 10 examples from the dataset: \n');
fprintf(' x = [%.0f %.0f], y = %.0f \n', [X(1:10,:) y(1:10,:)]');

fprintf('Program paused. Press enter to continue.\n');
pause;

% Scale features and set them to zero mean
fprintf('Normalizing Features ...\n');

[X, mu, sigma] = featureNormalize(X);

% Implementation of featureNormalize(X)
function [X_norm, mu, sigma] = featureNormalize(X)
X_norm = X;                      % X is the matrix to normalize
mu = zeros(1, size(X, 2));       % 1 x n row of zeros (n = number of features, here 2)
sigma = zeros(1, size(X, 2));    % same shape as mu

% Instructions: First, for each feature dimension, compute the mean
%               of the feature and subtract it from the dataset,
%               storing the mean value in mu. Next, compute the
%               standard deviation of each feature and divide
%               each feature by its standard deviation, storing
%               the standard deviation in sigma.
%
%               Note that X is a matrix where each column is a
%               feature and each row is an example. You need
%               to perform the normalization separately for
%               each feature.
%
% Hint: You might find the 'mean' and 'std' functions useful.

% std computes the standard deviation: std(X,0,1) works down each column,
% std(X,0,2) works along each row.

mu = mean(X, 1);                 % mean of each column, i.e. of one feature over all examples
sigma = std(X);                  % defaults to std(X,0,1): standard deviation of each column
% fprintf('Debug....\n'); disp(sigma);
num_features = size(X, 2);       % number of columns (features)
for i = 1 : num_features
    % Normalize each column: (column - column mean) / (column standard deviation)
    X_norm(:,i) = (X(:,i) - mu(i)) / sigma(i);
end
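
The per-column loop above can also be written without a loop. The following is a minimal sketch, not part of the original exercise code: it uses the first four rows of the data set as a small test matrix and bsxfun so that it runs on older Octave/MATLAB versions without implicit broadcasting. The variable x_new is a hypothetical name for a new example to be scaled with the same statistics.

```octave
% Vectorized z-score normalization on a small test matrix.
X = [2104 3; 1600 3; 2400 3; 1416 2];   % first four rows of the data set (size, bedrooms)

mu = mean(X, 1);                         % 1 x n row vector of column means
sigma = std(X, 0, 1);                    % 1 x n row vector of column standard deviations

% Subtract the column mean and divide by the column standard deviation.
X_norm = bsxfun(@rdivide, bsxfun(@minus, X, mu), sigma);

% A new example must be scaled with the SAME mu and sigma.
x_new = [1650 3];                        % hypothetical query: 1650 sq-ft, 3 bedrooms
x_new_norm = (x_new - mu) ./ sigma;

disp(mean(X_norm, 1));                   % approximately zero for every column
disp(std(X_norm, 0, 1));                 % approximately one for every column
```

After normalization each column should have mean close to 0 and standard deviation close to 1, which is a quick way to check an implementation of featureNormalize.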

% Add intercept term to X
X = [ones(m, 1) X];


%% ================ Part 2: Gradient Descent ================

% ====================== YOUR CODE HERE ======================
% Instructions: We have provided you with the following starter
%               code that runs gradient descent with a particular
%               learning rate (alpha).
%
%               Your task is to first make sure that your functions -
%               computeCost and gradientDescent already work with
%               this starter code and support multiple variables.
%
%               After that, try running gradient descent with
%               different values of alpha and see which one gives
%               you the best result.
%
%               Finally, you should complete the code at the end
%               to predict the price of a 1650 sq-ft, 3 br house.
%
% Hint: By using the 'hold on' command, you can plot multiple
%       graphs on the same figure.
%
% Hint: At prediction, make sure you do the same feature normalization.
%

fprintf('Running gradient descent ...\n');

% Choose some alpha value
alpha = 0.03;                         % learning rate; also try 0.01, 0.1, 0.3, ...
num_iters = 400;                      % number of iterations

% Init Theta and Run Gradient Descent
theta = zeros(3, 1);                  % 3 x 1 vector of zeros
[theta, J_history] = gradientDescentMulti(X, y, theta, alpha, num_iters);

% Implementation of gradientDescentMulti()
function [theta, J_history] = gradientDescentMulti(X, y, theta, alpha, num_iters)
%   theta = GRADIENTDESCENTMULTI(X, y, theta, alpha, num_iters) updates theta by
%   taking num_iters gradient steps with learning rate alpha

% Initialize some useful values
m = length(y);                    % number of training examples
feature_number = size(X,2);       % number of columns of X (features plus intercept)

J_history = zeros(num_iters, 1);
temp = zeros(feature_number, 1);

for iter = 1 : num_iters
    predictions = X * theta;
    err = predictions - y;        % prediction error (not squared)
    for i = 1 : feature_number    % compute every new theta(i) from the old theta
        temp(i) = theta(i) - (alpha / m) * sum(err .* X(:,i));
    end

    theta = temp;                 % simultaneous update of all parameters

    % ====================== YOUR CODE HERE ======================
    % Instructions: Perform a single gradient step on the parameter vector
    %               theta.
    %
    % Hint: While debugging, it can be useful to print out the values
    %       of the cost function (computeCostMulti) and gradient here.
    %

    % ============================================================

    % Save the cost J in every iteration
    J_history(iter) = computeCostMulti(X, y, theta);
    % disp(J_history(iter));

end

end
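
gradientDescentMulti calls computeCostMulti, which is not listed in this post. The sketch below is an assumption about what that function computes, namely the usual squared-error cost J(theta) = 1/(2m) * sum((X*theta - y).^2). The name computeCostSketch and the tiny data set are made up for illustration; in the exercise the same expression would go into a function file named computeCostMulti.m.

```octave
% Assumed cost: J(theta) = 1/(2m) * sum((X*theta - y).^2), written as an
% anonymous function so the snippet runs on its own.
computeCostSketch = @(X, y, theta) ((X * theta - y)' * (X * theta - y)) / (2 * length(y));

% Made-up data that lies exactly on the line y = 1 + 2*x.
X = [1 0; 1 1; 1 2];          % first column is the intercept term
y = [1; 3; 5];

J_exact = computeCostSketch(X, y, [1; 2]);   % perfect fit, so the cost is zero
J_zero  = computeCostSketch(X, y, [0; 0]);   % (1 + 9 + 25) / (2 * 3)

fprintf('J at the exact fit: %f\n', J_exact);
fprintf('J at theta = 0:     %f\n', J_zero);
```

A cost that decreases from one iteration to the next in J_history is the main sign that the chosen learning rate is small enough.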

% Plot the convergence graph
figure;
plot(1:numel(J_history), J_history, '-b', 'LineWidth', 2); % '-b' draws a solid blue line; line width 2
xlabel('Number of iterations');
ylabel('Cost J');

% Display gradient descent's result
fprintf('Theta computed from gradient descent: \n');
fprintf(' %f \n', theta);
fprintf('\n');

Tip: To compare how different learning
rates affect convergence, it's helpful to plot J for several learning rates
on the same figure. In Octave/MATLAB, this can be done by performing
gradient descent multiple times with a 'hold on' command between
plots. Concretely, if you've tried three different values of alpha (you should
probably try more values than this) and stored the costs in J1, J2 and
J3, you can use the following commands to plot them on the same figure:
plot(1:50, J1(1:50), 'b');
hold on;
plot(1:50, J2(1:50), 'r');
plot(1:50, J3(1:50), 'k');
The final arguments 'b', 'r', and 'k' specify different colors for the
plots.

% For example, this snippet can be added to compare different learning rates
figure;
plot(1:100, J_history(1:100), '-b', 'LineWidth', 2);
xlabel('Number of iterations');
ylabel('Cost J');

% Compare learning rate
hold on;
alpha = 0.01;                  % J_history above already used 0.03, so use a different value here
theta = zeros(3, 1);
[theta, J_history1] = gradientDescentMulti(X, y, theta, alpha, num_iters);
plot(1:100, J_history1(1:100), 'r', 'LineWidth', 2);

alpha = 0.1;
theta = zeros(3, 1);
[theta, J_history2] = gradientDescentMulti(X, y, theta, alpha, num_iters);
plot(1:100, J_history2(1:100), 'g', 'LineWidth', 2);

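% Predict a new value with the parameters learned by gradient descent.
% The new example must be normalized with the same mu and sigma as the
% training data before the intercept term is prepended.
price = [1, ([1650 3] - mu) ./ sigma] * theta;   % matrix product gives the multi-feature prediction

% ============================================================

fprintf(['Predicted price of a 1650 sq-ft, 3 br house ' ...
         '(using gradient descent):\n $%f\n'], price);

fprintf('Program paused. Press enter to continue.\n');
pause;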

%% ================ Part 3: Normal Equations ================
% Predict a new value with the normal equation
fprintf('Solving with normal equations...\n');

%% Load Data
data = csvread('ex1data2.txt');
X = data(:, 1:2);
y = data(:, 3);
m = length(y);

% Add intercept term to X
X = [ones(m, 1) X];

% Calculate the parameters from the normal equation
theta = normalEqn(X, y);

% Implementation of normalEqn
function [theta] = normalEqn(X, y)

theta = zeros(size(X, 2), 1);

% Instructions: Complete the code to compute the closed form solution
%               to linear regression and put the result in theta.

theta = pinv(X' * X) * X' * y;

end
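
The line theta = pinv(X' * X) * X' * y is the closed-form least-squares solution theta = (X'X)^(-1) X'y. As a rough sanity check, the sketch below applies it to a made-up data set that lies exactly on a line and compares it with the backslash operator, which also solves the least-squares problem. The data and variable names are illustrative only.

```octave
% Made-up data on the line y = 1 + 2*x (intercept column already added).
X = [1 0; 1 1; 1 2; 1 3];
y = [1; 3; 5; 7];

theta_pinv = pinv(X' * X) * X' * y;   % normal equation, as in normalEqn
theta_ls   = X \ y;                   % least-squares solve via backslash

fprintf('normal equation: %f %f\n', theta_pinv);
fprintf('backslash:       %f %f\n', theta_ls);
```

Both results should be close to [1; 2]. No feature scaling and no learning rate are needed for this approach.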

% Display normal equation's result
fprintf('Theta computed from the normal equations: \n');
fprintf(' %f \n', theta);
fprintf('\n');


% Estimate the price of a 1650 sq-ft, 3 br house.
% No normalization here: theta was fitted on the raw features.
price = [1, 1650, 3] * theta;    % prediction from the normal equation


fprintf(['Predicted price of a 1650 sq-ft, 3 br house ' ...
         '(using normal equations):\n $%f\n'], price);
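
The two approaches should give nearly the same prediction, provided the gradient descent input is normalized with the training mu and sigma and the normal equation input is left raw. The sketch below illustrates this on a small synthetic data set (not ex1data2.txt) generated from a known linear rule. All names, the learning rate, and the iteration count are assumptions chosen for the illustration.

```octave
% Synthetic data: price = 50000 + 100 * size + 10000 * bedrooms (made up).
sizes = [1000; 1500; 2000; 2500; 3000; 1200];
beds  = [2; 3; 3; 4; 4; 2];
y = 50000 + 100 * sizes + 10000 * beds;
X_raw = [sizes beds];
m = length(y);

% Gradient descent on normalized features.
mu = mean(X_raw, 1);
sigma = std(X_raw, 0, 1);
X_norm = bsxfun(@rdivide, bsxfun(@minus, X_raw, mu), sigma);
Xg = [ones(m, 1) X_norm];
theta_gd = zeros(3, 1);
alpha = 0.1;
for iter = 1 : 5000
    % Vectorized simultaneous update of all parameters.
    theta_gd = theta_gd - (alpha / m) * (Xg' * (Xg * theta_gd - y));
end

% Normal equation on raw features.
Xn = [ones(m, 1) X_raw];
theta_ne = pinv(Xn' * Xn) * Xn' * y;

% Predict for a 1650 sq-ft, 3 bedroom house with each model.
x_new = [1650 3];
price_gd = [1, (x_new - mu) ./ sigma] * theta_gd;   % normalized input
price_ne = [1, x_new] * theta_ne;                   % raw input

fprintf('gradient descent: %f\n', price_gd);
fprintf('normal equation:  %f\n', price_ne);
```

Because the synthetic prices follow the rule exactly, both predictions should land close to 50000 + 100 * 1650 + 10000 * 3 = 245000. On the real data the two numbers will differ slightly if gradient descent has not fully converged.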

Time: 2024-10-21 18:01:26
