coursera:machine learing--code-6

Anomaly detection

In this exercise, you will implement an anomaly detection algorithm to detect anomalous behavior in server computers. The features measure the throughput (mb/s) and latency (ms) of response of each server. While your servers were operating, you collected m = 307 examples of how they were behaving, and thus have an unlabeled dataset {x(1),...,x(m)}. You suspect that the vast majority of these examples are “normal” (non-anomalous) examples of the servers operating normally, but there might also be some examples of servers acting anomalously within this dataset.

You will use a Gaussian model to detect anomalous examples in your dataset. You will ?rst start on a 2D dataset that will allow you to visualize what the algorithm is doing. On that dataset you will ?t a Gaussian distribution and then ?nd values that have very low probability and hence can be considered anomalies. After that, you will apply the anomaly detection algorithm to a larger dataset with many dimensions. You will be using ex8.m for this part of the exercise.

Gaussian distribution

To perform anomaly detection, you will ?rst need to ?t a model to the data’s distribution.

Estimating parameters for a Gaussian

Your task is to complete the code in estimateGaussian.m. This function takes as input the data matrix X and should output an n-dimension vector mu that holds the mean of all the n features and another n-dimension vector sigma2 that holds the variances of all the features. You can implement this using a for-loop over every feature and every training example (though a vectorized implementation might be more e?cient; feel free to use a vectorized implementation if you prefer). Note that in Octave/MATLAB, the var function will (by default) use  m?1 , instead of  m, when computing σ2.

From your plot, you can see that most of the examples are in the region with the highest probability, while the anomalous examples are in the regions with lower probabilities.

function [mu sigma2] = estimateGaussian(X)

[m, n] = size(X);

mu = zeros(n, 1);
sigma2 = zeros(n, 1);

mu = (mean(X))‘;
sigma2 = (var(X))‘*(m-1)/m;

end
%  Estimate my and sigma2
[mu sigma2] = estimateGaussian(X);

%  Returns the density of the multivariate normal at each data point (row)
%  of X
p = multivariateGaussian(X, mu, sigma2);

  

Now that you have estimated the Gaussian parameters, you can investigate which examples have a very high probability given this distribution and which examples have a very low probability. The low probability examples are more likely to be the anomalies in our dataset. One way to determine which examples are anomalies is to select a threshold based on a cross validation set. In this part of the exercise, you will implement an algorithm to select the threshold ε using the F1 score on a cross validation set.

In the provided code selectThreshold.m, there is already a loop that will try many di?erent values of ε and select the best ε based on the F1 score.

function [bestEpsilon bestF1] = selectThreshold(yval, pval)

bestEpsilon = 0;
bestF1 = 0;
F1 = 0;

stepsize = (max(pval) - min(pval)) / 1000;
for epsilon = min(pval):stepsize:max(pval)

cvPredictions = pval<epsilon;
tp = sum((cvPredictions == 1)&(yval == 1)); %true positives
fp = sum((cvPredictions == 1)&(yval == 0));
fn = sum((cvPredictions == 0)&(yval == 1));

prec = tp/(tp+fp+1e-10);
rec = tp/(tp+fn+1e-10);
F1 = 2*prec*rec/(prec+rec+1e-10);

    if F1 > bestF1
       bestF1 = F1;
       bestEpsilon = epsilon;
    end
end

end
%  Loads the second dataset. You should now have the
%  variables X, Xval, yval in your environment
load(‘ex8data2.mat‘);

%  Apply the same steps to the larger dataset
[mu sigma2] = estimateGaussian(X);

%  Training set
p = multivariateGaussian(X, mu, sigma2);

%  Cross-validation set
pval = multivariateGaussian(Xval, mu, sigma2);

%  Find the best threshold
[epsilon F1] = selectThreshold(yval, pval);

fprintf(‘Best epsilon found using cross-validation: %e\n‘, epsilon);
fprintf(‘Best F1 on Cross Validation Set:  %f\n‘, F1);
fprintf(‘# Outliers found: %d\n‘, sum(p < epsilon));

Recommender Systems

In this part of the exercise, you will implement the collaborative ?ltering learning algorithm and apply it to a dataset of movie ratings.2 This dataset consists of ratings on a scale of 1 to 5. The dataset has nu = 943 users, and nm = 1682 movies.

In the next parts of this exercise, you will implement the function cofiCostFunc.m that computes the collaborative ?tlering objective function and gradient. After implementing the cost function and gradient, you will use fmincg.m to learn the parameters for collaborative ?ltering.

Movie ratings dataset

The ?rst part of the script ex8 cofi.m will load the dataset ex8 movies.mat, providing the variables Y and R in your Octave/MATLAB environment.

The matrix Y (a num movies×num users matrix) stores the ratings y(i,j) (from 1 to 5). The matrix R is an binary-valued indicator matrix, where R(i,j) = 1 if user j gave a rating to movie i, and R(i,j) = 0 otherwise. The objective of collaborative ?ltering is to predict movie ratings for the movies that users have not yet rated, that is, the entries with R(i,j) = 0. This will allow us to recommend the movies with the highest predicted ratings to the user.

To help you understand the matrix Y, the script ex8 cofi.m will compute the average movie rating for the ?rst movie (Toy Story) and output the average rating to the screen.

Throughout this part of the exercise, you will also be working with the matrices, X and Theta:

Collaborative ?ltering learning algorith

Now, you will start implementing the collaborative ?ltering learning algorithm. You will start by implementing the cost function (without regularization).

Collaborative ?ltering cost function

Collaborative ?ltering gradient

(one for looping over movies to compute ?J/?x(i) k for each movie, and one for looping over users to compute ?J/?θ(j) k for each user).

Intuitively, when you consider the features for the i-th movie, you only need to be concern about the users who had given ratings to the movie, and this allows you to remove all the other users from Theta and Y.

Regularized cost function

The cost function for collaborative ?ltering with regularization is given by

function [J, grad] = cofiCostFunc(params, Y, R, num_users, num_movies, ...
                                  num_features, lambda)
%COFICOSTFUNC Collaborative filtering cost function
%   [J, grad] = COFICOSTFUNC(params, Y, R, num_users, num_movies, ...
%   num_features, lambda) returns the cost and gradient for the
%   collaborative filtering problem.
%

% Unfold the U and W matrices from params
X = reshape(params(1:num_movies*num_features), num_movies, num_features);
Theta = reshape(params(num_movies*num_features+1:end), ...
                num_users, num_features);

% You need to return the following values correctly
J = 0;
X_grad = zeros(size(X));
Theta_grad = zeros(size(Theta));

% ====================== YOUR CODE HERE ======================
% Instructions: Compute the cost function and gradient for collaborative
%               filtering. Concretely, you should first implement the cost
%               function (without regularization) and make sure it is
%               matches our costs. After that, you should implement the
%               gradient and use the checkCostFunction routine to check
%               that the gradient is correct. Finally, you should implement
%               regularization.
%
% Notes: X - num_movies  x num_features matrix of movie features
%        Theta - num_users  x num_features matrix of user features
%        Y - num_movies x num_users matrix of user ratings of movies
%        R - num_movies x num_users matrix, where R(i, j) = 1 if the
%            i-th movie was rated by the j-th user
%
% You should set the following variables correctly:
%
%        X_grad - num_movies x num_features matrix, containing the
%                 partial derivatives w.r.t. to each element of X
%        Theta_grad - num_users x num_features matrix, containing the
%                     partial derivatives w.r.t. to each element of Theta
%

H =X*Theta‘;
J = 0.5*sum(sum(R.*(H-Y).^2))+0.5*lambda*sum(sum(Theta.^2))+0.5*lambda*sum(sum(X.^2));

X_grad = R.*(H-Y)*Theta + lambda*X;
Theta_grad = (R.*(H-Y))‘*X + lambda*Theta;

% =============================================================

grad = [X_grad(:); Theta_grad(:)];

end
%% Machine Learning Online Class
%  Exercise 8 | Anomaly Detection and Collaborative Filtering
%
%  Instructions
%  ------------
%
%  This file contains code that helps you get started on the
%  exercise. You will need to complete the following functions:
%
%     estimateGaussian.m
%     selectThreshold.m
%     cofiCostFunc.m
%
%  For this exercise, you will not need to change any code in this file,
%  or any other files other than those mentioned above.
%

%% =============== Part 1: Loading movie ratings dataset ================
%  You will start by loading the movie ratings dataset to understand the
%  structure of the data.
%
fprintf(‘Loading movie ratings dataset.\n\n‘);

%  Load data
load (‘ex8_movies.mat‘);

%  Y is a 1682x943 matrix, containing ratings (1-5) of 1682 movies on
%  943 users
%
%  R is a 1682x943 matrix, where R(i,j) = 1 if and only if user j gave a
%  rating to movie i

%  From the matrix, we can compute statistics like average rating.
fprintf(‘Average rating for movie 1 (Toy Story): %f / 5\n\n‘, ...
        mean(Y(1, R(1, :))));

%  We can "visualize" the ratings matrix by plotting it with imagesc
imagesc(Y);
ylabel(‘Movies‘);
xlabel(‘Users‘);

fprintf(‘\nProgram paused. Press enter to continue.\n‘);
pause;

%% ============ Part 2: Collaborative Filtering Cost Function ===========
%  You will now implement the cost function for collaborative filtering.
%  To help you debug your cost function, we have included set of weights
%  that we trained on that. Specifically, you should complete the code in
%  cofiCostFunc.m to return J.

%  Load pre-trained weights (X, Theta, num_users, num_movies, num_features)
load (‘ex8_movieParams.mat‘);

%  Reduce the data set size so that this runs faster
num_users = 4; num_movies = 5; num_features = 3;
X = X(1:num_movies, 1:num_features);
Theta = Theta(1:num_users, 1:num_features);
Y = Y(1:num_movies, 1:num_users);
R = R(1:num_movies, 1:num_users);

%  Evaluate cost function
J = cofiCostFunc([X(:) ; Theta(:)], Y, R, num_users, num_movies, ...
               num_features, 0);

fprintf([‘Cost at loaded parameters: %f ‘...
         ‘\n(this value should be about 22.22)\n‘], J);

fprintf(‘\nProgram paused. Press enter to continue.\n‘);
pause;

%% ============== Part 3: Collaborative Filtering Gradient ==============
%  Once your cost function matches up with ours, you should now implement
%  the collaborative filtering gradient function. Specifically, you should
%  complete the code in cofiCostFunc.m to return the grad argument.
%
fprintf(‘\nChecking Gradients (without regularization) ... \n‘);

%  Check gradients by running checkNNGradients
checkCostFunction;

fprintf(‘\nProgram paused. Press enter to continue.\n‘);
pause;

%% ========= Part 4: Collaborative Filtering Cost Regularization ========
%  Now, you should implement regularization for the cost function for
%  collaborative filtering. You can implement it by adding the cost of
%  regularization to the original cost computation.
%  

%  Evaluate cost function
J = cofiCostFunc([X(:) ; Theta(:)], Y, R, num_users, num_movies, ...
               num_features, 1.5);

fprintf([‘Cost at loaded parameters (lambda = 1.5): %f ‘...
         ‘\n(this value should be about 31.34)\n‘], J);

fprintf(‘\nProgram paused. Press enter to continue.\n‘);
pause;

%% ======= Part 5: Collaborative Filtering Gradient Regularization ======
%  Once your cost matches up with ours, you should proceed to implement
%  regularization for the gradient.
%

%
fprintf(‘\nChecking Gradients (with regularization) ... \n‘);

%  Check gradients by running checkNNGradients
checkCostFunction(1.5);

fprintf(‘\nProgram paused. Press enter to continue.\n‘);
pause;

%% ============== Part 6: Entering ratings for a new user ===============
%  Before we will train the collaborative filtering model, we will first
%  add ratings that correspond to a new user that we just observed. This
%  part of the code will also allow you to put in your own ratings for the
%  movies in our dataset!
%
movieList = loadMovieList();

%  Initialize my ratings
my_ratings = zeros(1682, 1);

% Check the file movie_idx.txt for id of each movie in our dataset
% For example, Toy Story (1995) has ID 1, so to rate it "4", you can set
my_ratings(1) = 4;

% Or suppose did not enjoy Silence of the Lambs (1991), you can set
my_ratings(98) = 2;

% We have selected a few movies we liked / did not like and the ratings we
% gave are as follows:
my_ratings(7) = 3;
my_ratings(12)= 5;
my_ratings(54) = 4;
my_ratings(64)= 5;
my_ratings(66)= 3;
my_ratings(69) = 5;
my_ratings(183) = 4;
my_ratings(226) = 5;
my_ratings(355)= 5;

fprintf(‘\n\nNew user ratings:\n‘);
for i = 1:length(my_ratings)
    if my_ratings(i) > 0
        fprintf(‘Rated %d for %s\n‘, my_ratings(i), ...
                 movieList{i});
    end
end

fprintf(‘\nProgram paused. Press enter to continue.\n‘);
pause;

%% ================== Part 7: Learning Movie Ratings ====================
%  Now, you will train the collaborative filtering model on a movie rating
%  dataset of 1682 movies and 943 users
%

fprintf(‘\nTraining collaborative filtering...\n‘);

%  Load data
load(‘ex8_movies.mat‘);

%  Y is a 1682x943 matrix, containing ratings (1-5) of 1682 movies by
%  943 users
%
%  R is a 1682x943 matrix, where R(i,j) = 1 if and only if user j gave a
%  rating to movie i

%  Add our own ratings to the data matrix
Y = [my_ratings Y];
R = [(my_ratings ~= 0) R];

%  Normalize Ratings
[Ynorm, Ymean] = normalizeRatings(Y, R);

%  Useful Values
num_users = size(Y, 2);
num_movies = size(Y, 1);
num_features = 10;

% Set Initial Parameters (Theta, X)
X = randn(num_movies, num_features);
Theta = randn(num_users, num_features);

initial_parameters = [X(:); Theta(:)];

% Set options for fmincg
options = optimset(‘GradObj‘, ‘on‘, ‘MaxIter‘, 100);

% Set Regularization
lambda = 10;
theta = fmincg (@(t)(cofiCostFunc(t, Ynorm, R, num_users, num_movies, ...
                                num_features, lambda)), ...
                initial_parameters, options);

% Unfold the returned theta back into U and W
X = reshape(theta(1:num_movies*num_features), num_movies, num_features);
Theta = reshape(theta(num_movies*num_features+1:end), ...
                num_users, num_features);

fprintf(‘Recommender system learning completed.\n‘);

fprintf(‘\nProgram paused. Press enter to continue.\n‘);
pause;

%% ================== Part 8: Recommendation for you ====================
%  After training the model, you can now make recommendations by computing
%  the predictions matrix.
%

p = X * Theta‘;
my_predictions = p(:,1) + Ymean;

movieList = loadMovieList();

[r, ix] = sort(my_predictions, ‘descend‘);
fprintf(‘\nTop recommendations for you:\n‘);
for i=1:10
    j = ix(i);
    fprintf(‘Predicting rating %.1f for movie %s\n‘, my_predictions(j), ...
            movieList{j});
end

fprintf(‘\n\nOriginal ratings provided:\n‘);
for i = 1:length(my_ratings)
    if my_ratings(i) > 0
        fprintf(‘Rated %d for %s\n‘, my_ratings(i), ...
                 movieList{i});
    end
end
时间: 2024-10-14 15:15:03

coursera:machine learing--code-6的相关文章

Coursera Machine Learning 学习笔记(一)

之前就对Machine Learning很感兴趣,假期得闲看了Coursera Machine Learning 的全部课程,整理了笔记以便反复体会. I. Introduction (Week 1) - What is machine learning 对于机器学习的定义,并没有一个被一致认同的答案. Arthur Samuel (1959) 给出对机器学习的定义: 机器学习所研究的是赋予计算机在没有明确编程的情况下仍能学习的能力. Samuel设计了一个西洋棋游戏,他让程序自己跟自己下棋,并

Coursera machine learning 第二周 quiz 答案 Octave/Matlab Tutorial

https://www.coursera.org/learn/machine-learning/exam/dbM1J/octave-matlab-tutorial Octave Tutorial 5 试题 1. Suppose I first execute the following Octave commands: A = [1 2; 3 4; 5 6]; B = [1 2 3; 4 5 6]; Which of the following are then valid Octave com

azure Machine learing studio 使用示例之 - 使用线性回归算法完成预测评估

本文演示如何使用azure studio基于云测试数据创建第一个machine learning的experiment,算法选择是线性回归.首先要有一个azure studio账号,登录后进入dashboard. 创建一个BLANK的Experiment 添加测试数据 , 搜索Price, 选择Automibile price data(Raw) 把这个模块拖到右边的data item位置 搜索Project column模块,拖到右边,在Project columns中点击 'Launch c

Coursera machine learning 第二周 编程作业 Linear Regression

必做: [*] warmUpExercise.m - Simple example function in Octave/MATLAB[*] plotData.m - Function to display the dataset[*] computeCost.m - Function to compute the cost of linear regression[*] gradientDescent.m - Function to run gradient descent 1.warmUpE

Coursera - Machine Learning, Stanford: Week 1

Welcome and introduction Overview Reading Log 9/9 videos and quiz completed; 10/29 Review; Note 1.1 Welcome 1) What is machine learning? Machine learning is the science of getting compters to learn, without being explicitly programmed. 1.2 Introducti

【Coursera - machine learning】 Linear regression with one variable-quiz

Question 1 Consider the problem of predicting how well a student does in her second year of college/university, given how well they did in their first year. Specifically, let x be equal to the number of "A" grades (including A-. A and A+ grades)

coursera:machine learing--code-3

Informally, the C parameter is a positive value that controls the penalty for misclassi?ed training examples. A large C parameter tells the SVM to try to classify all the examples correctly. C plays a role similar to 1 , where is the regularization p

【投稿】Machine Learing With Spark Note 3:构建分类器

本文为数盟特约作者投稿,欢迎转载,请注明出处"数盟社区"和作者 博主简介:段石石,1号店精准化推荐算法工程师,主要负责1号店用户画像构建,喜欢钻研点Machine Learning的黑科技,对Deep Learning感兴趣,喜欢玩kaggle.看9神,对数据和Machine Learning有兴趣咱们可以一起聊聊,个人博客: hacker.duanshishi.com Spark构建分类器 在本章中,我们会了解基本的分类器以及在Spark如何使用,以及一套如何对model进行评价.调

Coursera Machine Learning 学习笔记(十三)

VI. Logistic Regression (Week 3) - Classification 在分类问题中,我们所尝试预测的是结果是否属于某一类(例如正确或错误).分类问题的例子有:判断一封电子邮件是否是垃圾邮件:判断一封电子邮件是否是垃圾邮件:判断一次金融交易是否是欺诈等等. 我们从二元的分类问题开始讨论. 我们将因变量(dependent variable)可能属于的两个类分别称为负向类(negative class)和正向类(positive class),则因变量 - Hypoth