PLA Perceptron Learning Algorithm # NTU (National Taiwan University) Machine Learning #

                 Perceptron Learning Algorithm


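On identifying spam email

Here a set of spam-related words (a keyword set) is given in advance, and each of the four different input samples is judged by how often those words appear in it. If the system outputs +1 the message is classified as spam; otherwise it is not. The answer here is the second group.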
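The keyword-frequency idea can be sketched as a tiny thresholded score. This is only an illustration: the keyword list, the equal weights, and the threshold below are assumptions made up for the example, not values from the course quiz.

```python
import numpy as np

# Hypothetical keyword set, weights, and threshold (illustrative only).
KEYWORDS = ["free", "drug", "discount", "click"]
WEIGHTS = np.array([1.0, 1.0, 1.0, 1.0])
THRESHOLD = 1.5

def keyword_counts(text):
    """Count how often each keyword occurs in the lower-cased text."""
    words = text.lower().split()
    return np.array([words.count(k) for k in KEYWORDS], dtype=float)

def is_spam(text):
    """Return +1 (spam) if the weighted keyword count exceeds the threshold, else -1."""
    score = WEIGHTS.dot(keyword_counts(text)) - THRESHOLD
    return 1 if score > 0 else -1

label_a = is_spam("click here for a free discount")
label_b = is_spam("see you at the meeting on friday")
```

In a learned perceptron the weights and threshold would come from training data rather than being fixed by hand as they are here.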

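Take two-dimensional data as an example. We want to choose a line that separates the red crosses from the blue circles (a linear separation). How do we do that? The difficulty is that there may be infinitely many lines that separate these sample points. Finding all of them is hard, and in practice it is probably unnecessary. So we can start from an arbitrary initial hypothesis and keep correcting it, since it is probably wrong at first, until it gets closer and closer to a correct, feasible solution.

We start by setting the weight vector w to 0 and then correct it. How do we correct it?

An important property of PLA: it updates only when there is a mistake!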

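IF

  the predicted output $\mathrm{sign}(w_t^T x_{n(t)})$ disagrees with the desired output $y_{n(t)}$,

set the new weights to $w_{t+1} = w_t + y_{n(t)} x_{n(t)}$.

That is, if the label $y_{n(t)}$ is positive, we want w to point in the same direction as x, so the update tries to reduce the angle between them. If the label $y_{n(t)}$ is negative, we want w to point away from x, so the update tries to increase the angle.

ELSE

   the predicted output matches the desired output, so no correction is needed.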
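The IF/ELSE rule above can be sketched in a few lines of NumPy. This is a minimal sketch, not the demo further down: the helper names `pla_step` and `pla`, the `max_updates` cap, and the four sample points are all assumptions introduced for illustration. Each sample carries a leading 1 so that the bias is part of w.

```python
import numpy as np

def pla_step(w, X, y):
    """Scan the samples in order and correct w on the first mistake.

    Returns (w, updated); updated is False when every sample is classified
    correctly, in which case w is returned unchanged.
    """
    for x_n, y_n in zip(X, y):
        if np.sign(w.dot(x_n)) != y_n:       # IF: prediction disagrees with label
            return w + y_n * x_n, True        # w_{t+1} = w_t + y_n * x_n
    return w, False                           # ELSE: nothing to correct

def pla(X, y, max_updates=1000):
    """Start from w = 0 and repeat the update until no mistake remains."""
    w = np.zeros(X.shape[1])
    for _ in range(max_updates):
        w, updated = pla_step(w, X, y)
        if not updated:
            break
    return w

# Hypothetical, linearly separable toy data: columns are (1, x1, x2).
X = np.array([[1, 2, 1], [1, 1, 3], [1, -1, -2], [1, -2, -1]], dtype=float)
y = np.array([1, 1, -1, -1])
w = pla(X, y)
predictions = np.sign(X.dot(w))
```

The cap on updates is only a safety net for data that is not linearly separable, where the loop would otherwise never stop.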

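On linear separability:

To be honest, I am not entirely clear on the mathematical derivation of why the angle keeps getting smaller at this step.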
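For reference, the usual argument can be sketched as follows. It assumes the data are linearly separable, so that some perfect weight vector exists; the symbols $w_f$ (that perfect vector), $\rho$, $R$, and $\theta_T$ are notation introduced here, not taken from the notes above. The update is the PLA rule $w_{t+1} = w_t + y_{n(t)} x_{n(t)}$ applied to a misclassified point, starting from $w_0 = 0$.

```latex
\begin{align*}
% Assumption: w_f separates the data, i.e. y_n w_f^T x_n > 0 for every n.
% (1) The inner product with w_f grows at every update:
w_f^T w_{t+1} &= w_f^T w_t + y_{n(t)}\, w_f^T x_{n(t)}
               \;\ge\; w_f^T w_t + \min_n y_n\, w_f^T x_n \;>\; w_f^T w_t \\
% (2) The length grows slowly, because a mistake means y_{n(t)} w_t^T x_{n(t)} \le 0:
\|w_{t+1}\|^2 &= \|w_t\|^2 + 2\, y_{n(t)}\, w_t^T x_{n(t)} + \|x_{n(t)}\|^2
               \;\le\; \|w_t\|^2 + \max_n \|x_n\|^2 \\
% (3) Combine both after T updates, with
%     \rho = \min_n y_n w_f^T x_n / \|w_f\| and R = \max_n \|x_n\|:
\cos\theta_T &= \frac{w_f^T w_T}{\|w_f\|\,\|w_T\|}
               \;\ge\; \frac{T\,\rho\,\|w_f\|}{\|w_f\|\,\sqrt{T}\,R}
               \;=\; \sqrt{T}\,\frac{\rho}{R}
\end{align*}
```

So the lower bound on the cosine of the angle between $w_T$ and $w_f$ grows like $\sqrt{T}$. Since a cosine cannot exceed 1, the number of updates satisfies $T \le R^2/\rho^2$, which is why PLA stops on separable data. The angle itself need not shrink at every single step; it is the bound that improves.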

Funtime

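Below is a demo implementation of PLA:

This demo comes from the link below. The original source did not run in my environment, so I made a few small changes.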

https://datasciencelab.wordpress.com/2014/01/10/machine-learning-classics-the-perceptron/

"""
Programmer  :   EOF
file        :   pla.py
date        :   2015.02.22

Code description:
    This program is coded for Perceptron Learning Algorithm.

"""
import numpy as np
import matplotlib.pyplot as plt
import random
import os, subprocess

class Perceptron:
    def __init__(self, N):
        # random linearly separated data
        xA, yA, xB, yB = [random.uniform(-1, 1) for i in range(4)]
        self.V = np.array([xB*yA - xA*yB, yB - yA, xA - xB])
        self.X = self.generate_points(N)

    def generate_points(self, N):
        X = []
        for i in range(N):
            x1, x2 = [random.uniform(-1, 1) for i in range(2)]
            x = np.array([1, x1, x2])
            s = int(np.sign(self.V.T.dot(x)))
            X.append((x, s))

        return X

    def plot(self, mispts = None, vec = None, save = False):
        fig = plt.figure(figsize=(5,5))
        plt.xlim(-1, 1)
        plt.ylim(-1, 1)
        V = self.V
        # True boundary V[0] + V[1]*x1 + V[2]*x2 = 0, drawn as x2 = a*x1 + b
        a, b = -V[1]/V[2], -V[0]/V[2]
        l = np.linspace(-1, 1)
        plt.plot(l, a*l + b, 'k-')
        cols = {1: 'r', -1: 'b'}

        for x,s in self.X:
            plt.plot(x[1], x[2], cols[s] + 'o')

        if mispts:
            for x, s in mispts:
                plt.plot(x[1], x[2], cols[s] + '.')

        # "vec != None" compares element-wise on a NumPy array, so test identity
        if vec is not None:
            aa, bb = -vec[1]/vec[2], -vec[0]/vec[2]
            plt.plot(l, aa*l + bb, 'g-', lw = 2)

        if save:
            if not mispts:
                plt.title('N = %s' % (str(len(self.X))))
            else:
                plt.title('N = %s with %s test points'
                          % (str(len(self.X)), str(len(mispts))))

            plt.savefig('p_N%s' % (str(len(self.X))),
                        dpi = 200, bbox_inches = 'tight')

        plt.show()

    def classification_error(self, vec, pts = None):
        # Error defined as fraction of misclassified points
        if not pts:
            pts = self.X

        M = len(pts)
        n_mispts = 0
        for x, s in pts:
            if int(np.sign(vec.T.dot(x))) != s :
                n_mispts += 1

        error = n_mispts / float(M)
        return error

    def choose_miscl_point(self, vec):
        # Choose a random point among the misclassified
        pts = self.X
        mispts = []
        for x, s in pts:
            if int(np.sign(vec.T.dot(x))) !=s :
                mispts.append((x, s))

        return mispts[random.randrange(0, len(mispts))]

    def pla(self, save = False):
        # Initialize the weights to zeros
        w = np.zeros(3)
        X, N = self.X, len(self.X)
        it = 0
        # Iterate until all points are correctly classified
        while self.classification_error(w) != 0:
            it += 1
            # pick random misclassified point
            x, s = self.choose_miscl_point(w)
            # update weights
            w += s*x
            if save:
                self.plot(vec = w)
                plt.title('N = %s, Iteration %s\n'
                          % (str(N), str(it)))
                plt.savefig('p_N%s_it%s' % (str(N), str(it)),
                            dpi = 200, bbox_inches = 'tight')

        self.w = w

    def check_error(self, M, vec):
        check_pts = self.generate_points(M)
        return self.classification_error(vec, pts = check_pts)

#--------for testing-------------------------
p = Perceptron(20)
# To train and draw the learned boundary (p.w is only set by pla()):
# p.pla(); p.plot(vec = p.w)
p.plot()
