knn in scala

  • nearest neighbor algorithm -- greedy

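1. Start at a point A (a different starting point can give a different answer).

2. Move to the point D with the lowest cost from the current point; repeat.

3. Finally return to A and add up the costs.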
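The three steps above can be sketched in plain Scala as follows. This is only a rough illustration: the object `NearestNeighborTour`, the method `tour`, and the cost matrix (absolute differences between four points at positions 0, 2, 3 and 7 on a line) are made up for this example and do not come from the notes. The sketch also assumes that step 2 only considers points that have not been visited yet.

```scala
// Greedy nearest-neighbor tour (hypothetical helper, standard library only).
object NearestNeighborTour {

  /** Build a tour that starts and ends at `start`; returns (visit order, total cost). */
  def tour(cost: Array[Array[Double]], start: Int): (List[Int], Double) = {
    val n = cost.length
    val visited = Array.fill(n)(false)
    visited(start) = true                       // step 1: begin at the chosen point
    var current = start
    var order = List(start)                     // visit order, built in reverse
    var total = 0.0
    for (_ <- 1 until n) {
      // step 2: among unvisited points, take the one cheapest to reach from `current`
      val next = (0 until n).filterNot(i => visited(i)).minBy(i => cost(current)(i))
      total += cost(current)(next)
      visited(next) = true
      order = next :: order
      current = next
    }
    total += cost(current)(start)               // step 3: return to the start and add up
    ((start :: order).reverse, total)
  }

  def main(args: Array[String]): Unit = {
    // made-up symmetric costs: |p_i - p_j| for positions 0, 2, 3, 7
    val cost = Array(
      Array(0.0, 2.0, 3.0, 7.0),
      Array(2.0, 0.0, 1.0, 5.0),
      Array(3.0, 1.0, 0.0, 4.0),
      Array(7.0, 5.0, 4.0, 0.0))
    println(tour(cost, 0))                      // tour starting at point 0
    println(tour(cost, 1))                      // tour starting at point 1
  }
}
```

With this matrix, starting at point 0 gives a total of 14.0 while starting at point 1 gives 16.0, which is the "different start, different answer" effect noted in step 1.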

  • knn in scala --intuition
    /** @author  wyq
     *  @version 1.0
     *  @date    Sun Sep 22 18:45:44 EDT 2013
     */
    package scalation.analytics
    
    import util.control.Breaks.{breakable, break}
    import collection.mutable.Set
    
    import scalation.linalgebra.{MatrixD, VectorD}
    import scalation.linalgebra_gen.VectorN
    import scalation.linalgebra_gen.Vectors.VectorI
    import scalation.math.DoubleWithExp._
    import scalation.util.Error
    
    /** Classify a new vector `z` into one of `k` classes by a majority vote
     *  among its `knn` nearest neighbors.
     *  @param x    the vectors/points of classified data stored as rows of a matrix (could also be held as a List[Array[Double]])
     *  @param y    the classification of each vector in x
     *  @param fn   the names for all features/variables
     *  @param k    the number of classes
     *  @param cn   the names for all classes
     *  @param knn  the number of nearest neighbors to consider
     */
    class KNN_Classifier (x: MatrixD, y: VectorI, fn: Array [String], k: Int, cn: Array [String],
                          knn: Int = 3)
          extends ClassifierReal (x, y, fn, k, cn)
    {
        private val DEBUG      = true                                        // debug flag
        private val MAX_DOUBLE = Double.PositiveInfinity                     // infinity
        private val topK       = Array.ofDim [Tuple2 [Int, Double]] (knn)    // top-knn nearest points (in reverse order)
        private val count      = new VectorI (k)                             // how many nearest neighbors fall in each class
    
        //:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
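        /** Compute a distance metric between vectors/points u and v.  The squared
         *  Euclidean distance is used, which ranks neighbors the same way as the
         *  Euclidean distance itself.
         *  @param u  the first vector/point
         *  @param v  the second vector/point
         */
        // always prepare the distance function first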
        def distance (u: VectorD, v: VectorD): Double =
        {
            (u - v).normSq
        } 
    
        //:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
        /** Find the knn nearest neighbors (top-knn) to vector `z`.
         *  @param z  the vector to be classified
         */
        def kNearest (z: VectorD): Unit =
        {
            var dk = MAX_DOUBLE
            for (i <- 0 until x.dim1) {
                val di = distance (z, x(i))                   // compute distance to z
                if (di < dk) dk = replace (i, di)             // if closer, adjust top-knn
            }
            if (DEBUG) println ("topK = " + topK.deep)
        } 
    
        //:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
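        /** Remove the most distant neighbor and add new neighbor `i`.  Maintain the
         *  `topK` nearest neighbors in sorted order farthest to nearest.
         *  @param i   the index (row of x) of the new neighbor
         *  @param di  the distance from the new neighbor to z
         */
        def replace (i: Int, di: Double): Double =
        {
            var j = 0
            // shift entries toward slot 0 (dropping the old farthest) while the next entry is still farther than di
            while (j < knn-1 && di < topK(j+1)._2) { topK(j) = topK(j+1); j += 1 }
            topK(j) = (i, di)               // insert the new neighbor at its sorted position
            topK(0)._2                      // the distance of the new farthest neighbor
        } // replace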
    
        //:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
        /** Training involves resetting the data structures before each classification.
         *  KNN uses lazy training, so most of it is done during classification.
         */
        def train (): Unit =
        {
            for (i <- 0 until knn) topK(i)  = (-1, MAX_DOUBLE)   // initialize top-knn
            for (j <- 0 until k) count(j) = 0                    // initialize counters
        } // train
    
        //:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
        /** Given a new point/vector `z`, determine which class it belongs to (i.e.,
         *  the class getting the most votes from its `knn` nearest neighbors).
         *  @param z  the vector to classify
         */
        def classify (z: VectorD): Tuple2 [Int, String] =
        {
            kNearest (z)                                         // set top-knn to knn nearest
            for (i <- 0 until knn) count(y(topK(i)._1)) += 1     // tally per class
            if (DEBUG) println ("count = " + count)
            val best = count.argmax ()                           // class with maximal count
            (best, cn(best))                                     // return the best class and its name
        } // classify
    
    } // KNN_Classifier class
    
    //:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
    /** The `KNN_ClassifierTest` object is used to test the `KNN_Classifier` class.
     */
    object KNN_ClassifierTest extends App
    {
        val x = new MatrixD ((6, 2), 1.0, 2.0,      // data/feature matrix
                                     2.0, 1.0,
                                     5.0, 4.0,
                                     4.0, 5.0,
                                     9.0, 8.0,
                                     8.0, 9.0)
        val y  = VectorN (0, 0, 0, 1, 1, 1)         // classification for each vector in x
        val fn = Array ("x1", "x2")                 // feature/variable names
        val cn = Array ("No", "Yes")                // class names
    
        println ("----------------------------------------------------")
        println ("x = " + x)
        println ("y = " + y)
        val cl = new KNN_Classifier (x, y, fn, 2, cn)
    
        cl.train ()
        val z1 = VectorD (10.0, 10.0)
        println ("----------------------------------------------------")
        println ("z1 = " + z1)
        println ("class = " + cl.classify (z1))
    
        cl.train ()
        val z2 = VectorD ( 3.0,  3.0)
        println ("----------------------------------------------------")
        println ("z2 = " + z2)
        println ("class = " + cl.classify (z2))
    
    } // KNN_ClassifierTest object
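For intuition without the ScalaTion dependencies, the same idea can be sketched with only the Scala standard library. The object `SimpleKNN` and its methods are hypothetical names chosen for this example, not part of ScalaTion, and the sketch simply sorts all points by distance instead of maintaining a top-knn array as the listing above does. It reuses the six data points and the two query points from `KNN_ClassifierTest`.

```scala
// Minimal k-nearest-neighbor classifier (hypothetical sketch, standard library only).
object SimpleKNN {

  /** Squared Euclidean distance; the square root is skipped because it does not change the ranking. */
  def distSq(u: Array[Double], v: Array[Double]): Double =
    u.zip(v).map { case (a, b) => (a - b) * (a - b) }.sum

  /** Classify `z` by a majority vote among its `knn` nearest rows of `x`. */
  def classify(x: Array[Array[Double]], y: Array[Int], z: Array[Double], knn: Int = 3): Int = {
    // indices of the knn closest points (the sort is stable, so ties keep row order)
    val nearest = x.indices.sortBy(i => distSq(z, x(i))).take(knn)
    // tally votes per class, then pick the class with the most votes
    val votes = nearest.groupBy(i => y(i)).map { case (c, is) => (c, is.size) }
    votes.maxBy(_._2)._1
  }

  def main(args: Array[String]): Unit = {
    val x = Array(Array(1.0, 2.0), Array(2.0, 1.0), Array(5.0, 4.0),
                  Array(4.0, 5.0), Array(9.0, 8.0), Array(8.0, 9.0))
    val y = Array(0, 0, 0, 1, 1, 1)
    println(classify(x, y, Array(10.0, 10.0)))   // query point z1 from the test object
    println(classify(x, y, Array(3.0, 3.0)))     // query point z2 from the test object
  }
}
```

For z1 = (10, 10) the three closest rows are (9, 8), (8, 9) and (5, 4), so class 1 wins the vote; for z2 = (3, 3) the first three rows tie at squared distance 5 and all belong to class 0. Sorting every point costs more than keeping a small sorted array, but it keeps the sketch short.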
    
    
    
Time: 2024-11-06 09:31:43
