But what exactly do we mean by "gets closer to"?

https://rdipietro.github.io/friendly-intro-to-cross-entropy-loss/

【将输入转化为输出:概率分布】

When we develop a model for probabilistic classification, we aim to map the model‘s inputs to probabilistic predictions, and we often train our model by incrementally adjusting the model‘s parameters so that our predictions get closer and closer to ground-truth probabilities.

In this post, we‘ll focus on models that assume that classes are mutually exclusive. For example, if we‘re interested in determining whether an image is best described as a landscape or as a house or as something else, then our model might accept an image as input and produce three numbers as output, each representing the probability of a single class.

During training, we might put in an image of a landscape, and we hope that our model produces predictions that are close to the ground-truth class probabilities y=(1.0,0.0,0.0)Ty=(1.0,0.0,0.0)T. If our model predicts a different distribution, say y^=(0.4,0.1,0.5)Ty^=(0.4,0.1,0.5)T, then we‘d like to nudge the parameters so that y^y^ gets closer to yy.

cross entropy 交叉熵 提供了一种量化的解决办法】

But what exactly do we mean by "gets closer to"? In particular, how should we measure the difference between y^y^ and yy?

This post describes one possible measure, cross entropy, and describes why it‘s reasonable for the task of classification.

https://rdipietro.github.io/friendly-intro-to-cross-entropy-loss/

zh.wikipedia.org/wiki/相对熵

KL散度是两个概率分布P和Q差别的非对称性的度量。 KL散度是用来 度量使用基于Q的编码来编码来自P的样本平均所需的额外的位元数。 典型情况下,P表示数据的真实分布,Q表示数据的理论分布,模型分布,或P的近似分布。

en.wikipedia.org/wiki/Kullback–Leibler_divergence

In the context of machine learningDKL(PQ) is often called the information gain achieved if P is used instead of Q. By analogy with information theory, it is also called the relative entropy of P with respect to Q. In the context of coding theoryDKL(PQ) can be constructed as measuring the expected number of extra bits required to codesamples from P using a code optimized for Q rather than the code optimized for P.

https://rdipietro.github.io/friendly-intro-to-cross-entropy-loss/

When we develop a probabilistic model over mutually exclusive classes, we need a way to measure the difference between predicted probabilities y^y^ and ground-truth probabilities yy, and during training we try to tune parameters so that this difference is minimized.

时间: 2024-10-03 17:00:45

But what exactly do we mean by "gets closer to"?的相关文章

ARM32 Linux kernel virtual address space

http://thinkiii.blogspot.jp/2014/02/arm32-linux-kernel-virtual-address-space.html The 32-bit ARM CPU can address up to 2^32 = 4GB address*. It's not big enough in present days, since the size of available DRAM on computing devices is growing fast and

Js中的bom

(1)window对象 弹块的方法 提示块: window.alert("提示信息"); Code: var res = alert("xxx"); alert(res); 确认块: window.confirm("确认信息"); 有返回值:如果点击确认返回true,如果点击取消返回false code: var res = confirm("您确认要删除吗?"); alert(res); 输入框:prompt("提

Apache Hadoop集群离线安装部署(一)——Hadoop(HDFS、YARN、MR)安装

虽然我已经装了个Cloudera的CDH集群(教程详见:http://www.cnblogs.com/pojishou/p/6267616.html),但实在太吃内存了,而且给定的组件版本是不可选的,如果只是为了研究研究技术,而且是单机,内存较小的情况下,还是建议安装Apache的原生的集群拿来玩,生产上自然是Cloudera的集群,除非有十分强大的运维. 我这次配了3台虚拟机节点.各给了4G,要是宿主机内存就8G的,可以搞3台2G,应该也是ok的. 〇.安装文件准备 Hadoop 2.7.3:

Learning docker--Chapter2--Handling Docker Containers

Content: (1)    Clarifying the Docker terms (2)    Working with the Docker images and containers (3)    The Docker Hub Registry (4)    Searching the Docker images (5)    Working with an interactive container (6)    Tracking the changes inside the con

Fulltext Index Study3:Query

在query 语句中,可以使用 contains predicate来调用Fulltext Index,实现比like速度更快的查询.使用contains能够进行term的extract匹配查询或term的前缀匹配查询,还能够进行基于词根的steming查询,基于自定义同义词文件的synonym查询,基于距离和顺序的相邻term查询.和like 相比,contains不能进行后缀匹配查询.如果Fulltext Index 能够满足业务需求,那么Fulltext Index是一个非常不错的选择,跟

poj 1113 凸包周长

Wall Time Limit: 1000MS   Memory Limit: 10000K Total Submissions: 33888   Accepted: 11544 Description Once upon a time there was a greedy King who ordered his chief Architect to build a wall around the King's castle. The King was so greedy, that he w

Hadoop之—— Linux搭建hadoop环境(简化篇)

转载请注明出处:http://blog.csdn.net/l1028386804/article/details/45771619 1.安装JDK(此处以安装JDK1.6为例,具体安装JDK的版本不限) (1)下载安装JDK:确保计算机联网之后命令行输入下面命令安装JDK      sudo apt-get install sun-java6-jdk (2)配置计算机Java环境:打开/etc/profile,在文件最后输入下面内容      export JAVA_HOME = (Java安装

HDOJ 1348 Wall 凸包

Wall Time Limit: 2000/1000 MS (Java/Others)    Memory Limit: 65536/32768 K (Java/Others) Total Submission(s): 4001    Accepted Submission(s): 1131 Problem Description Once upon a time there was a greedy King who ordered his chief Architect to build a

DICOM:开源DICOM服务框架DCM4CHE 构建

背景: 前一篇博文DICOM:开源DICOM服务框架DCM4CHE 安装中介绍了一款开源DICOM服务框架DCM4CHE,对于开源项目学习的流程是先下载二进制可执行包安装,然后使用测试.在熟悉了大致的功能服务后,从官网下载源代码进行本地构建(Build),进而从根本上了解开源项目的底层框架设计,为后续修复.扩展做准备.本博文是继DCM4CHE安装后的续篇,讲解如何在本地构建DCM4CHE开源项目,文中尽量做到全面,但是由于刚开始接触J2EE领域,且多半都是自学,因此博文中还留有部分未解问题,如有

hdu 1348 Wall(凸包模板题)

题目链接:http://acm.hdu.edu.cn/showproblem.php?pid=1348 Wall Time Limit: 2000/1000 MS (Java/Others)    Memory Limit: 65536/32768 K (Java/Others) Total Submission(s): 3386    Accepted Submission(s): 968 Problem Description Once upon a time there was a gre