In our setting, the vocabulary size is $V$, and the hidden layer size is $N$.
The input is a one-hot representation vector, which means for a given input context word, only one out of $V$ units, $\{x_1,\cdots,x_V\}$, will be 1, and all other units are 0.
The weights between the input layer and the hidden layer can be represented by a $V \times N$ matrix $W$. Each row of $W$ is the $N$-dimensional vector representation $v_w$ of the corresponding word of the input layer.
Given a context (a single word), and assuming $x_k=1$ and $x_{k'}=0$ for $k'\neq k$, we have
\[h=x^T W=W_{(k,\cdot)}:=v_{w_I},\]
which amounts to copying the $k$-th row of $W$ into $h$; here $v_{w_I}$ is the vector representation of the input word $w_I$. This implies that the link (activation) function of the hidden-layer units is simply linear (i.e., each unit passes its weighted sum of inputs directly to the next layer).
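To make the row-copy concrete, here is a minimal NumPy sketch (the sizes, the random weights, and the index `k` are made-up toy values; only the symbols $V$, $N$, $W$, $x$, and $h$ come from the text above):

```python
import numpy as np

V, N = 10, 4                     # toy vocabulary size and hidden layer size
rng = np.random.default_rng(0)
W = rng.standard_normal((V, N))  # input->hidden weights; row i is the N-dim vector of word i

k = 3                            # index of the input word w_I
x = np.zeros(V)
x[k] = 1.0                       # one-hot input vector

h = x @ W                        # h = x^T W
assert np.allclose(h, W[k])      # identical to simply copying the k-th row of W (= v_{w_I})
```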
From the hidden layer to the output layer, there is a different weight matrix $W'=\{w'_{ij}\}$, which is an $N \times V$ matrix. Using these weights, we can compute a score $u_j$ for each word in the vocabulary,
\[ u_j={v'_{w_j}}^T h \]
where $v'_{w_j}$ is the $j$-th column of the matrix $W'$. Then we can use the softmax classification model to obtain the posterior distribution over words, which is a multinomial distribution:
\[p(w_j|w_I)=y_j=\frac{\exp(u_j)}{\sum_{j'=1}^V \exp(u_{j'})}\]
where $y_j$ is the output of the $j$-th unit in the output layer.
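As a self-contained sketch of this step (again with toy sizes and random values standing in for $W'$ and $h$; nothing here is prescribed by the text beyond the two formulas above):

```python
import numpy as np

V, N = 10, 4                            # toy vocabulary size and hidden layer size
rng = np.random.default_rng(0)
W_prime = rng.standard_normal((N, V))   # hidden->output weights W'; column j is v'_{w_j}
h = rng.standard_normal(N)              # hidden layer vector from the previous step

u = W_prime.T @ h                       # scores: u_j = v'_{w_j}^T h
y = np.exp(u) / np.exp(u).sum()         # softmax: y_j = p(w_j | w_I)
assert np.isclose(y.sum(), 1.0)         # the outputs form a probability distribution
```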
Finally, we obtain:
\[p(w_j | w_I) = y_j = \frac{\exp\left({v'_{w_j}}^T v_{w_I}\right)}{\sum_{j'=1}^V \exp\left({v'_{w_{j'}}}^T v_{w_I}\right)}\]
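Putting the two steps together, the whole forward pass is a row lookup followed by a softmax over the scores ${v'_{w_j}}^T v_{w_I}$. The sketch below illustrates this end to end; `forward` is a hypothetical helper written for this example, and the max-subtraction is only a standard numerical-stability trick that leaves the softmax unchanged:

```python
import numpy as np

def forward(W, W_prime, k):
    """Return p(w_j | w_I) for all j, given the index k of the input word w_I."""
    h = W[k]                      # hidden layer: h = v_{w_I} (row lookup)
    u = W_prime.T @ h             # scores: u_j = v'_{w_j}^T v_{w_I}
    e = np.exp(u - u.max())       # shift by the max for numerical stability
    return e / e.sum()            # softmax posterior y

V, N = 10, 4                               # toy sizes
rng = np.random.default_rng(0)
W = rng.standard_normal((V, N))            # input->hidden weights
W_prime = rng.standard_normal((N, V))      # hidden->output weights

y = forward(W, W_prime, k=3)
print(y.argmax(), y.max())                 # most probable output word index and its probability
```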