Learning in Two-Player Matrix Games

3.2 Nash Equilibria in Two-Player Matrix Games

For a two-player matrix game, we can set up a matrix with each element containing a reward for each joint action pair. Then the reward function for player $i$ ($i = 1, 2$) becomes a matrix $R_i$, whose entry for the joint action $(a_1, a_2)$ is the reward $r_i(a_1, a_2)$.

A two-player matrix game is called a zero-sum game if the two players are fully competitive. In this way, we have $R_1 = -R_2$. A zero-sum game has a unique NE in the sense of the expected reward: although each player may have multiple NE strategies, the value of the expected reward under any of these NE strategies is the same. A general-sum matrix game refers to all types of matrix games. In a general-sum matrix game, the NE is no longer necessarily unique, and the game might have multiple NEs with different expected rewards.
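
As a concrete illustration (not from the original text), the zero-sum property can be stated element-wise: for every joint action, the two players' rewards sum to zero. A minimal sketch in Python, using rock-paper-scissors as an assumed example game:

```python
import numpy as np

# Rock-paper-scissors as a zero-sum matrix game: each entry holds the row
# player's reward for a joint action pair, and R2 = -R1 element-wise.
R1 = np.array([[ 0, -1,  1],    # rock     vs rock, paper, scissors
               [ 1,  0, -1],    # paper
               [-1,  1,  0]])   # scissors
R2 = -R1
print(np.array_equal(R1 + R2, np.zeros_like(R1)))  # True: fully competitive
```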

For a two-player matrix game, we define $PD(A_i)$ as the set of all probability distributions over player $i$'s action set $A_i$. Then the expected reward of player $i$ under the strategy pair $(\pi_1, \pi_2)$, with $\pi_i \in PD(A_i)$, becomes

$$
V_i(\pi_1, \pi_2) = \sum_{a_1 \in A_1} \sum_{a_2 \in A_2} \pi_1(a_1)\, \pi_2(a_2)\, r_i(a_1, a_2), \qquad i = 1, 2. \tag{1}
$$
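
To make equation (1) concrete, the sketch below evaluates the expected reward as the bilinear form $\pi_1^{\top} R_i \pi_2$. The matching-pennies payoffs and the helper name `expected_reward` are illustrative assumptions, not part of the original text:

```python
import numpy as np

# Matching pennies as an assumed example: rows are player 1's actions,
# columns are player 2's actions, and the game is zero-sum (R2 = -R1).
R1 = np.array([[ 1.0, -1.0],
               [-1.0,  1.0]])
R2 = -R1

def expected_reward(R_i, pi_1, pi_2):
    """Equation (1): sum over joint actions of pi_1(a1) * pi_2(a2) * r_i(a1, a2),
    which equals the bilinear form pi_1^T R_i pi_2."""
    return float(np.asarray(pi_1) @ R_i @ np.asarray(pi_2))

# Under the uniform mixed strategies, which form the NE of matching pennies,
# both players' expected rewards are zero.
pi_1 = np.array([0.5, 0.5])
pi_2 = np.array([0.5, 0.5])
print(expected_reward(R1, pi_1, pi_2))  # 0.0
print(expected_reward(R2, pi_1, pi_2))  # 0.0
```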

An NE for a two-player matrix game is the strategy pair $(\pi_1^*, \pi_2^*)$ for the two players such that, for $i = 1, 2$,

$$
V_i(\pi_i^*, \pi_{-i}^*) \ge V_i(\pi_i, \pi_{-i}^*) \qquad \forall \pi_i \in PD(A_i), \tag{2}
$$

where $-i$ denotes the player other than player $i$, and $PD(A_i)$ is the set of all probability distributions over player $i$'s action set $A_i$.
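
Because $V_i$ is linear in player $i$'s own strategy once the opponent's strategy is fixed, condition (2) can be verified by checking only pure-strategy deviations. A minimal sketch of such a check (the function name `is_nash` and the tolerance are assumptions for illustration):

```python
import numpy as np

def is_nash(R1, R2, pi_1, pi_2, tol=1e-9):
    """Check condition (2) for a candidate strategy pair (pi_1, pi_2).

    V_i is linear in player i's own strategy when the opponent's strategy is
    fixed, so it suffices to check that no pure-strategy deviation of either
    player raises that player's expected reward."""
    v1 = pi_1 @ R1 @ pi_2                    # V_1(pi_1, pi_2) from equation (1)
    v2 = pi_1 @ R2 @ pi_2                    # V_2(pi_1, pi_2)
    p1_ok = np.all(R1 @ pi_2 <= v1 + tol)    # player 1's pure deviations
    p2_ok = np.all(pi_1 @ R2 <= v2 + tol)    # player 2's pure deviations
    return bool(p1_ok and p2_ok)

# Matching pennies: the uniform strategy pair satisfies (2); a pure strategy
# for player 1 against the uniform opponent does not (player 2 can then
# profitably deviate).
R1 = np.array([[1.0, -1.0], [-1.0, 1.0]])
R2 = -R1
print(is_nash(R1, R2, np.array([0.5, 0.5]), np.array([0.5, 0.5])))  # True
print(is_nash(R1, R2, np.array([1.0, 0.0]), np.array([0.5, 0.5])))  # False
```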

Given that each player has two actions in the game, we can define a two-player two-action general-sum game as

$$
R_1 = \begin{bmatrix} r_{11} & r_{12} \\ r_{21} & r_{22} \end{bmatrix}, \qquad
R_2 = \begin{bmatrix} c_{11} & c_{12} \\ c_{21} & c_{22} \end{bmatrix}, \tag{3}
$$

where $r_{lf}$ and $c_{lf}$ denote the reward to the row player (player 1) and the reward to the column player (player 2), respectively. The row player chooses action $l \in \{1, 2\}$ and the column player chooses action $f \in \{1, 2\}$. The pure strategies $l^*$ and $f^*$ are called a strict NE in pure strategies if

$$
r_{l^* f^*} > r_{l f^*} \quad \text{and} \quad c_{l^* f^*} > c_{l^* f}, \tag{4}
$$

where $l$ and $f$ denote any row other than row $l^*$ and any column other than column $f^*$, respectively.
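
Condition (4) can be checked by brute force over all pure-strategy pairs. The sketch below does this for the standard prisoner's-dilemma payoffs, an assumed example rather than one taken from this section; it finds mutual defection as the unique strict NE in pure strategies:

```python
import numpy as np

def strict_pure_nash(R1, R2):
    """Return all (l*, f*) pairs satisfying the strict NE condition (4):
    r_{l*f*} > r_{lf*} for every other row l, and
    c_{l*f*} > c_{l*f} for every other column f."""
    n1, n2 = R1.shape
    equilibria = []
    for l_star in range(n1):
        for f_star in range(n2):
            row_best = all(R1[l_star, f_star] > R1[l, f_star]
                           for l in range(n1) if l != l_star)
            col_best = all(R2[l_star, f_star] > R2[l_star, f]
                           for f in range(n2) if f != f_star)
            if row_best and col_best:
                equilibria.append((l_star, f_star))
    return equilibria

# Prisoner's dilemma with the usual payoffs (actions: 0 = cooperate, 1 = defect).
R1 = np.array([[3, 0], [5, 1]])   # row player's rewards
R2 = np.array([[3, 5], [0, 1]])   # column player's rewards
print(strict_pure_nash(R1, R2))   # [(1, 1)]: mutual defection
```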
