Action Recognition with APJ3D and Random Forests

Human Action Recognition Using APJ3D and Random Forests



Method overview:

First, we extract the 3D skeletal joint locations from depth images. The APJ3D features are computed from the action depth image sequences by combining the 3D joint position features and the 3D joint angle features, and are then clustered with the K-means algorithm into key postures that represent the typical postures of actions. Using the improved Fourier Temporal Pyramid, we recognize actions with random forests.

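From the Kinect skeleton joint data, 3D joint position features and 3D joint angle features are extracted, and the two are combined into a new feature: APJ3D.

Fifteen joints are selected by hand (they tolerate small perturbations).

The APJ3D vectors extracted from the training data pass through K-means clustering, the Fourier Temporal Pyramid, and random forests to produce the final recognition result.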


Three major challenges in action recognition:

First is the description of human action. Human action in a video sequence is a dynamic process that is characterized not only by the body posture in each frame, but also by the order in which these postures emerge and how they evolve over time. Even for a single type of action, different individuals complete it differently because of differences in height, shape, agility and so on. Therefore, how to quickly extract simple but effective features remains a great difficulty in human action recognition.

Second is the representation model of human action. Human actions vary considerably, yet they also have strong compositional structure. How to combine these characteristics and design an action model with strong discriminative ability is an important issue in human action recognition.

Third is the design of an efficient action classification algorithm. Action recognition involves high-dimensional data and training data that is difficult to acquire, so we want a classification algorithm that trains and classifies quickly, performs well, and generalizes well.


Feature extraction:

First, 20 joints are selected: hip center, spine, shoulder center, head, L/R shoulder, L/R elbow, L/R wrist, L/R hand, L/R hip, L/R knee, L/R ankle and L/R foot.

Among these joints, the hand and wrist, and the foot and ankle, are very close to each other and thus superfluous for characterizing the constitution of the body parts.

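The 15 joints finally chosen are therefore: head, neck, L/R shoulder, L/R elbow, L/R hands, L/R knee, L/R feet, torso center and L/R hip.

Left and right limbs are determined from the direction in which the person faces the Kinect.

Joint angles

Each joint has a geometric position (in the global Cartesian coordinate system).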

The joints contiguous to the torso are usually called first-degree joints, while joints contiguous to first-degree joints are classified as second-degree joints. First-degree joints include the elbows, the knees and the head, while second-degree joints are the extremities: the hands and feet.

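Each joint has two degrees of freedom: a zenith angle θ and an azimuth angle μ (the distance between two connected joints stays constant).

Obtaining the angle information requires converting each joint's global coordinates into local coordinates. The paper does not explain this clearly. My understanding is that the orientation and scale of the coordinate system are computed from the torso basis (normalization), and the connected first-degree and second-degree joints are then computed from it.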
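A minimal sketch of how the two angles of a limb vector could be read off once a local orthonormal basis is available. The function name `spherical_angles`, the axis convention (zenith measured from the first basis vector, azimuth in the plane of the other two), and the example coordinates are all assumptions for illustration; the paper's exact convention is not given in these notes.

```python
import math

def dot(a, b):
    """Dot product of two 3D vectors given as tuples."""
    return sum(x * y for x, y in zip(a, b))

def spherical_angles(vec, basis):
    """Return (zenith, azimuth) in radians of a non-zero `vec` in a local basis.

    `basis` is a tuple (u, r, t) of three orthonormal 3D vectors.
    Assumed convention: zenith is measured from u, azimuth in the (r, t) plane.
    """
    u, r, t = basis
    # Project the global vector onto the local axes.
    x, y, z = dot(vec, u), dot(vec, r), dot(vec, t)
    length = math.sqrt(x * x + y * y + z * z)
    # Clamp to guard against rounding slightly outside [-1, 1].
    zenith = math.acos(max(-1.0, min(1.0, x / length)))
    azimuth = math.atan2(z, y)
    return zenith, azimuth

# Hypothetical example: a torso basis aligned with the world axes.
torso_basis = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))
shoulder = (0.0, 0.0, 0.0)
elbow = (0.0, 0.3, 0.0)
limb = tuple(e - s for e, s in zip(elbow, shoulder))
theta, mu = spherical_angles(limb, torso_basis)
```

Because only the direction of the limb vector matters, the two angles do not change when the limb length is rescaled, which fits the note above that the distance between connected joints stays constant.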

Joint positions

Our key suggestion is that the pairwise relative positions of the joints result in more discriminative features for representing human movement. Because the coordinates are normalized, the motion is invariant to the absolute body position, the initial body orientation and the body size.

For each joint i , we extract the pairwise relative position
features by taking the difference between the position
of joint i and that of each other joint j: 

The 3D joint feature for joint i is defined as: 
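The pairwise relative position step described above can be sketched as follows. The function name `pairwise_relative_positions` and the toy joint coordinates are assumptions for illustration; the only thing taken from the text is that, for each joint i, the feature collects the differences between its position and that of every other joint j.

```python
def pairwise_relative_positions(joints):
    """For each joint i, collect the differences p_i - p_j over all other joints j."""
    features = []
    for i, p_i in enumerate(joints):
        diffs = []
        for j, p_j in enumerate(joints):
            if i != j:
                diffs.append(tuple(a - b for a, b in zip(p_i, p_j)))
        features.append(diffs)
    return features

# Toy skeleton with three joints (hypothetical coordinates).
joints = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0)]
feats = pairwise_relative_positions(joints)
```

With N joints, each joint gets N - 1 difference vectors, so translating the whole skeleton leaves the features unchanged.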

APJ3D

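The same torso basis is used to compute the first-degree joints.

The second-degree joints are computed from the rotated orthonormal torso basis. For example, define V as the vector from the right shoulder to the right elbow and W as the vector from the right elbow to the right hand, and suppose we want the feature of the right hand. First rotate the torso basis so that the rotated basis is moved to the right elbow, then define a spherical coordinate system there.

Each joint corresponds to two coordinates in the spherical coordinate system. We also compute the angle η between the directional vector z from the RGB-D sensor and the inverted vector t from the torso basis, to detect torso inclinations. The final body joint angle information is expressed as: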

Afterward, we select the pairwise relative position features as

m: the relative position between the torso center and the hands

n: the relative position between the torso center and the feet

Thus, we use the resulting vector as the features for the action.

The final APJ3D feature:


Fourier Temporal Pyramid

We propose to use the improved Fourier Temporal Pyramid to represent the temporal dynamics of these frame-level features, and to solve the temporal interval problem.

Each action appears as a continuously changing sequence of APJ3D features. Through K-means clustering, each action is represented as a series of key postures.
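A minimal sketch of how frame-level features could be quantized into key postures with K-means. This is plain Lloyd's algorithm with a naive initialization (the first k frames), written with the standard library only; the function names and the 2D toy "frames" are assumptions, and real APJ3D vectors would be much higher-dimensional.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(point, centroids):
    """Index of the centroid closest to `point`."""
    return min(range(len(centroids)),
               key=lambda k: squared_distance(point, centroids[k]))

def kmeans(points, k, iterations=20):
    """Lloyd's algorithm; centroids start at the first k points (naive init)."""
    centroids = [tuple(p) for p in points[:k]]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        for idx, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous centroid
                dim = len(members[0])
                centroids[idx] = tuple(
                    sum(m[d] for m in members) / len(members) for d in range(dim)
                )
    return centroids

# Toy frame features alternating between two postures (hypothetical data).
frames = [(0.0, 0.1), (5.0, 5.1), (0.1, 0.0), (5.1, 5.0), (0.0, 0.0), (5.0, 5.0)]
centroids = kmeans(frames, k=2)
key_postures = [nearest(f, centroids) for f in frames]
```

Each frame is replaced by the index of its nearest centroid, so the action becomes a sequence of key-posture labels.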

In order to capture the temporal structure of the action, apart from the global Fourier coefficients, we recursively partition the action into a pyramid and use the short-time Fourier transform for all the segments. The final feature is the concatenation of the Fourier coefficients from all the segments.
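The recursive partition and concatenation described above can be sketched as follows. This is a simplified illustration, not the paper's implementation: it applies a plain DFT to each segment of a single feature dimension and keeps only the first few coefficient magnitudes. The function names and the parameters `levels` and `n_coeffs` are assumptions.

```python
import cmath

def dft_magnitudes(signal, n_coeffs):
    """Magnitudes of the first n_coeffs DFT coefficients of a real sequence."""
    n = len(signal)
    mags = []
    for k in range(n_coeffs):
        coeff = sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t, x in enumerate(signal))
        mags.append(abs(coeff))
    return mags

def fourier_temporal_pyramid(signal, levels=3, n_coeffs=2):
    """Concatenate low-frequency DFT magnitudes over a recursive two-way partition.

    Level 0 is the whole sequence, level 1 its two halves, level 2 its quarters.
    """
    features = []
    for level in range(levels):
        parts = 2 ** level
        size = len(signal) / parts
        for part in range(parts):
            segment = signal[int(part * size):int((part + 1) * size)]
            features.extend(dft_magnitudes(segment, n_coeffs))
    return features

# One feature dimension over 8 frames (hypothetical values).
signal = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0]
features = fourier_temporal_pyramid(signal, levels=3, n_coeffs=2)
```

With 3 levels there are 1 + 2 + 4 = 7 segments, so keeping 2 coefficients per segment gives a 14-dimensional descriptor for this one feature dimension; the full descriptor would repeat this for every dimension.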

The improvement is as follows:

For each key posture s, the overall feature vector combines p, its 3D pairwise position vector, and v, its 3D joint angle vector.

Note that each element g of this feature vector is a function of time. For each time segment at each pyramid level, we apply the Short-Time Fourier Transform to each element and acquire its Fourier coefficients, and we use both its high-frequency and low-frequency coefficients as features.

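Low-frequency features stay robust to noise, while high-frequency features can represent abrupt changes in the action.

After the Fourier transform, the representation is no longer sensitive to temporal shifts, because time series with temporal translation have the same Fourier coefficient magnitude, and the temporal structure of the actions can be characterized by the pyramid structure.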
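The claim about temporal translation can be checked numerically. The sketch below compares the DFT magnitudes of a short sequence and a circularly shifted copy. The equality is exact only for circular shifts; for an action that merely starts a few frames later inside a window it holds approximately. The signal values are made up for illustration.

```python
import cmath

def dft_magnitudes(signal):
    """Magnitudes of all DFT coefficients of a real sequence."""
    n = len(signal)
    return [
        abs(sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                for t, x in enumerate(signal)))
        for k in range(n)
    ]

signal = [0.0, 1.0, 3.0, 2.0, 0.0, 0.0, 0.0, 0.0]
shifted = signal[-2:] + signal[:-2]  # circular shift by two frames

original_mags = dft_magnitudes(signal)
shifted_mags = dft_magnitudes(shifted)
max_diff = max(abs(a - b) for a, b in zip(original_mags, shifted_mags))
```

Here `max_diff` stays at rounding-error level: the shift changes only the phase of each coefficient, not its magnitude.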

In the implementation, each action is divided into a 4-level pyramid.


Random forest training

Features extracted from the training sets are used to train the random forests classifier, which is assembled from a set of randomized decision trees. In each decision tree, W segment features are randomly selected from the training sets and put at the root node, and are mapped to a set of terminating leaf nodes by the interior binary splitting nodes.

At each interior node, f variables are randomly selected out of the F feature dimensions, and the decision threshold T is chosen in the corresponding range. The splitting function is defined as:

To measure the training quality of each leaf node, that is, the proportion of segments from sequences of the same action falling into the same leaf node, the information gain is defined at each split node:

Information gain
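Since the formula itself is not reproduced in these notes, here is a sketch using the standard Shannon-entropy information gain for a binary threshold split on a single feature. Whether the paper uses exactly this form is an assumption; the function names, the toy samples and the action labels are hypothetical.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())

def split(samples, labels, feature_index, threshold):
    """Binary split: samples below the threshold go left, the rest go right."""
    left = [y for x, y in zip(samples, labels) if x[feature_index] < threshold]
    right = [y for x, y in zip(samples, labels) if x[feature_index] >= threshold]
    return left, right

def information_gain(parent, left, right):
    """Entropy of the parent minus the size-weighted entropy of the children."""
    n = len(parent)
    return (entropy(parent)
            - (len(left) / n) * entropy(left)
            - (len(right) / n) * entropy(right))

# Toy segment features with one dimension and two action classes.
samples = [(0.1,), (0.2,), (0.8,), (0.9,)]
labels = ["wave", "wave", "kick", "kick"]
left, right = split(samples, labels, feature_index=0, threshold=0.5)
gain = information_gain(labels, left, right)
```

A threshold that separates the two actions perfectly gives the maximum gain for this toy set (1 bit), while a threshold that leaves both children as mixed as the parent gives a gain of 0.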

In the testing stage, each segment
feature is pushed to the root node
of each decision tree in the random forests classifier, and eventually forwarded to a
terminating leaf node. The path
between a root node and a terminating leaf node consists
of a set of split nodes, and each split node contains
a binary splitting function.

When the segment feature drops into a terminating leaf node, a histogram P refers to the proportion of segments per class label that fell into this leaf node during the training stage, which is the soft voting result of that decision tree. Finally, the prediction histogram of the whole forest is acquired by summing up the voting histograms from all the decision trees:
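The soft voting step can be sketched as below: each tree contributes the class histogram stored at the leaf the segment reached, and the forest sums them. The function name `forest_predict`, the dictionary representation and the example numbers are assumptions for illustration.

```python
def forest_predict(tree_histograms):
    """Sum per-tree class histograms; return the summed histogram and the top label."""
    total = {}
    for hist in tree_histograms:
        for label, proportion in hist.items():
            total[label] = total.get(label, 0.0) + proportion
    best = max(total, key=total.get)
    return total, best

# Leaf histograms reached in three trees (hypothetical values).
votes = [
    {"wave": 0.7, "kick": 0.3},
    {"wave": 0.4, "kick": 0.6},
    {"wave": 0.9, "kick": 0.1},
]
summed, predicted = forest_predict(votes)
```

Unlike hard majority voting, a tree that is unsure (for example 0.4 against 0.6) contributes only a weak preference to the final histogram.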

Thanks to the added Fourier transform, the whole recognition system is really robust to noise.



http://ojs.academypublisher.com/index.php/jsw/article/view/jsw080922382245


Time: 2024-10-25 14:22:29
