3. Shared Decision Tree Context Clustering
3.1 Training of Average Voice Model
- A block diagram of the training stage of the average voice model using the proposed technique is shown in Fig. 2. First, context-dependent models are trained separately for each speaker, without context clustering; these speaker-dependent models are used to derive a decision tree for context clustering that is common to all of them. Then, the decision tree, which we refer to as a shared decision tree, is constructed from the speaker-dependent models using the algorithm described in Sect. 3.3. Next, all speaker-dependent models are clustered using the shared decision tree. A Gaussian pdf of the average voice model is obtained by combining all speakers’ Gaussian pdfs at every node of the tree. After the parameters of the average voice model are reestimated using the training data of all speakers, state duration distributions are obtained for each speaker. Finally, the state duration distributions of the average voice model are obtained by applying the same procedure.
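- The average voice model is trained using the proposed technique shown in Fig. 2.
- Context-dependent models are trained separately for each speaker, without context clustering.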
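
The step "combining all speakers' Gaussian pdfs at every node of the tree" can be illustrated with a small sketch. The excerpt above does not give the combining formula, so this example assumes a common choice: occupancy-weighted moment matching over diagonal-covariance Gaussians. The names `DiagGaussian` and `combine_gaussians` are hypothetical and used only for illustration.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class DiagGaussian:
    """Diagonal-covariance Gaussian plus a state occupancy count (hypothetical container)."""
    mean: List[float]
    var: List[float]
    occupancy: float


def combine_gaussians(pdfs: Sequence[DiagGaussian]) -> DiagGaussian:
    """Merge the speaker-dependent Gaussians at one tree node into a single Gaussian.

    Assumption: each speaker's pdf is weighted by its occupancy count, and the
    merged pdf matches the first and second moments of the weighted mixture.
    """
    if not pdfs:
        raise ValueError("need at least one Gaussian to combine")
    total = sum(p.occupancy for p in pdfs)
    if total <= 0:
        raise ValueError("total occupancy must be positive")
    dim = len(pdfs[0].mean)

    # Occupancy-weighted mean per dimension.
    mean = [sum(p.occupancy * p.mean[d] for p in pdfs) / total for d in range(dim)]
    # Occupancy-weighted second moment E[x^2] = var + mean^2 per dimension.
    second = [
        sum(p.occupancy * (p.var[d] + p.mean[d] ** 2) for p in pdfs) / total
        for d in range(dim)
    ]
    # Variance of the merged Gaussian: E[x^2] - (E[x])^2.
    var = [second[d] - mean[d] ** 2 for d in range(dim)]
    return DiagGaussian(mean=mean, var=var, occupancy=total)


# Two hypothetical speakers sharing one node of the tree.
speaker_a = DiagGaussian(mean=[0.0], var=[1.0], occupancy=1.0)
speaker_b = DiagGaussian(mean=[2.0], var=[1.0], occupancy=1.0)
node_pdf = combine_gaussians([speaker_a, speaker_b])
```

Under this assumption, the merged variance includes the spread between the speakers' means as well as each speaker's own variance, which is why the combined pdf is broader than either speaker-dependent pdf when the means differ.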
Time: 2024-11-08 05:03:13