AUGUSTUS, gene prediction, training

Genome annotation

First train AUGUSTUS, then run MAKER.

Website:

http://bioinf.uni-greifswald.de/augustus/

AUGUSTUS can be run online or locally.

Online training URL:

http://bioinf.uni-greifswald.de/webaugustus/training/create

You have to provide a species name (no spaces allowed!) and a genome file.

Requirements for the FASTA headers of the reference genome and cDNA files:

  • no whitespaces in the headers
  • no special characters in the headers (e.g. !#@&|;)
  • make the headers as short as possible
  • let headers not start with a number but with a letter
  • let headers contain letters and numbers only

In the following we give some header examples that will not cause problems:

>entry1
>contig1000
>est20
>scaffold239
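
The header rules above can be checked mechanically before uploading. This is a minimal sketch, assuming plain (uncompressed) FASTA files and a POSIX shell with awk. The function names `list_bad_headers` and `rename_fasta_headers` and the prefix argument are made up for illustration; they are not part of AUGUSTUS.

```shell
# Print every header line that breaks the rules above, i.e. anything
# that is not ">" followed by a letter and then only letters and digits.
list_bad_headers() {
    awk '/^>/ && $0 !~ /^>[A-Za-z][A-Za-z0-9]*$/' "$1"
}

# Replace every header with <prefix>1, <prefix>2, ... in file order.
# Sequence lines are passed through unchanged.
rename_fasta_headers() {
    awk -v prefix="$2" '/^>/ { n++; print ">" prefix n; next } { print }' "$1"
}
```

For example, `list_bad_headers Carya.fa` lists the offending headers, and `rename_fasta_headers Carya.fa contig > Carya.renamed.fa` writes a copy with safe headers. Renaming discards the original IDs, so keep the original file if they are needed later.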

Detailed online training tutorial:

http://bioinf.uni-greifswald.de/webaugustus/trainingtutorial.gsp

For online training, neither the genome file nor the cDNA file may exceed 100M, so it works for fungi. For plants, train locally instead:

/share/bioinfo/zhangxt/software/augustus-2.4/scripts/autoAug.pl --species=Carya --genome=../Carya.fa --cdna=../Carya_400cDNA.fa
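
Whether a data set is small enough for the online service can be decided from the file sizes. This is a rough sketch, assuming that "100M" means 100 × 1024 × 1024 bytes; that interpretation is an assumption, so check the WebAUGUSTUS page for the exact limit. The function names are hypothetical.

```shell
# Size of a file in bytes (tr strips the padding some wc versions add).
file_size_bytes() {
    wc -c < "$1" | tr -d ' '
}

# Succeeds if the file is no larger than the limit.
# $2 is an optional limit in bytes; the default of 104857600 is an ASSUMED
# reading of the "100M" limit mentioned above.
under_upload_limit() {
    limit="${2:-104857600}"
    [ "$(file_size_bytes "$1")" -le "$limit" ]
}
```

For example, `under_upload_limit Carya.fa && under_upload_limit Carya_400cDNA.fa && echo "online training is possible"`.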

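If autoAug.pl complains that the AUGUSTUS_CONFIG_PATH environment variable is not set, add it to ~/.bash_profile and reload the file: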

vi ~/.bash_profile

export AUGUSTUS_CONFIG_PATH=/share/bioinfo/zhangxt/software/augustus-2.4/config

source ~/.bash_profile
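
The same step can be scripted so that it is safe to run more than once. This is a minimal sketch: the function name `add_augustus_config_path` is hypothetical, and it only appends the export line when that exact line is not already in the profile file.

```shell
# Append "export AUGUSTUS_CONFIG_PATH=<config_dir>" to a profile file,
# unless that exact line is already present.
# $1 = profile file (e.g. ~/.bash_profile), $2 = AUGUSTUS config directory
add_augustus_config_path() {
    profile="$1"
    line="export AUGUSTUS_CONFIG_PATH=$2"
    # grep -F: fixed string, -x: match the whole line, -q: no output
    if ! grep -Fxq "$line" "$profile" 2>/dev/null; then
        printf '%s\n' "$line" >> "$profile"
    fi
}
```

For example, `add_augustus_config_path ~/.bash_profile /share/bioinfo/zhangxt/software/augustus-2.4/config`, followed by `source ~/.bash_profile`.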

freemao

FAFU

Date: 2024-07-30 05:50:22
