1 Installing and using BWA
Download and compile BWA
#tar -jxvf bwa-0.5.7.tar.bz2
#cd bwa-0.5.7
#make
BWA workflow
Index the database file in the FASTA format
Find the suffix array (SA) coordinates of good hits of each individual read
Convert SA coordinates to chromosomal coordinates and pair reads
Prepare the input data
Reference genome data (*.fa)
NGS Short reads data (*.fastq)
Build the index
#bwa index reference.fa
Find the SA coordinates
#bwa aln reference.fa leftRead.fastq > leftRead.sai
#bwa aln reference.fa rightRead.fastq > rightRead.sai
To run the command with multiple threads (the example also passes -c, which applies only to color-space reads):
#./bwa aln -c -t 3 -f leftreads.sai reference.fa leftreads.fastq
Options
* -f file: file to write output to instead of stdout
* -c: input sequences are in the color space
* -t num: number of threads (default: 1)
Convert the SA coordinates
#bwa sampe reference.fa leftRead.sai rightRead.sai leftRead.fastq rightRead.fastq > human.sam
Generate alignments in the SAM format given single-end reads
#./bwa samse -f leftreads.sam reference.fa leftreads.sai leftreads.fastq
#./bwa samse -f rightreads.sam reference.fa rightreads.sai rightreads.fastq
Options
* -f file: output file
* -n num: maximum number of alignments to output in the XA tag for reads paired properly (default: 3)
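The paired-end steps above (index, aln for each read file, then sampe) can be collected into one small helper. The following is only a sketch: the function names paired_end_commands and as_shell are made up for illustration, the .sai file names are assumed to be derived from the FASTQ names, and nothing is executed; the helper only builds the command lines so they can be inspected first.

```python
import shlex
from pathlib import Path


def paired_end_commands(reference, left_fastq, right_fastq, out_sam, threads=1):
    """Build the BWA command lines for one paired-end run.

    Returns a list of (argv, stdout_path) pairs. stdout_path is None when
    the command's standard output is not redirected to a file.
    Hypothetical helper: it only assembles the commands shown in the text.
    """
    # Assumption: each .sai file sits next to its FASTQ file.
    left_sai = str(Path(left_fastq).with_suffix(".sai"))
    right_sai = str(Path(right_fastq).with_suffix(".sai"))
    aln = ["bwa", "aln", "-t", str(threads), reference]
    return [
        (["bwa", "index", reference], None),
        (aln + [left_fastq], left_sai),
        (aln + [right_fastq], right_sai),
        (["bwa", "sampe", reference, left_sai, right_sai,
          left_fastq, right_fastq], out_sam),
    ]


def as_shell(commands):
    """Render (argv, stdout_path) pairs as shell command lines."""
    lines = []
    for argv, stdout_path in commands:
        line = shlex.join(argv)
        if stdout_path is not None:
            line += " > " + shlex.quote(stdout_path)
        lines.append(line)
    return lines
```

Printing as_shell(paired_end_commands("reference.fa", "leftRead.fastq", "rightRead.fastq", "human.sam", threads=3)) gives the four commands in the order they should be run. To actually run them, each argv could be passed to subprocess.run with stdout opened on stdout_path.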
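SAM output (BWA alignment results)
Each line holds the alignment result of one read, in 12 tab-separated fields (11 mandatory fields plus the optional fields):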
1 QNAME Query (pair) NAME
2 FLAG bitwise FLAG
3 RNAME Reference sequence NAME
4 POS 1-based leftmost POSition/coordinate of clipped sequence
5 MAPQ MAPping Quality (Phred-scaled)
6 CIGAR extended CIGAR string
7 MRNM Mate Reference sequence NaMe (‘=’ if same as RNAME)
8 MPOS 1-based Mate POSition
9 ISIZE Inferred insert SIZE
10 SEQ query SEQuence on the same strand as the reference
11 QUAL query QUALity (ASCII-33 gives the Phred base quality)
12 OPT variable OPTional fields in the format TAG:VTYPE:VALUE
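To make the field layout concrete, here is a minimal sketch that splits one SAM line into the fields listed above, decodes the bitwise FLAG, and converts QUAL to Phred scores by subtracting 33. The names SamRecord, parse_sam_line, decode_flag and phred_scores are illustrative, not part of BWA, and the FLAG bit meanings are taken from the SAM format rather than from the text above.

```python
from typing import List, NamedTuple


class SamRecord(NamedTuple):
    qname: str
    flag: int
    rname: str
    pos: int
    mapq: int
    cigar: str
    mrnm: str
    mpos: int
    isize: int
    seq: str
    qual: str
    opt: List[str]  # optional TAG:VTYPE:VALUE fields, left unparsed here


# FLAG bits as defined by the SAM format (lowest bit first).
FLAG_BITS = [
    (0x1, "paired"),
    (0x2, "proper_pair"),
    (0x4, "unmapped"),
    (0x8, "mate_unmapped"),
    (0x10, "reverse_strand"),
    (0x20, "mate_reverse_strand"),
    (0x40, "first_in_pair"),
    (0x80, "second_in_pair"),
]


def parse_sam_line(line):
    """Split one alignment line (not an @ header line) into its fields."""
    cols = line.rstrip("\n").split("\t")
    if len(cols) < 11:
        raise ValueError("expected at least 11 tab-separated fields")
    return SamRecord(
        cols[0], int(cols[1]), cols[2], int(cols[3]), int(cols[4]), cols[5],
        cols[6], int(cols[7]), int(cols[8]), cols[9], cols[10], cols[11:],
    )


def decode_flag(flag):
    """Return the names of the FLAG bits that are set."""
    return [name for bit, name in FLAG_BITS if flag & bit]


def phred_scores(qual):
    """Convert the QUAL string to Phred scores (ASCII code minus 33)."""
    return [ord(ch) - 33 for ch in qual]
```

For a made-up record with FLAG 83, decode_flag reports a properly paired first read aligned to the reverse strand; a QUAL character of I (ASCII 73) corresponds to Phred 40.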
Field 12 records the alignment details, using the following tags:
NM Edit distance
MD Mismatching positions/bases
AS Alignment score
BC Barcode sequence
X0 Number of best hits
X1 Number of suboptimal hits found by BWA
XN Number of ambiguous bases in the reference
XM Number of mismatches in the alignment
XO Number of gap opens
XG Number of gap extensions
XT Type: Unique/Repeat/N/Mate-sw
XA Alternative hits; format: (chr,pos,CIGAR,NM;)*
XS Suboptimal alignment score
XF Support from forward/reverse alignment
XE Number of supporting seeds
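The tags above arrive as TAG:VTYPE:VALUE strings, so they are easy to turn into a dictionary. The sketch below does that and also unpacks the XA value into individual hits. The function names parse_tags and parse_xa are hypothetical, and the code assumes that the position in each XA entry carries a leading + or - sign for the strand, which is how BWA normally writes it but is not stated in the list above.

```python
def parse_tags(opt_fields):
    """Turn ['NM:i:1', 'XT:A:U', ...] into {'NM': 1, 'XT': 'U', ...}."""
    tags = {}
    for field in opt_fields:
        # Split at most twice so a value containing ':' stays intact.
        tag, vtype, value = field.split(":", 2)
        if vtype == "i":
            value = int(value)
        elif vtype == "f":
            value = float(value)
        tags[tag] = value
    return tags


def parse_xa(xa_value):
    """Split an XA value of the form (chr,pos,CIGAR,NM;)* into tuples.

    Each hit is returned as (chrom, strand, pos, cigar, nm).
    Assumption: pos is signed, with '-' meaning the reverse strand.
    """
    hits = []
    for entry in xa_value.split(";"):
        if not entry:
            continue  # skip the empty piece after the trailing ';'
        chrom, pos, cigar, nm = entry.split(",")
        strand = "-" if pos.startswith("-") else "+"
        hits.append((chrom, strand, abs(int(pos)), cigar, int(nm)))
    return hits
```

With a made-up record, parse_tags(["XT:A:R", "X0:i:3", "XA:Z:chr2,+1000,36M,1;chr3,-2000,36M,2;"]) yields integer values for the i-typed tags, and passing the resulting "XA" value to parse_xa lists the two alternative hits.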
2 Installing and using Bowtie