39、count_rpkm_fpkm_TPM

参考:https://f1000research.com/articles/4-1521/v1

https://www.biostars.org/p/171766/

http://www.rna-seqblog.com/rpkm-fpkm-and-tpm-clearly-explained/

It used to be when you did RNA-seq, you reported your results in RPKM (Reads Per Kilobase Million) or FPKM (Fragments Per Kilobase Million). However, TPM (Transcripts Per Kilobase Million) is now becoming quite popular.

============================fpkm====================================

rate = geneA_count / geneA_length

fpkm = rate / (sum(gene*_count) /10^6)

即: fpkm = 10^6 * (geneA_count / geneA_length)  /  sum(gene*_length)   ##sum(gene*_length) 没有标准化处理的所有基因的count总和。

============================TPM====================================

rate = geneA_count / geneA_length

tpm = rate / (sum(rate) /10^6)

即: tpm = 10^6 * (geneA_count / geneA_length)  /  sum(rate)   ##sum(gene*_length)

====================================================================

These three metrics attempt to normalize for sequencing depth and gene length. Here’s how you do it for RPKM:

  1. Count up the total reads in a sample and divide that number by 1,000,000 – this is our “per million” scaling factor.
  2. Divide the read counts by the “per million” scaling factor. This normalizes for sequencing depth, giving you reads per million (RPM)
  3. Divide the RPM values by the length of the gene, in kilobases. This gives you RPKM.

FPKM is very similar to RPKM. RPKM was made for single-end RNA-seq, where every read corresponded to a single fragment that was sequenced. FPKM was made for paired-end RNA-seq. With paired-end RNA-seq, two reads can correspond to a single fragment, or, if one read in the pair did not map, one read can correspond to a single fragment. The only difference between RPKM and FPKM is that FPKM takes into account that two reads can map to one fragment (and so it doesn’t count this fragment twice).

TPM is very similar to RPKM and FPKM. The only difference is the order of operations. Here’s how you calculate TPM:

  1. Divide the read counts by the length of each gene in kilobases. This gives you reads per kilobase (RPK).
  2. Count up all the RPK values in a sample and divide this number by 1,000,000. This is your “per million” scaling factor.
  3. Divide the RPK values by the “per million” scaling factor. This gives you TPM.

So you see, when calculating TPM, the only difference is that you normalize for gene length first, and then normalize for sequencing depth second. However, the effects of this difference are quite profound.

When you use TPM, the sum of all TPMs in each sample are the same. This makes it easier to compare the proportion of reads that mapped to a gene in each sample. In contrast, with RPKM and FPKM, the sum of the normalized reads in each sample may be different, and this makes it harder to compare samples directly.

Here’s an example. If the TPM for gene A in Sample 1 is 3.33 and the TPM in sample B is 3.33, then I know that the exact same proportion of total reads mapped to gene A in both samples. This is because the sum of the TPMs in both samples always add up to the same number (so the denominator required to calculate the proportions is the same, regardless of what sample you are looking at.)

With RPKM or FPKM, the sum of normalized reads in each sample can be different. Thus, if the RPKM for gene A in Sample 1 is 3.33 and the RPKM in Sample 2 is 3.33, I would not know if the same proportion of reads in Sample 1 mapped to gene A as in Sample 2. This is because the denominator required to calculate the proportion could be different for the two samples.

原文地址:https://www.cnblogs.com/renping/p/9206517.html

时间: 2024-10-18 14:22:16

39、count_rpkm_fpkm_TPM的相关文章

【足迹C++primer】39、动态内存与智能指针(3)

动态内存与智能指针(3) /** * 功能:动态内存与智能指针 * 时间:2014年7月8日15:33:58 * 作者:cutter_point */ #include<iostream> #include<vector> #include<memory> #include<string> using namespace std; /** 智能指针和异常 */ void f() { shared_ptr<int> sp(new int(42));

【足迹C++primer】39、动态内存与智能指针(2)

动态内存与智能指针(2) 直接管理内存 void fun1() { //此new表达式在自由空间构造一个int型对象,并返回指向该对象的指针 int *pi1=new int; //pi指向一个动态分配.未初始化的无名对象 string *ps3=new string; //初始化为空string int *pi2=new int; //pi指向一个未初始化的int int *pi3=new int(1024); //pi指向的对象的值为1024 string *ps4=new string(1

[ASP.NET MVC] 产生一维条码Barcode(Code 39、Code128、ISBN)

[ASP.NET MVC]Barcode 产生一维条码(Code 39.Code128.ISBN) 最近项目刚好要产生Code39一维条码,找到了这个Library BarcodeLib 支持多种(Code 128.Code 11.Code 39..等等) (图节录自http://www.codeproject.com/Articles/20823/Barcode-Image-Generation-Library) 说明: 1.先下载BarcodeLib 2.把BarcodeLib.dll加入参

QT学习-核心类列表-39、QtX11Extras 40、QtXml

39    -    QtX11Extras    -    Provides classes for developing for the X11 platform40    -    QtXml    -    Qt XML module provides C++ implementations of the SAX and DOM standards for XML    QDomAttr    -    Represents one attribute of a QDomElement 

39、重新复习js之三

1.盒子模型典型标签 <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd"> <html> <head> <meta http-equiv="Content-Type" content="text/html; charset=UTF-8"

39、apk瘦身(转载)

本文转自::Android开发中文站 » 关于APK瘦身值得分享的一些经验 从APK的文件结构说起 APK在安装和更新之前都需要经过网络将其下载到手机,如果APK越大消耗的流量就会越多,特别是对于使用移动网络的用户来讲,消耗流量越多就代表需要花更多的钱去购买流量.同时一些第三方应用商城也会对上传的APK大小有限制,所以为了能够让产品能够更受商城和用户欢迎,APK瘦身是第一步,更小的APK标示着更多地用户愿意去下载和体验. 为了能够减小APK的大小,首先需要知道APK由哪些部分构成,然后针对每个部

ASCII码对应表chr(9)、chr(10)、chr(13)、chr(32)、chr(34)、chr(39)、chr(..

chr(9) tab空格       chr(10) 换行      chr(13) 回车        Chr(13)&chr(10) 回车换行       chr(32) 空格符       chr(34) 双引号       chr(39) 单引号 chr(33) !        chr(34) "        chr(35) #        chr(36) $        chr(37) %        chr(38) &        chr(39) '   

39、集合线程安全问题

由于Set.List和Map都是线程不安全的,为了同步控制,Collections类提供了多个synchronizedXxx()方法,该方法可以将指定集合包装成线程同步的集合,从而可以解决多线程并发访问集合时的线程安全问题,例如: 1 public class Test { 2 3 public static void main(String[] args) { 4 5 Collection c=Collections.synchronizedCollection(new ArrayList()

(转) ASCII码对应表chr(9)、chr(10)、chr(13)、chr(32)、chr(34)、chr(39)、chr(

chr(9) tab空格       chr(10) 换行      chr(13) 回车        Chr(13)&chr(10) 回车换行       chr(32) 空格符       chr(34) 双引号       chr(39) 单引号 chr(33) !        chr(34) "        chr(35) #        chr(36) $        chr(37) %        chr(38) &        chr(39) '