De Bruijn Graphs for Alternative Splicing and Repetitive Regions

I stumbled upon this series of articles; once this busy spell is over, I plan to translate all of them!

==============================================

Original article link: http://www.homolog.us/blogs/blog/2011/08/09/de-bruijn-graphs-for-alternative-splicing-and-repetitive-regions/


Today we shall examine de Bruijn graphs for two structures that occur
frequently in genomes and transcriptomes. The reason for studying them together
will become apparent by the end of this post.

Let us first construct a graph for two alternatively spliced transcripts, A and
B, of a gene. The regions shown in yellow and red are transcribed in both
isoforms, whereas the green region is present only in A.

The de Bruijn graph is shown in circle-and-arrow format, and the paths for the
two transcripts are marked by dotted lines. We shall explain the graph
construction qualitatively instead of going into nucleotide-level detail. We
recommend that you pick your favorite gene and do the detailed construction
yourself by following the rules explained earlier.

For large parts of the yellow and red regions, the K-mers are common to both
transcripts, so both transcripts pass through the same sets of nodes in the de
Bruijn graph. The green region of A generates many new K-mers, which follow a
path similar to the upper blue branch of the graph. It is important to note
that B also generates K-mers not present in A: the junction K-mers that span
the boundary between the yellow and red regions. Hence, the path of B follows
the lower blue branch.
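
The K-mer argument above can be checked on a toy example. The following is a minimal Python sketch, not code from the original post; the three region sequences, the choice k = 5, and the helper name `kmers` are all made up for illustration.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings (K-mers) of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

# Hypothetical toy regions (assumed for illustration only).
yellow = "ATGGCGTACC"
green = "TTGACCGATA"
red = "GGCTAAGTCA"
k = 5

transcript_a = yellow + green + red   # isoform A keeps the green region
transcript_b = yellow + red           # isoform B skips it

kmers_a = kmers(transcript_a, k)
kmers_b = kmers(transcript_b, k)

shared = kmers_a & kmers_b   # nodes visited by both transcripts
only_a = kmers_a - kmers_b   # upper branch: K-mers touching the green region
only_b = kmers_b - kmers_a   # lower branch: yellow/red junction K-mers

# The K-mers of B that straddle the yellow/red boundary.
junction_b = {transcript_b[i:i + k]
              for i in range(len(yellow) - k + 1, len(yellow))}

print("shared:", len(shared))
print("only in A:", len(only_a))
print("only in B:", sorted(only_b))
```

With these particular sequences, the K-mers unique to B are exactly the k - 1 K-mers that straddle the yellow/red boundary. In real data a junction K-mer could also occur elsewhere by chance, so this is a property of the toy example rather than a general guarantee.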

Next, we construct the de Bruijn graph for a repetitive segment of a genome. In
the following figure, the two green regions are identical, and the yellow, blue
and red regions are all distinct.

The de Bruijn graph for the segment is also shown, with the nodes for different
regions marked in their respective colors. The coloring is somewhat simplified,
because it paints the various junction K-mers with single colors. However, the
topology of the construction is accurate, and we recommend that you pick a
simple example and try the construction yourself.
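
As one way to try such a simple example, the sketch below counts K-mer occurrences in a toy repetitive segment. It is an illustration under assumed inputs: the four region sequences, k = 5, and the helper name `kmer_list` are hypothetical and do not come from the original post.

```python
from collections import Counter

def kmer_list(seq, k):
    """Return the K-mers of seq in order, keeping duplicates."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]

# Hypothetical toy regions (assumed for illustration only).
yellow = "ATGGCGTACC"
green = "TTGACCGATA"
blue = "CAGTTCAGGA"
red = "GGCTAAGTCA"
k = 5

# Layout described in the text: the green region occurs twice.
genome = yellow + green + blue + green + red

positions = kmer_list(genome, k)
counts = Counter(positions)

# K-mers seen more than once collapse into a single node of the graph.
repeated = {kmer for kmer, n in counts.items() if n > 1}
green_kmers = set(kmer_list(green, k))

print("K-mer positions in the genome:", len(positions))
print("distinct K-mers (graph nodes):", len(counts))
print("all green K-mers are repeated:", green_kmers <= repeated)
```

Because both copies of the green region produce the same K-mers, the graph has fewer nodes than the genome has K-mer positions, and the path of the genome visits the green nodes twice.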

From a cursory look at the above figures, you may think that the de Bruijn
graphs of alternatively spliced genes and repetitive segments are identical.
Are they?

Please pay close attention to the direction of the arrows and you will see the
difference. De Bruijn graphs are directed graphs, in which flipping an arrow
can completely change the meaning of the graph. For alternative splicing, all
arrows go from left to right. For the repetitive structure, the arrows
connecting the blue circles in the figure go from right to left.
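
The difference in arrow direction can be stated as a graph property: the splicing graph has no directed cycle, while the repeat graph does. The sketch below checks this on toy data. It assumes a simplified graph in which nodes are K-mers and an arrow joins each pair of consecutive K-mers in a sequence; the sequences, k = 5, and the function names `de_bruijn_edges` and `has_cycle` are hypothetical, and the cycle check uses Kahn's topological-sort algorithm, which the original post does not mention.

```python
from collections import defaultdict, deque

def de_bruijn_edges(seqs, k):
    """Return directed edges (kmer, next_kmer) for consecutive K-mers."""
    edges = set()
    for s in seqs:
        for i in range(len(s) - k):
            edges.add((s[i:i + k], s[i + 1:i + k + 1]))
    return edges

def has_cycle(edges):
    """Kahn's algorithm: a directed graph has a cycle if and only if
    some node can never be removed as a zero-in-degree node."""
    succ = defaultdict(set)
    indeg = defaultdict(int)
    nodes = set()
    for u, v in edges:
        nodes.update((u, v))
        succ[u].add(v)
        indeg[v] += 1
    queue = deque(n for n in nodes if indeg[n] == 0)
    removed = 0
    while queue:
        u = queue.popleft()
        removed += 1
        for v in succ[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                queue.append(v)
    return removed < len(nodes)

# Hypothetical toy regions (assumed for illustration only).
yellow, green = "ATGGCGTACC", "TTGACCGATA"
blue, red = "CAGTTCAGGA", "GGCTAAGTCA"
k = 5

# Two isoforms of one gene versus one genome with a repeated green region.
splicing_graph = de_bruijn_edges([yellow + green + red, yellow + red], k)
repeat_graph = de_bruijn_edges([yellow + green + blue + green + red], k)

print("splicing graph has a cycle:", has_cycle(splicing_graph))
print("repeat graph has a cycle:", has_cycle(repeat_graph))
```

In this toy setup, the path leaving the first green copy runs through the blue K-mers and re-enters the green nodes, which is what creates the backward arrows and the loop.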

Another interesting observation: the first graph can be uniquely resolved into
structures A and B, but the second graph cannot. For example, the following
repetitive genomic segment has the same de Bruijn graph structure as the one
considered earlier. Therefore, the graph shown here can resolve to many
possible structures in nucleotide space. This multiplicity arises from the
presence of a loop in the de Bruijn graph.
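
This ambiguity can be demonstrated with a small sketch. As one assumed example of such a segment, the second genome below goes around the loop one extra time (green, blue, green, blue, green). The sequences, k = 5, and the function name `de_bruijn_edges` are hypothetical, and the graph is the simplified one without edge multiplicities, where nodes are K-mers and arrows join consecutive K-mers.

```python
def de_bruijn_edges(seq, k):
    """Return the set of directed edges between consecutive K-mers of seq."""
    return {(seq[i:i + k], seq[i + 1:i + k + 1])
            for i in range(len(seq) - k)}

# Hypothetical toy regions (assumed for illustration only).
yellow, green = "ATGGCGTACC", "TTGACCGATA"
blue, red = "CAGTTCAGGA", "GGCTAAGTCA"
k = 5

# Two different genomes: the second traverses the loop one extra time.
genome_1 = yellow + green + blue + green + red
genome_2 = yellow + green + blue + green + blue + green + red

graph_1 = de_bruijn_edges(genome_1, k)
graph_2 = de_bruijn_edges(genome_2, k)

print("genomes differ:", genome_1 != genome_2)
print("graphs are identical:", graph_1 == graph_2)
```

Both genomes produce the same set of nodes and arrows, so the graph alone (without coverage or multiplicity information) cannot tell how many times the loop was traversed.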


Time: 2024-10-30 09:25:26
