Chromosome coordinate systems: 0-based, 1-based

From:

https://arnaudceol.wordpress.com/2014/09/18/chromosome-coordinate-systems-0-based-1-based/

I had a hard time figuring out that different websites and file formats use different systems to represent genome coordinates.

Basically, the bases can be numbered in two ways: starting at 0 or starting at 1. These are the 0-based and 1-based coordinate systems.

0-based:

ACTGACTG
01234567

1-based:

ACTGACTG
12345678

A system is then called inclusive if the last index is part of the sequence, or exclusive if it is not.

For instance, to represent the sequence TGAC:

0-based inclusive: 2-5
1-based inclusive: 3-6
1-based exclusive: 3-7
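
The three notations above can be checked with a short Python sketch. The function names are made up for this illustration. Python slices are themselves 0-based and exclusive, so each helper only shifts the indices before slicing.

```python
SEQ = "ACTGACTG"

def slice_zero_based_inclusive(seq, start, end):
    # 0-based, end included: extend the end by 1 for Python's exclusive slice.
    return seq[start:end + 1]

def slice_one_based_inclusive(seq, start, end):
    # 1-based, end included: shift the start down by 1, keep the end.
    return seq[start - 1:end]

def slice_one_based_exclusive(seq, start, end):
    # 1-based, end excluded: shift both indices down by 1.
    return seq[start - 1:end - 1]

print(slice_zero_based_inclusive(SEQ, 2, 5))  # TGAC
print(slice_one_based_inclusive(SEQ, 3, 6))   # TGAC
print(slice_one_based_exclusive(SEQ, 3, 7))   # TGAC
```

All three calls end up taking the same native slice, SEQ[2:6], which is the 0-based exclusive form of the same interval.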

I’ve tried to figure out which websites and applications use which
coordinate system. The results can be found below. For each source, I
provide the URL of the reference website where I found the information,
and an excerpt in which the system is described.

I found most of those links on Biostar (https://www.biostars.org/p/6373/) and on the blog of Casey M. Bergman (http://bergmanlab.smith.man.ac.uk/?p=36), who also wrote an article on this topic: https://www.landesbioscience.com/journals/mge/article/19479/.

Question:
“I am confused about the start coordinates for items in the refGene
table. It looks like you need to add “1” to the starting point in order
to get the same start coordinate as is shown by the Genome Browser. Why
is this the case?”

Response:
Our internal database representations of coordinates always have a
zero-based start and a one-based end. We add 1 to the start before
displaying coordinates in the Genome Browser. Therefore, they appear as
one-based start, one-based end in the graphical display. The refGene.txt file is a database file, and consequently is based on the internal representation.

We use this particular internal representation because it
simplifies coordinate arithmetic, i.e. it eliminates the need to add or
subtract 1 at every step. Unfortunately, it does create some confusion
when the internal representation is exposed or when we forget to add 1
before displaying a start coordinate. However, it saves us from much
trickier bugs. If you use a database dump file but would prefer to see
the one-based start coordinates, you will always need to add 1 to each
start coordinate.

If you submit data to the browser in position format
(chr#:##-##), the browser assumes this information is 1-based. If you
submit data in any other format (BED (chr# ## ##) or otherwise), the
browser will assume it is 0-based. You can see this both in our liftOver
utility and in our search bar, by entering the same numbers in position
or BED format and observing the results. Similarly, any data returned
by the browser in position format is 1-based, while data returned in BED
format is 0-based.
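
The rule described in this response (add 1 to the start, leave the end alone) can be sketched in Python as follows. The function names and the regular expression are assumptions made for this illustration, not part of any browser tool; the parser also strips thousands separators, on the assumption that position strings may contain commas.

```python
import re

def bed_to_position(chrom, chrom_start, chrom_end):
    """Convert a 0-based start / 1-based end interval to a 1-based position string."""
    # Only the start moves; the end has the same value in both systems.
    return f"{chrom}:{chrom_start + 1}-{chrom_end}"

def position_to_bed(position):
    """Convert a 1-based 'chrom:start-end' string to (chrom, start, end) with a 0-based start."""
    match = re.fullmatch(r"([^:]+):(\d+)-(\d+)", position.replace(",", ""))
    if match is None:
        raise ValueError(f"not a chrom:start-end position: {position!r}")
    chrom = match.group(1)
    start = int(match.group(2))
    end = int(match.group(3))
    return chrom, start - 1, end

print(bed_to_position("chr1", 0, 100))  # chr1:1-100
print(position_to_bed("chr1:1-100"))    # ('chr1', 0, 100)
```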

 

BED format uses zero-based, half-open coordinates, so the first 25 bases
of a sequence are in the range 0-25 (those bases being numbered 0 to 24).

The first three required BED fields are:

chrom – The name of the chromosome (e.g. chr3, chrY, chr2_random) or scaffold (e.g. scaffold10671).

chromStart – The starting position of the feature in the chromosome or scaffold. The first base in a chromosome is numbered 0.

chromEnd – The ending position of the feature in the chromosome or scaffold. The chromEnd base is not included in the display of the feature. For example, the first 100 bases of a chromosome are defined as chromStart=0, chromEnd=100, and span the bases numbered 0-99.
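
One practical consequence of the half-open convention is that a feature's length is simply chromEnd minus chromStart, with no correction of 1. A minimal Python sketch, assuming a whitespace-separated line holding only the three required fields (the helper names are hypothetical):

```python
def parse_bed3(line):
    """Return (chrom, chromStart, chromEnd) from the first three BED fields."""
    chrom, chrom_start, chrom_end = line.split()[:3]
    return chrom, int(chrom_start), int(chrom_end)

def feature_length(chrom_start, chrom_end):
    # Half-open interval: the length is the plain difference.
    return chrom_end - chrom_start

def covered_bases(chrom_start, chrom_end):
    # 0-based numbers of the bases covered; chromEnd itself is excluded.
    return range(chrom_start, chrom_end)

chrom, start, end = parse_bed3("chr3\t0\t100")
bases = covered_bases(start, end)
print(feature_length(start, end))  # 100
print(bases[0], bases[-1])         # 0 99
```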

 

start: Lowest numeric position of the reported variant on the genomic reference sequence. Mutation start coordinate (1-based coordinate system).

end: Highest numeric genomic position of the reported variant on the genomic reference sequence. Mutation end coordinate (inclusive, 1-based coordinate system).

Time: 2024-11-09 09:37:07
