Exploring InnoDB Index Structure with innodb_ruby

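  innodb_ruby is an InnoDB file format parser written in Ruby. Its purpose is to expose InnoDB internals that are otherwise hidden.

  innodb_ruby is not suitable for production use, but it works well as a learning tool.

  • ①. Installation

  The installation steps below follow the blog of Wu Bingxi (吴炳锡) of Zhishutang (知数堂), a MySQL expert.

    Download

# wget https://cache.ruby-china.org/pub/ruby/ruby-1.9.3-p551.tar.gz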
--2017-01-19 14:06:35--  https://cache.ruby-china.org/pub/ruby/ruby-1.9.3-p551.tar.gz
Resolving cache.ruby-china.org... 183.61.64.66, 121.201.98.27, 2405:fd80:110:0:d63d:7eff:fe73:c46, ...
Connecting to cache.ruby-china.org|183.61.64.66|:443... connected.
ERROR: certificate common name “*.b0.upaiyun.com” doesn’t match requested host name “cache.ruby-china.org”.
To connect to cache.ruby-china.org insecurely, use ‘--no-check-certificate’.

# wget --no-check-certificate https://cache.ruby-china.org/pub/ruby/ruby-1.9.3-p551.tar.gz
--2017-01-19 14:07:31--  https://cache.ruby-china.org/pub/ruby/ruby-1.9.3-p551.tar.gz
Resolving cache.ruby-china.org... 121.201.98.27, 183.61.64.66, 2405:fd80:110:0:d63d:7eff:fe73:165a, ...
Connecting to cache.ruby-china.org|121.201.98.27|:443... connected.
WARNING: certificate common name “*.b0.upaiyun.com” doesn’t match requested host name “cache.ruby-china.org”.
HTTP request sent, awaiting response... 200 OK
Length: 12605119 (12M) [application/octet-stream]
Saving to: “ruby-1.9.3-p551.tar.gz”

100%[=================================================================================================================>] 12,605,119   287K/s   in 47s     

2017-01-19 14:08:23 (262 KB/s) - “ruby-1.9.3-p551.tar.gz” saved [12605119/12605119]

    Install dependencies

# yum -y install zlib-devel curl-devel openssl-devel httpd-devel apr-devel apr-util-devel

    Extract

# tar -zxvf ruby-1.9.3-p551.tar.gz

    Configure and install

# cd ruby-1.9.3-p551
# ./configure
# make && make install
# gem install innodb_ruby

  • ②. Using innodb_ruby

    As the saying goes, a craftsman must first sharpen his tools, so read the help before using innodb_ruby. Wu Bingxi's blog also links to the author's GitHub repository, but the most detailed usage reference is the output of --help:

# innodb_space --help

Usage: innodb_space <options> <mode>

Invocation examples:

  innodb_space -s ibdata1 [-T tname [-I iname]] [options] <mode>
    Use ibdata1 as the system tablespace and load the tname table (and the
    iname index for modes that require it) from data located in the system
    tablespace data dictionary. This will automatically generate a record
    describer for any indexes.

  innodb_space -f tname.ibd [-r ./desc.rb -d DescClass] [options] <mode>
    Use the tname.ibd table (and the DescClass describer where required).

The following options are supported:

  --help, -?
    Print this usage text.

  --trace, -t
    Enable tracing of all data read. Specify twice to enable even more
    tracing (including reads during opening of the tablespace) which can
    be quite noisy.

  --system-space-file, -s <arg>
    Load the system tablespace file or files <arg>: Either a single file e.g.
    "ibdata1", a comma-delimited list of files e.g. "ibdata1,ibdata1", or a
    directory name. If a directory name is provided, it will be scanned for all
    files named "ibdata?" which will then be sorted alphabetically and used to
    load the system tablespace.

  --table-name, -T <name>
    Use the table name <name>.

  --index-name, -I <name>
    Use the index name <name>.

  --space-file, -f <file>
    Load the tablespace file <file>.

  --page, -p <page>
    Operate on the page <page>.

  --level, -l <level>
    Operate on the level <level>.

  --list, -L <list>
    Operate on the list <list>.

  --require, -r <file>
    Use Ruby's "require" to load the file <file>. This is useful for loading
    classes with record describers.

  --describer, -d <describer>
    Use the named record describer to parse records in index pages.

The following modes are supported:

  system-spaces
    Print a summary of all spaces in the system.

  data-dictionary-tables
    Print all records in the SYS_TABLES data dictionary table.

  data-dictionary-columns
    Print all records in the SYS_COLUMNS data dictionary table.

  data-dictionary-indexes
    Print all records in the SYS_INDEXES data dictionary table.

  data-dictionary-fields
    Print all records in the SYS_FIELDS data dictionary table.

  space-summary
    Summarize all pages within a tablespace. A starting page number can be
    provided with the --page/-p argument.

  space-index-pages-summary
    Summarize all "INDEX" pages within a tablespace. This is useful to analyze
    page fill rates and record counts per page. In addition to "INDEX" pages,
    "ALLOCATED" pages are also printed and assumed to be completely empty.
    A starting page number can be provided with the --page/-p argument.

  space-index-pages-free-plot
    Use Ruby's gnuplot module to produce a scatterplot of page free space for
    all "INDEX" and "ALLOCATED" pages in a tablespace. More aesthetically
    pleasing plots can be produced with space-index-pages-summary output,
    but this is a quick and easy way to produce a passable plot. A starting
    page number can be provided with the --page/-p argument.

  space-page-type-regions
    Summarize all contiguous regions of the same page type. This is useful to
    provide an overall view of the space and allocations within it. A starting
    page number can be provided with the --page/-p argument.

  space-page-type-summary
    Summarize all pages by type. A starting page number can be provided with
    the --page/-p argument.

  space-indexes
    Summarize all indexes (actually each segment of the indexes) to show
    the number of pages used and allocated, and the segment fill factor.

  space-lists
    Print a summary of all lists in a space.

  space-list-iterate
    Iterate through the contents of a space list.

  space-extents
    Iterate through all extents, printing the extent descriptor bitmap.

  space-extents-illustrate
    Iterate through all extents, illustrating the extent usage using ANSI
    color and Unicode box drawing characters to show page usage throughout
    the space.

  space-extents-illustrate-svg
    Iterate through all extents, illustrating the extent usage in SVG format
    printed to stdout to show page usage throughout the space.

  space-lsn-age-illustrate
    Iterate through all pages, producing a heat map colored by the page LSN
    using ANSI color and Unicode box drawing characters, allowing the user to
    get an overview of page modification recency.

  space-lsn-age-illustrate-svg
    Iterate through all pages, producing a heat map colored by the page LSN
    producing SVG format output, allowing the user to get an overview of page
    modification recency.

  space-inodes-summary
    Iterate through all inodes, printing a short summary of each FSEG.

  space-inodes-detail
    Iterate through all inodes, printing a detailed report of each FSEG.

  index-recurse
    Recurse an index, starting at the root (which must be provided in the first
    --page/-p argument), printing the node pages, node pointers (links), leaf
    pages. A record describer must be provided with the --describer/-d argument
    to recurse indexes (in order to parse node pages).

  index-record-offsets
    Recurse an index as index-recurse does, but print the offsets of each
    record within the page.

  index-digraph
    Recurse an index as index-recurse does, but print a dot-compatible digraph
    instead of a human-readable summary.

  index-level-summary
    Print a summary of all pages at a given level (provided with the --level/-l
    argument) in an index.

  index-fseg-internal-lists
  index-fseg-leaf-lists
    Print a summary of all lists in an index file segment. Index root page must
    be provided with --page/-p.

  index-fseg-internal-list-iterate
  index-fseg-leaf-list-iterate
    Iterate the file segment list (whose name is provided in the first --list/-L
    argument) for internal or leaf pages for a given index (whose root page
    is provided in the first --page/-p argument). The lists used for each
    index are "full", "not_full", and "free".

  index-fseg-internal-frag-pages
  index-fseg-leaf-frag-pages
    Print a summary of all fragment pages in an index file segment. Index root
    page must be provided with --page/-p.

  page-dump
    Dump the contents of a page, using the Ruby pp ("pretty-print") module.

  page-account
    Account for a page's usage in FSEGs.

  page-validate
    Validate the contents of a page.

  page-directory-summary
    Summarize the record contents of the page directory in a page. If a record
    describer is available, the key of each record will be printed.

  page-records
    Summarize all records within a page.

  page-illustrate
    Produce an illustration of the contents of a page.

  record-dump
    Dump a detailed description of a record and the data it contains. A record
    offset must be provided with -R/--record.

  record-history
    Summarize the history (undo logs) for a record. A record offset must be
    provided with -R/--record.

  undo-history-summary
    Summarize all records in the history list (undo logs).

  undo-record-dump
    Dump a detailed description of an undo record and the data it contains.
    A record offset must be provided with -R/--record.

  • ③. Testing with innodb_ruby

    Create the test table

CREATE TABLE `t2` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `name` varchar(50) DEFAULT NULL,
  `remark` varchar(50) DEFAULT NULL,
  `add_time` datetime DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `ix_t2_name` (`name`)
) ENGINE=InnoDB AUTO_INCREMENT=3829 DEFAULT CHARSET=utf8;

    Insert the test data (the original data set is shared on Baidu Netdisk)
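    For readers without access to the original data set, rows of a similar shape can be generated. The following is only a sketch: the helper names, the value lengths, and the use of NOW() for add_time are assumptions made for illustration, not the author's data.

```python
import random
import string

# Hypothetical generator for t2 test rows; NOT the author's original data set.
ALPHABET = string.ascii_uppercase + string.digits


def random_text(rng, min_len=10, max_len=30):
    """Return a random uppercase/digit string that fits in varchar(50)."""
    length = rng.randint(min_len, max_len)
    return "".join(rng.choice(ALPHABET) for _ in range(length))


def insert_statements(count, seed=0):
    """Yield INSERT statements for t2; id is left to AUTO_INCREMENT."""
    rng = random.Random(seed)
    for _ in range(count):
        name = random_text(rng)
        remark = random_text(rng)
        yield (
            "INSERT INTO t2 (name, remark, add_time) "
            f"VALUES ('{name}', '{remark}', NOW());"
        )


statements = list(insert_statements(5))
for statement in statements:
    print(statement)
```

    The printed statements can be piped into the mysql client. A few thousand rows should be enough to make both indexes span more than one page.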

    Use innodb_space to inspect the index structure and page allocation of table t2. The commands below are run from the MySQL data directory.

    space-indexes: summarizes all indexes (actually each segment of each index), showing the number of pages used and allocated and the segment fill factor.

# innodb_space -s ibdata1 -T zeno3376/t2 space-indexes
id          name                            root        fseg        used        allocated   fill_factor
42          PRIMARY                         3           internal    1           1           100.00%
42          PRIMARY                         3           leaf        9           9           100.00%
43          ix_t2_name                      4           internal    1           1           100.00%
43          ix_t2_name                      4           leaf        4           4           100.00%   

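    name: the index name. PRIMARY is the clustered index: an InnoDB table is an index-organized table, so the rows themselves are stored in the clustered index. ix_t2_name is the secondary index.

    root: the page number of the index root. The clustered index root is page 3 (see space-page-type-regions below for why it starts at page 3), and the secondary index root is page 4.

    fseg: the file segment the row describes. Each index has two segments: internal for non-leaf pages and leaf for leaf pages.

    used: the number of pages the segment uses. The clustered index uses 1 internal page (the root) and 9 leaf pages; the secondary index ix_t2_name uses 1 internal page and 4 leaf pages.

    allocated: the number of pages allocated to the segment. Here every segment is allocated exactly as many pages as it uses: 1 and 9 for the clustered index, 1 and 4 for ix_t2_name.

    fill_factor: the segment fill factor, that is, used pages as a percentage of allocated pages. Every segment here is at 100%.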
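    The per-index page totals can be checked with a few lines of Python. This is a minimal sketch that parses the text output shown above; the function name is hypothetical, and it assumes the seven-column layout printed by this version of innodb_space.

```python
# Output of `innodb_space ... space-indexes`, copied from the run above.
SPACE_INDEXES = """\
id          name                            root        fseg        used        allocated   fill_factor
42          PRIMARY                         3           internal    1           1           100.00%
42          PRIMARY                         3           leaf        9           9           100.00%
43          ix_t2_name                      4           internal    1           1           100.00%
43          ix_t2_name                      4           leaf        4           4           100.00%
"""


def pages_per_index(text):
    """Sum the used pages of every segment, grouped by index name."""
    totals = {}
    for line in text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) != 7:
            continue  # ignore anything that is not a data row
        name, used = fields[1], int(fields[4])
        totals[name] = totals.get(name, 0) + used
    return totals


totals = pages_per_index(SPACE_INDEXES)
print(totals)                # {'PRIMARY': 10, 'ix_t2_name': 5}
print(sum(totals.values()))  # 15
```

    The total of 15 pages is the number of INDEX pages in the tablespace.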

    space-page-type-regions: summarizes all contiguous regions of the same page type, which gives an overall view of the space and the allocations within it. A starting page number can be provided with the --page/-p argument.

# innodb_space -s ibdata1 -T zeno3376/t2 space-page-type-regions
start       end         count       type
0           0           1           FSP_HDR
1           1           1           IBUF_BITMAP
2           2           1           INODE
3           17          15          INDEX
18          18          1           FREE (ALLOCATED)   

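    start: the first page number of the region

    end: the last page number of the region

    count: the number of pages in the region

    type: the page type

    The output shows that FSP_HDR, IBUF_BITMAP, and INODE occupy pages 0, 1, and 2 respectively. The pages that store data and indexes (INDEX) start at page 3 and occupy pages 3 through 17, 15 pages in total.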
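    The relationship between the columns (count = end - start + 1) can be verified directly from the output. The sketch below parses the text shown above; the function name is hypothetical.

```python
# Output of `innodb_space ... space-page-type-regions`, copied from above.
REGIONS = """\
start       end         count       type
0           0           1           FSP_HDR
1           1           1           IBUF_BITMAP
2           2           1           INODE
3           17          15          INDEX
18          18          1           FREE (ALLOCATED)
"""


def parse_regions(text):
    """Return (start, end, count, type) tuples, one per region row."""
    regions = []
    for line in text.splitlines()[1:]:  # skip the header row
        # The type may contain a space, so split only the first three columns.
        start, end, count, page_type = line.split(None, 3)
        regions.append((int(start), int(end), int(count), page_type.strip()))
    return regions


regions = parse_regions(REGIONS)
for start, end, count, page_type in regions:
    # Each region is a contiguous, inclusive page range.
    assert count == end - start + 1

index_pages = sum(count for _, _, count, t in regions if t == "INDEX")
print(index_pages)  # 15
```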

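    Next, starting from the root pages of the clustered and secondary indexes, we inspect the other pages of each index.

    page-records: summarizes all records within a page.

    The table is index-organized, so the clustered index is parsed by default. Do not add -I primary here, or the command fails:

# innodb_space -s ibdata1 -T zeno3376/t2 -I primary -p 3 page-records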
/usr/local/lib/ruby/gems/1.9.1/gems/innodb_ruby-0.9.15/lib/innodb/system.rb:213:in `index_by_name': undefined method `[]' for nil:NilClass (NoMethodError)
        from /usr/local/lib/ruby/gems/1.9.1/gems/innodb_ruby-0.9.15/bin/innodb_space:1913:in `<top (required)>'
        from /usr/local/bin/innodb_space:23:in `load'
        from /usr/local/bin/innodb_space:23:in `<main>'

# innodb_space -s ibdata1 -T zeno3376/t2 -p 3 page-records
Record 126: (id=1782) → #5
Record 140: (id=1890) → #6
Record 154: (id=2101) → #7
Record 168: (id=2317) → #10
Record 182: (id=2531) → #11
Record 196: (id=2747) → #12
Record 210: (id=2964) → #15
Record 224: (id=3179) → #16
Record 238: (id=3394) → #17
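
    Here -p 3 means "parse page 3".

    The output above is the content of the clustered index root page. Each row is a node pointer to one child page, so the nine rows correspond to nine leaf pages. With the root page itself that makes 10 pages, which matches the space-indexes output.

    Record 126: (id=1782) → #5

    id=1782 refers to the row whose id is 1782, since id is the primary key.

    → #5 is a pointer to page 5.

    Taken together, the line means that the smallest id on page 5 is 1782. Because the next pointer starts at id=1890, page 5 holds the rows with ids 1782 through 1889.

    Note: the leaf page numbers are not contiguous (5, 6, 7, 10, 11, 12, 15, 16, 17). The gaps (8, 9, 13, 14) are the leaf pages of the secondary index, as the ix_t2_name output further down shows.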
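    The key range covered by each leaf page can be derived from the node pointers in the same way. The following sketch parses the root page output shown above; the function and variable names are hypothetical, and the "next key minus one" upper bound assumes integer keys.

```python
import re

# Root page records of the clustered index, copied from the output above.
ROOT_RECORDS = """\
Record 126: (id=1782) → #5
Record 140: (id=1890) → #6
Record 154: (id=2101) → #7
Record 168: (id=2317) → #10
Record 182: (id=2531) → #11
Record 196: (id=2747) → #12
Record 210: (id=2964) → #15
Record 224: (id=3179) → #16
Record 238: (id=3394) → #17
"""

# Matches "(id=<min key>) → #<child page>" in a node pointer record.
POINTER = re.compile(r"\(id=(\d+)\) → #(\d+)")


def child_ranges(text):
    """Return (page, min_key, max_key) per child; max_key is None for the last."""
    pointers = [(int(key), int(page)) for key, page in POINTER.findall(text)]
    ranges = []
    for i, (min_key, page) in enumerate(pointers):
        if i + 1 < len(pointers):
            # The next pointer's minimum key bounds this child from above.
            max_key = pointers[i + 1][0] - 1
        else:
            max_key = None  # the last child is open-ended
        ranges.append((page, min_key, max_key))
    return ranges


ranges = child_ranges(ROOT_RECORDS)
for page, min_key, max_key in ranges:
    upper = "end" if max_key is None else max_key
    print(f"page {page}: id {min_key}..{upper}")
```

    The first line printed is "page 5: id 1782..1889", the same range worked out by hand above.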

    Using the information from the root page, parse the first leaf page:

# innodb_space -s ibdata1 -T zeno3376/t2 -p 5 page-records
Record 128: (id=1782) → (name="zeno", remark="mysql", add_time=:NULL)

Record 162: (id=1783) → (name="KIK91QJET1FCZ46EJKML", remark="H4HJO5F7W5GSSDORT8AAT", add_time="184524556-49-63 92:14:08")

Record 233: (id=1784) → (name="XQZJ08164WSB2EI9M3HCWCEZZOXNB6", remark="8878ASA5AW", add_time="184524556-50-65 04:03:84")

Record 303: (id=1785) → (name="XAXK7RVVTYWEXB2ZFB", remark="TVZNZPW150ZNNJAC1", add_time="184524556-50-65 16:99:20")

Record 368: (id=1786) → (name="G0BZFYV26V14", remark="CYYVCNQJVDQ4OLO6YBZ", add_time="184524556-50-65 63:27:68")

... # note: part of the output is omitted

Record 7187: (id=1885) → (name="XQ2E35QOX32I5GL0TH", remark="SZ4QTI116S3ISRZOJL0M", add_time="184524556-52-49 52:32:00")

Record 7255: (id=1886) → (name="S127FSHO2IPIE2", remark="2EX67306JBI7AL9Z", add_time="184524556-52-49 72:18:56")

Record 7315: (id=1887) → (name="2XKN9VXB5561923IPKVMBW", remark="6ZBU7PRXNDUHR4DV2PB", add_time="184524556-52-49 19:62:88")

Record 7386: (id=1888) → (name="42R60NM6IMTNHRB1L", remark="UG3GLX6ONU5", add_time="184524556-52-49 45:81:76")

Record 7444: (id=1889) → (name="0O2S6OCUC99MQKM1", remark="1K5GJEQ5QU83T3F", add_time="184524556-52-49 32:71:04")

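    As shown above, the leaf pages of the clustered index contain all the column data of each row.

    The secondary index ix_t2_name can be parsed the same way, but note that "-I ix_t2_name" must be added when parsing a secondary index:

# innodb_space -s ibdata1 -T zeno3376/t2 -I ix_t2_name -p 4 page-records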
Record 127: (name="01EE2CCYUW35K0LVT5DAG2044NW") → #8
Record 196: (name="8WCS36CV56KGA8NE6OG23QFS") → #13
Record 169: (name="HQVX6ZX7H2XI") → #9
Record 235: (name="QXS8RUJF6FY") → #14

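    As shown above, the key of the secondary index ix_t2_name is the name column. It has 4 leaf pages; adding the root page, ix_t2_name uses 5 pages in total, which matches the space-indexes output.

    Record 127: (name="01EE2CCYUW35K0LVT5DAG2044NW") → #8 means that the first leaf page of the secondary index is page 8, and the first (smallest) key on page 8 is "01EE2CCYUW35K0LVT5DAG2044NW".

    Record 196: (name="8WCS36CV56KGA8NE6OG23QFS") → #13 means that the second leaf page of the secondary index is page 13, and the first (smallest) key on page 13 is "8WCS36CV56KGA8NE6OG23QFS".

    The remaining records follow the same pattern.

    Next, look at the structure of a leaf page of the secondary index:

# innodb_space -s ibdata1 -T zeno3376/t2 -I ix_t2_name -p 8 page-records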
Record 127: (name="01EE2CCYUW35K0LVT5DAG2044NW") → (id=1855)

Record 165: (name="02RFY8SJLQ879F2CYHI") → (id=2132)

Record 10829: (name="04FNKNM16R7U27A3") → (id=3152)

Record 195: (name="06WM2Q51B0D8L76VM2") → (id=2184)

Record 224: (name="0739V9NMP4") → (id=1843)

... # note: part of the output is omitted

Record 8197: (name="8U4049BA2TAAY7A89SDG") → (id=2003)

Record 10591: (name="8UQOOOU7X5AYE75GU") → (id=3111)

Record 8228: (name="8V5C6OGK4NGAHE6") → (id=2247)

Record 12607: (name="8V6SFJ0P8E1XKIF005QD3NTCI") → (id=3435)

Record 9595: (name="8VK9HHEN3G") → (id=2972)

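    As shown above, each leaf record of the secondary index contains the indexed column (name) and the primary key column (id).

    Record 127: (name="01EE2CCYUW35K0LVT5DAG2044NW") → (id=1855) means that the entry whose name is "01EE2CCYUW35K0LVT5DAG2044NW" points to the row with primary key id=1855.

    The other records work the same way.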
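    Putting the two indexes together, a lookup by name is a two-step process: the secondary index yields the primary key, and the clustered index then locates the leaf page holding the row. The sketch below simulates this with the values printed above. It is an illustration of the idea only, not how InnoDB or innodb_ruby implements it, and all names in it are hypothetical.

```python
from bisect import bisect_right

# A few leaf entries of ix_t2_name (page 8), copied from above: name -> id.
SECONDARY_LEAF = {
    "01EE2CCYUW35K0LVT5DAG2044NW": 1855,
    "02RFY8SJLQ879F2CYHI": 2132,
    "04FNKNM16R7U27A3": 3152,
}

# Node pointers of the clustered index root (page 3): (min id, leaf page).
ROOT_POINTERS = [
    (1782, 5), (1890, 6), (2101, 7), (2317, 10), (2531, 11),
    (2747, 12), (2964, 15), (3179, 16), (3394, 17),
]


def clustered_leaf_page(pk):
    """Return the leaf page whose key range covers the primary key pk."""
    min_keys = [min_key for min_key, _ in ROOT_POINTERS]
    # Pick the last pointer whose minimum key is <= pk.
    i = bisect_right(min_keys, pk) - 1
    return ROOT_POINTERS[max(i, 0)][1]


def lookup(name):
    """Step 1: name -> id via the secondary index. Step 2: id -> leaf page."""
    pk = SECONDARY_LEAF.get(name)
    if pk is None:
        return None
    return pk, clustered_leaf_page(pk)


print(lookup("01EE2CCYUW35K0LVT5DAG2044NW"))  # (1855, 5)
print(lookup("04FNKNM16R7U27A3"))             # (3152, 15)
```

    The result for id=1855 agrees with the page 5 output earlier, which covers ids 1782 through 1889.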

    If anything above is wrong, please point it out.

Time: 2024-10-11 17:30:56
