http://blog.csdn.net/pirateleo/article/details/7061452
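I. Basic Concepts
1. File: made up of many Boxes and FullBoxes.
2. Box: each Box consists of a Header and Data.
3. FullBox: an extension of Box that adds an 8-bit version and 24-bit flags to the Header.
4. Header: holds the length (size) and type of the whole Box. size == 0 means this is the last Box in the file and it extends to the end of the file; size == 1 means the length needs more bits, so a 64-bit largesize field follows to describe the Box length; when type is uuid, the data in the Box is a user-defined extension type.
5. Data: the actual payload of the Box, either raw data or further child Boxes.
6. A Box whose Data is a sequence of child Boxes is also called a Container Box.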
The structure is shown in the figure below:
[Figure: basic structure of the file]
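The header rules listed above (a 32-bit size, a four-character type, and an optional 64-bit largesize) can be sketched in C roughly as follows. This is a minimal illustration over an in-memory buffer, not code from any particular library; `BoxHeader`, `rd_be32`, `rd_be64` and `parse_box_header` are hypothetical names chosen for this example. A 'uuid' box also carries a 16-byte extended type after the header, which this sketch does not decode.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper type: the decoded header of one box. */
typedef struct {
    uint64_t size;        /* total box size, header included; 0 = runs to end of file */
    char     type[5];     /* four-character type code, NUL-terminated */
    size_t   header_len;  /* 8 for a compact header, 16 when largesize is present */
} BoxHeader;

/* Read a big-endian 32-bit integer. */
static uint32_t rd_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Read a big-endian 64-bit integer. */
static uint64_t rd_be64(const uint8_t *p)
{
    return ((uint64_t)rd_be32(p) << 32) | rd_be32(p + 4);
}

/* Decode the box header at buf[0..len).
 * Returns 0 on success, -1 if the buffer is too short or the size is invalid. */
static int parse_box_header(const uint8_t *buf, size_t len, BoxHeader *out)
{
    assert(buf != NULL && out != NULL);
    if (len < 8)
        return -1;
    out->size = rd_be32(buf);
    memcpy(out->type, buf + 4, 4);
    out->type[4] = '\0';
    out->header_len = 8;
    if (out->size == 1) {             /* 64-bit largesize follows the type */
        if (len < 16)
            return -1;
        out->size = rd_be64(buf + 8);
        out->header_len = 16;
    }
    /* size == 0 is legal: the box extends to the end of the file. */
    if (out->size != 0 && out->size < out->header_len)
        return -1;
    return 0;
}
```

A caller would typically advance by `size` bytes to reach the next sibling box, or by `header_len` bytes to reach the payload.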
    /* From FFmpeg (libavformat/mov.c): reads the ftyp box. */
    static int mov_read_ftyp(MOVContext *c, AVIOContext *pb, MOVAtom atom)
    {
        uint32_t minor_ver;
        int comp_brand_size;
        char minor_ver_str[11]; /* 32 bit integer -> 10 digits + null */
        char* comp_brands_str;
        uint8_t type[5] = {0};

        avio_read(pb, type, 4);   /* major brand */
        if (strcmp(type, "qt  "))
            c->isom = 1;
        av_log(c->fc, AV_LOG_DEBUG, "ISO: File Type Major Brand: %.4s\n",(char *)&type);
        av_dict_set(&c->fc->metadata, "major_brand", type, 0);
        minor_ver = avio_rb32(pb); /* minor version */
        snprintf(minor_ver_str, sizeof(minor_ver_str), "%"PRIu32"", minor_ver);
        av_dict_set(&c->fc->metadata, "minor_version", minor_ver_str, 0);

        /* whatever remains after major brand and minor version is the brand list */
        comp_brand_size = atom.size - 8;
        if (comp_brand_size < 0)
            return AVERROR_INVALIDDATA;
        comp_brands_str = av_malloc(comp_brand_size + 1); /* Add null terminator */
        if (!comp_brands_str)
            return AVERROR(ENOMEM);
        avio_read(pb, comp_brands_str, comp_brand_size);
        comp_brands_str[comp_brand_size] = 0;
        av_dict_set(&c->fc->metadata, "compatible_brands", comp_brands_str, 0);
        av_freep(&comp_brands_str);

        return 0;
    }
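II. MP4 File Format (ISO/IEC 14496-12/14)
MP4 File Overview
An MP4 file is built from many kinds of Boxes. The table below lists all the mandatory and optional Box types; √ marks a mandatory Box.
Box list: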
Box | Mandatory | Description |
ftyp | √ | file type and compatibility |
pdin | | progressive download information |
moov | √ | container for all the metadata |
mvhd | √ | movie header, overall declarations |
trak | √ | container for an individual track or stream |
tkhd | √ | track header, overall information about the track |
tref | | track reference container |
edts | | edit list container |
elst | | an edit list |
mdia | √ | container for the media information in a track |
mdhd | √ | media header, overall information about the media |
hdlr | √ | handler, declares the media (handler) type |
minf | √ | media information container |
vmhd | | video media header, overall information (video track only) |
smhd | | sound media header, overall information (sound track only) |
hmhd | | hint media header, overall information (hint track only) |
nmhd | | Null media header, overall information (some tracks only) |
dinf | √ | data information box, container |
dref | √ | data reference box, declares source(s) of media data in track |
stbl | √ | sample table box, container for the time/space map |
stsd | √ | sample descriptions (codec types, initialization etc.) |
stts | √ | (decoding) time-to-sample |
ctts | | (composition) time to sample |
stsc | √ | sample-to-chunk, partial data-offset information |
stsz | | sample sizes (framing) |
stz2 | | compact sample sizes (framing) |
stco | √ | chunk offset, partial data-offset information |
co64 | | 64-bit chunk offset |
stss | | sync sample table (random access points) |
stsh | | shadow sync sample table |
padb | | sample padding bits |
stdp | | sample degradation priority |
sdtp | | independent and disposable samples |
sbgp | | sample-to-group |
sgpd | | sample group description |
subs | | sub-sample information |
mvex | | movie extends box |
mehd | | movie extends header box |
trex | √ | track extends defaults |
ipmc | | IPMP Control Box |
moof | | movie fragment |
mfhd | √ | movie fragment header |
traf | | track fragment |
tfhd | √ | track fragment header |
trun | | track fragment run |
sdtp | | independent and disposable samples |
sbgp | | sample-to-group |
subs | | sub-sample information |
mfra | | movie fragment random access |
tfra | | track fragment random access |
mfro | √ | movie fragment random access offset |
mdat | | media data container |
free | | free space |
skip | | free space |
udta | | user-data |
cprt | | copyright etc. |
meta | | metadata |
hdlr | √ | handler, declares the metadata (handler) type |
dinf | | data information box, container |
dref | | data reference box, declares source(s) of metadata items |
ipmc | | IPMP Control Box |
iloc | | item location |
ipro | | item protection |
sinf | | protection scheme information box |
frma | | original format box |
imif | | IPMP Information box |
schm | | scheme type box |
schi | | scheme information box |
iinf | | item information |
xml | | XML container |
bxml | | binary XML container |
pitm | | primary item reference |
fiin | | file delivery item information |
paen | | partition entry |
fpar | | file partition |
fecr | | FEC reservoir |
segr | | file delivery session group |
gitn | | group id to name |
tsel | | track selection |
meco | | additional metadata container |
mere | | metabox relation |
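Before going into detail, here is a high-level look at the most important parts of the file, so that the later sections are easier to follow:
1. ftyp box: sits at the start of the file and describes the file's version, compatible specifications, and so on.
2. moov box: contains no actual media data, but holds the overall description of all media data in the file. Under the moov box are the mvhd and trak boxes.
>>mvhd records the creation time, modification time, time scale, playable duration, and similar information.
>>The child boxes inside trak describe each media track in detail.
3. moof box: describes a video fragment. It is not a required part of an MP4 file, but it is central to the MP4 files commonly played online (for example, ismv files in Silverlight Smooth Streaming).
4. mdat box: the actual media data. Everything that is finally decoded and played lives here.
5. mfra box: usually at the end of the file; an index of the media that lets a player look up and jump straight to the media data for a given time point.
Note: in a Smooth Streaming ismv file, the file is divided into multiple Fragments, each containing a moof and an mdat. This layout suits progressive playback: each mdat and its description arrive incrementally, and once a whole Fragment has been received its mdat can be played.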
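To make the top-level layout described above concrete, the following C sketch walks a buffer and collects the type codes of the top-level boxes in order. It is only an illustration under simplifying assumptions: the whole file is already in memory, 64-bit largesize boxes are rejected rather than handled, and `list_top_level_boxes` is a hypothetical name, not an existing API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read a big-endian 32-bit integer. */
static uint32_t rd_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Write the type codes of the top-level boxes in buf[0..len) into out as a
 * space-separated string. Returns the number of boxes seen, or -1 on a
 * malformed box, an unsupported largesize box, or output overflow. */
static int list_top_level_boxes(const uint8_t *buf, size_t len,
                                char *out, size_t out_cap)
{
    size_t pos = 0, used = 0;
    int count = 0;

    assert(buf != NULL && out != NULL && out_cap > 0);
    out[0] = '\0';
    while (len - pos >= 8) {
        size_t size = rd_be32(buf + pos);
        if (size == 0)                  /* last box: extends to end of data */
            size = len - pos;
        if (size < 8 || size > len - pos)
            return -1;                  /* also rejects size == 1 (largesize) */
        if (used + 6 > out_cap)         /* separator + 4 characters + NUL */
            return -1;
        if (count > 0)
            out[used++] = ' ';
        memcpy(out + used, buf + pos + 4, 4);
        used += 4;
        out[used] = '\0';
        pos += size;                    /* jump to the next sibling box */
        count++;
    }
    return count;
}
```

On a fragmented file of the kind described above, the resulting string would show ftyp and moov followed by repeated moof/mdat pairs.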
http://blog.csdn.net/tx3344/article/details/8476669
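MP4 (MPEG-4 Part 14) is a common multimedia container format, defined in the standard ISO/IEC 14496-14.
1. The smallest unit: the box
Just as FLV has tags, MKV has EBML elements, and ASF files have ASF objects, an MP4 file is made up of a series of boxes; the box is its smallest building block.
size: the total size of the box, header included.
type: the type of the box (see Appendix Table 1).
largesize: if the box is too large for a uint32, size is set to 1 and the following largesize field stores the size.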
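2. Overall structure of an MP4 file
Put simply, an MP4 file is a series of boxes, with large boxes containing smaller ones.
The following sections go into specific boxes to analyze the MP4 format in detail.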
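Because large boxes contain smaller ones, locating a nested box amounts to repeating one sibling search at each level. The C sketch below shows one way this might look; `find_box` is a hypothetical helper written for this example, it assumes the data is fully in memory, and it ignores 64-bit largesize boxes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read a big-endian 32-bit integer. */
static uint32_t rd_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Find the first box of the given four-character type among the sibling
 * boxes stored in buf[0..len). On success, returns a pointer to the box
 * payload and stores the payload length; returns NULL if the type is not
 * found or a malformed box is met first. */
static const uint8_t *find_box(const uint8_t *buf, size_t len,
                               const char *type, size_t *payload_len)
{
    size_t pos = 0;

    assert(buf != NULL && type != NULL && payload_len != NULL);
    while (len - pos >= 8) {
        size_t size = rd_be32(buf + pos);
        if (size == 0)                  /* box runs to the end of the data */
            size = len - pos;
        if (size < 8 || size > len - pos)
            return NULL;
        if (memcmp(buf + pos + 4, type, 4) == 0) {
            *payload_len = size - 8;
            return buf + pos + 8;       /* payload starts after the header */
        }
        pos += size;
    }
    return NULL;
}
```

Since the payload of a container box is itself a run of sibling boxes, calling `find_box` on the payload of moov with "trak", and then on the payload of trak with "tkhd", descends one level per call.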
Appendix Table 1
Code | Abstract | Defined in/by |
ainf | Asset information to identify, license and play | DECE |
albm | Album title and track number (user-data) | 3GPP |
auth | Media author name (user-data) | 3GPP |
avcn | AVC NAL Unit Storage Box | DECE |
bloc | Base location and purchase location for license acquisition | DECE |
bpcc | Bits per component | JP2 |
buff | Buffering information | AVC |
bxml | binary XML container | ISO |
ccid | OMA DRM Content ID | OMA DRM 2.1 |
cdef | type and ordering of the components within the codestream | JP2 |
clsf | Media classification (user-data) | 3GPP |
cmap | mapping between a palette and codestream components | JP2 |
co64 | 64-bit chunk offset | ISO |
colr | specifies the colourspace of the image | JP2 |
cprt | copyright etc. (user-data) | ISO |
crhd | reserved for ClockReferenceStream header | MP4V1 |
cslg | composition to decode timeline mapping | ISO |
ctts | (composition) time to sample | ISO |
cvru | OMA DRM Cover URI | OMA DRM 2.1 |
dcfD | Marlin DCF Duration, user-data atom type | OMArlin |
dinf | data information box, container | ISO |
dref | data reference box, declares source(s) of media data in track | ISO |
dscp | Media description (user-data) | 3GPP |
dsgd | DVB Sample Group Description Box | DVB |
dstg | DVB Sample to Group Box | DVB |
edts | edit list container | ISO |
elst | an edit list | ISO |
feci | FEC Information | ISO |
fecr | FEC Reservoir | ISO |
fiin | FD Item Information | ISO |
fire | File Reservoir | ISO |
fpar | File Partition | ISO |
free | free space | ISO |
frma | original format box | ISO |
ftyp | file type and compatibility | JP2, ISO |
gitn | Group ID to name | ISO |
gnre | Media genre (user-data) | 3GPP |
grpi | OMA DRM Group ID | OMA DRM 2.0 |
hdlr | handler, declares the media (handler) type | ISO |
hmhd | hint media header, overall information (hint track only) | ISO |
hpix | Hipix Rich Picture (user-data or meta-data) | HIPIX |
icnu | OMA DRM Icon URI | OMA DRM 2.0 |
ID32 | ID3 version 2 container | inline |
idat | Item data | ISO |
ihdr | Image Header | JP2 |
iinf | item information | ISO |
iloc | item location | ISO |
imif | IPMP Information box | ISO |
infu | OMA DRM Info URL | OMA DRM 2.0 |
iods | Object Descriptor container box | MP4V1 |
iphd | reserved for IPMP Stream header | MP4V1 |
ipmc | IPMP Control Box | ISO |
ipro | item protection | ISO |
iref | Item reference | ISO |
jP$20$20 | JPEG 2000 Signature | JP2 |
jp2c | JPEG 2000 contiguous codestream | JP2 |
jp2h | Header | JP2 |
jp2i | intellectual property information | JP2 |
kywd | Media keywords (user-data) | 3GPP |
loci | Media location information (user-data) | 3GPP |
lrcu | OMA DRM Lyrics URI | OMA DRM 2.1 |
m7hd | reserved for MPEG7Stream header | MP4V1 |
mdat | media data container | ISO |
mdhd | media header, overall information about the media | ISO |
mdia | container for the media information in a track | ISO |
mdri | Mutable DRM information | OMA DRM 2.0 |
meco | additional metadata container | ISO |
mehd | movie extends header box | ISO |
mere | metabox relation | ISO |
meta | Metadata container | ISO |
mfhd | movie fragment header | ISO |
mfra | Movie fragment random access | ISO |
mfro | Movie fragment random access offset | ISO |
minf | media information container | ISO |
mjhd | reserved for MPEG-J Stream header | MP4V1 |
moof | movie fragment | ISO |
moov | container for all the meta-data | ISO |
mvcg | Multiview group | AVC |
mvci | Multiview Information | AVC |
mvex | movie extends box | ISO |
mvhd | movie header, overall declarations | ISO |
mvra | Multiview Relation Attribute | AVC |
nmhd | Null media header, overall information (some tracks only) | ISO |
ochd | reserved for ObjectContentInfoStream header | MP4V1 |
odaf | OMA DRM Access Unit Format | OMA DRM 2.0 |
odda | OMA DRM Content Object | OMA DRM 2.0 |
odhd | reserved for ObjectDescriptorStream header | MP4V1 |
odhe | OMA DRM Discrete Media Headers | OMA DRM 2.0 |
odrb | OMA DRM Rights Object | OMA DRM 2.0 |
odrm | OMA DRM Container | OMA DRM 2.0 |
odtt | OMA DRM Transaction Tracking | OMA DRM 2.0 |
ohdr | OMA DRM Common headers | OMA DRM 2.0 |
padb | sample padding bits | ISO |
paen | Partition Entry | ISO |
pclr | palette which maps a single component in index space to a multiple-component image | JP2 |
pdin | Progressive download information | ISO |
perf | Media performer name (user-data) | 3GPP |
pitm | primary item reference | ISO |
res$20 | grid resolution | JP2 |
resc | grid resolution at which the image was captured | JP2 |
resd | default grid resolution at which the image should be displayed | JP2 |
rtng | Media rating (user-data) | 3GPP |
sbgp | Sample to Group box | AVC,ISO |
schi | scheme information box | ISO |
schm | scheme type box | ISO |
sdep | Sample dependency | AVC |
sdhd | reserved for SceneDescriptionStream header | MP4V1 |
sdtp | Independent and Disposable Samples Box | AVC,ISO |
sdvp | SD Profile Box | SDV |
segr | file delivery session group | ISO |
senc | Sample specific encryption data | DECE |
sgpd | Sample group definition box | AVC,ISO |
sidx | Segment Index Box | 3GPP |
sinf | protection scheme information box | ISO |
skip | free space | ISO |
smhd | sound media header, overall information (sound track only) | ISO |
srmb | System Renewability Message | DVB |
srmc | System Renewability Message container | DVB |
srpp | SRTP Process | ISO |
stbl | sample table box, container for the time/space map | ISO |
stco | chunk offset, partial data-offset information | ISO |
stdp | sample degradation priority | ISO |
sthd | Subtitle Media Header Box | DECE |
stsc | sample-to-chunk, partial data-offset information | ISO |
stsd | sample descriptions (codec types, initialization etc.) | ISO |
stsh | shadow sync sample table | ISO |
stss | sync sample table (random access points) | ISO |
stsz | sample sizes (framing) | ISO |
stts | (decoding) time-to-sample | ISO |
styp | Segment Type Box | 3GPP |
stz2 | compact sample sizes (framing) | ISO |
subs | Sub-sample information | ISO |
swtc | Multiview Group Relation | AVC |
tfad | Track fragment adjustment box | 3GPP |
tfhd | Track fragment header | ISO |
tfma | Track fragment media adjustment box | 3GPP |
tfra | Track fragment random access | ISO |
tibr | Tier Bit rate | AVC |
tiri | Tier Information | AVC |
titl | Media title (user-data) | 3GPP |
tkhd | Track header, overall information about the track | ISO |
traf | Track fragment | ISO |
trak | container for an individual track or stream | ISO |
tref | track reference container | ISO |
trex | track extends defaults | ISO |
trgr | Track grouping information | ISO |
trik | Facilitates random access and trick play modes | DECE |
trun | track fragment run | ISO |
tsel | Track selection (user-data) | 3GPP |
udta | user-data | ISO |
uinf | a tool by which a vendor may provide access to additional information associated with a UUID | JP2 |
UITS | Unique Identifier Technology Solution | Universal Music |
ulst | a list of UUID’s | JP2 |
url$20 | a URL | JP2 |
uuid | user-extension box | ISO, JP2 |
vmhd | video media header, overall information (video track only) | ISO |
vwdi | Multiview Scene Information | AVC |
xml$20 | a tool by which vendors can add XML formatted information | JP2 |
xml$20 | XML container | ISO |
yrrc | Year when media was recorded (user-data) | 3GPP |
QuickTime Codes
Code | Abstract | Defined in/by |
clip | Visual clipping region container | QT |
crgn | Visual clipping region definition | QT |
ctab | Track color-table | QT |
elng | Extended Language Tag | QT |
imap | Track input map definition | QT |
kmat | Compressed visual track matte | QT |
load | Track pre-load definitions | QT |
matt | Visual track matte for compositing | QT |
pnot | Preview container | QT |
wide | Expansion space reservation | QT |
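1. File Type Box
Box Type: 'ftyp'
This box normally appears at the very start of an MP4 file and can serve as the identifying signature of the MP4 container format, just as the three bytes 'F' 'L' 'V' open an FLV file, the bytes 1A 45 DF A3 open an MKV file, and ASF_Header_Object identifies the ASF container format.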
The contents of the ftyp box are structured as follows:
    aligned(8) class FileTypeBox
        extends Box('ftyp') {
        unsigned int(32) major_brand;
        unsigned int(32) minor_version;
        unsigned int(32) compatible_brands[]; // to end of the box
    }
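Read as a byte layout, the definition above is a 4-byte major brand, a 4-byte minor version, and then 4-byte compatible brands up to the end of the box. A minimal C sketch of decoding that payload might look like this; `FtypInfo`, `parse_ftyp` and `ftyp_has_brand` are hypothetical names for this example, and the payload is assumed to start just after the 8-byte box header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical result type for this sketch. */
typedef struct {
    char     major_brand[5];    /* four characters, NUL-terminated */
    uint32_t minor_version;
    size_t   compatible_count;  /* number of 4-byte compatible brands */
} FtypInfo;

/* Read a big-endian 32-bit integer. */
static uint32_t rd_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Decode an ftyp payload. Returns 0 on success, -1 if the payload is too
 * short or the brand list is not a whole number of 4-byte entries. */
static int parse_ftyp(const uint8_t *payload, size_t len, FtypInfo *out)
{
    assert(payload != NULL && out != NULL);
    if (len < 8 || (len - 8) % 4 != 0)
        return -1;
    memcpy(out->major_brand, payload, 4);
    out->major_brand[4] = '\0';
    out->minor_version = rd_be32(payload + 4);
    out->compatible_count = (len - 8) / 4;
    return 0;
}

/* Returns 1 if brand appears in the compatible_brands list, else 0. */
static int ftyp_has_brand(const uint8_t *payload, size_t len, const char *brand)
{
    assert(payload != NULL && brand != NULL);
    for (size_t off = 8; off + 4 <= len; off += 4)
        if (memcmp(payload + off, brand, 4) == 0)
            return 1;
    return 0;
}
```

A reader might use `ftyp_has_brand` to decide whether it understands the file before parsing any further boxes.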
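2. Movie Box
Box Type: 'moov'
The moov box contains many child boxes, as marked in the figure in the previous post. moov usually follows ftyp directly. It holds the metadata of the MP4 file, that is, the basic information about the audio and video. Let us look at the important boxes inside moov.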
2.1 Movie Header Box
Box Type: 'mvhd'
The structure of mvhd is as follows:
    aligned(8) class MovieHeaderBox extends FullBox('mvhd', version, 0) {
        if (version==1) {
            unsigned int(64) creation_time;
            unsigned int(64) modification_time;
            unsigned int(32) timescale;
            unsigned int(64) duration;
        } else { // version==0
            unsigned int(32) creation_time;
            unsigned int(32) modification_time;
            unsigned int(32) timescale;
            unsigned int(32) duration;
        }
        template int(32) rate = 0x00010000; // typically 1.0
        template int(16) volume = 0x0100; // typically, full volume
        const bit(16) reserved = 0;
        const unsigned int(32)[2] reserved = 0;
        template int(32)[9] matrix =
            { 0x00010000,0,0,0,0x00010000,0,0,0,0x40000000 };
            // Unity matrix
        bit(32)[6] pre_defined = 0;
        unsigned int(32) next_track_ID;
    }
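Field | Size (bytes) | Comment |
box size | 4 | size of the box |
box type | 4 | type of the box |
version | 1 | box version, 0 or 1; usually 0 |
flags | 3 | flags |
creation time | 4 | creation time (seconds since 1904-01-01 00:00 UTC) |
modification time | 4 | modification time |
time scale | 4 | number of time units that pass in one second; for video this is often 90000 |
duration | 4 | playback length in time scale units; dividing duration by time scale gives the length in seconds, e.g. 70.016 s for an audio track; for a video track, time scale = 600, duration = … |
rate | 4 | recommended playback rate; the high 16 bits are the integer part and the low 16 bits the fractional part, i.e. [16.16] format |
volume | 2 | like rate, but in [8.8] format; 1.0 (0x0100) is full volume |
reserved | 10 | reserved |
matrix | 36 | video transformation matrix |
pre-defined | 24 | |
next track id | 4 | ID to be used by the next track |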
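The creation time and the fixed-point fields in the table above need a small conversion before they are useful. The following C sketch shows one plausible way to do it; the function and macro names are hypothetical, chosen for this example. The offset 2082844800 is the number of seconds between 1904-01-01 and 1970-01-01 (24107 days).

```c
#include <assert.h>
#include <stdint.h>

/* Seconds between 1904-01-01 00:00:00 UTC and 1970-01-01 00:00:00 UTC. */
#define MP4_EPOCH_OFFSET 2082844800LL

/* Convert a creation/modification time (seconds since 1904) to Unix time
 * (seconds since 1970). Times before 1970 come out negative. */
static int64_t mp4_time_to_unix(uint64_t mp4_seconds)
{
    assert(mp4_seconds <= (uint64_t)INT64_MAX);
    return (int64_t)mp4_seconds - MP4_EPOCH_OFFSET;
}

/* Convert a [16.16] fixed-point value (for example rate) to a double. */
static double fixed16_16_to_double(int32_t v)
{
    return v / 65536.0;
}

/* Convert an [8.8] fixed-point value (for example volume) to a double. */
static double fixed8_8_to_double(int16_t v)
{
    return v / 256.0;
}
```

With these helpers the default rate 0x00010000 and the default volume 0x0100 both decode to 1.0.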
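Parsing this box therefore yields key information such as duration and rate. For example:
[Figure: hex dump of an mvhd box]
In the example above, time scale = 90000 and duration = 15051036 (0xE5A91C), so the playing time is duration / time scale = 15051036 / 90000 ≈ 167 s.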
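The calculation above can be reproduced in code. The C sketch below pulls timescale and duration out of an mvhd payload for both version 0 and version 1, then divides them; `MvhdTiming`, `parse_mvhd_timing` and `mvhd_seconds` are hypothetical names, and the payload is assumed to begin at the FullBox version byte, just after the 8-byte box header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result type for this sketch. */
typedef struct {
    uint8_t  version;
    uint32_t timescale;   /* time units per second */
    uint64_t duration;    /* length in timescale units */
} MvhdTiming;

/* Read a big-endian 32-bit integer. */
static uint32_t rd_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Read a big-endian 64-bit integer. */
static uint64_t rd_be64(const uint8_t *p)
{
    return ((uint64_t)rd_be32(p) << 32) | rd_be32(p + 4);
}

/* Extract timescale and duration from an mvhd payload.
 * Returns 0 on success, -1 on a short buffer or unknown version. */
static int parse_mvhd_timing(const uint8_t *p, size_t len, MvhdTiming *out)
{
    assert(p != NULL && out != NULL);
    if (len < 4)
        return -1;
    out->version = p[0];                    /* p[1..3] hold the 24-bit flags */
    if (out->version == 1) {
        if (len < 4 + 28)                   /* 8 + 8 + 4 + 8 bytes of fields */
            return -1;
        out->timescale = rd_be32(p + 20);   /* after two 64-bit times */
        out->duration  = rd_be64(p + 24);
    } else if (out->version == 0) {
        if (len < 4 + 16)                   /* 4 + 4 + 4 + 4 bytes of fields */
            return -1;
        out->timescale = rd_be32(p + 12);   /* after two 32-bit times */
        out->duration  = rd_be32(p + 16);
    } else {
        return -1;
    }
    return 0;
}

/* Playing time in seconds; 0.0 if timescale is zero. */
static double mvhd_seconds(const MvhdTiming *t)
{
    assert(t != NULL);
    return t->timescale ? (double)t->duration / t->timescale : 0.0;
}
```

For the values quoted above (timescale 90000, duration 15051036) `mvhd_seconds` gives a little over 167 seconds.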
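2.2 Track Box
Box Type: 'trak'
The moov box contains one or more track boxes, and each track is relatively independent. A track box contains many other boxes, two of which are key: the Track Header Box and the Media Box. The figure below shows an ordinary MP4 file, in which the basic structure of the track box can be seen.
[Figure: box tree of an ordinary MP4 file showing the track box]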
2.2.1 Track Header Box
Box Type: 'tkhd'
    aligned(8) class TrackHeaderBox
        extends FullBox('tkhd', version, flags){
        if (version==1) {
            unsigned int(64) creation_time;
            unsigned int(64) modification_time;
            unsigned int(32) track_ID;
            const unsigned int(32) reserved = 0;
            unsigned int(64) duration;
        } else { // version==0
            unsigned int(32) creation_time;
            unsigned int(32) modification_time;
            unsigned int(32) track_ID;
            const unsigned int(32) reserved = 0;
            unsigned int(32) duration;
        }
        const unsigned int(32)[2] reserved = 0;
        template int(16) layer = 0;
        template int(16) alternate_group = 0;
        template int(16) volume = {if track_is_audio 0x0100 else 0};
        const unsigned int(16) reserved = 0;
        template int(32)[9] matrix=
            { 0x00010000,0,0,0,0x00010000,0,0,0,0x40000000 };
            // unity matrix
        unsigned int(32) width;
        unsigned int(32) height;
    }
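Field | Size (bytes) | Comment |
box size | 4 | size of the box |
box type | 4 | type of the box |
version | 1 | box version, 0 or 1; usually 0 |
flags | 3 | bitwise OR of the predefined flag values |
creation time | 4 | creation time |
modification time | 4 | modification time |
track id | 4 | track ID; must be unique and must not be 0 |
reserved | 4 | reserved |
duration | 4 | length of the track |
reserved | 8 | reserved |
layer | 2 | video layer, 0 by default; lower values are on top |
alternate group | 2 | track grouping information; the default 0 means the track is not grouped with any other track |
volume | 2 | [8.8] format; for an audio track, 1.0 (0x0100) is full volume; otherwise 0 |
reserved | 2 | reserved |
matrix | 36 | video transformation matrix |
width | 4 | width, in [16.16] format |
height | 4 | height, in [16.16] format |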
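As a rough illustration of the layout above, the C sketch below decodes the flags, track ID, width and height from a tkhd payload. The flag values 0x000001 (track enabled), 0x000002 (track in movie) and 0x000004 (track in preview) are the ones defined for tkhd in ISO/IEC 14496-12; the macro names, `TkhdInfo` and `parse_tkhd` are hypothetical names for this example. The payload is assumed to begin at the version byte, just after the 8-byte box header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* tkhd flag values from ISO/IEC 14496-12; the macro names are this sketch's own. */
#define TKHD_FLAG_ENABLED    0x000001u
#define TKHD_FLAG_IN_MOVIE   0x000002u
#define TKHD_FLAG_IN_PREVIEW 0x000004u

/* Hypothetical result type for this sketch. */
typedef struct {
    uint8_t  version;
    uint32_t flags;      /* 24-bit flags, bitwise OR of the values above */
    uint32_t track_id;
    double   width;      /* decoded from [16.16] fixed point */
    double   height;     /* decoded from [16.16] fixed point */
} TkhdInfo;

/* Read a big-endian 32-bit integer. */
static uint32_t rd_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Decode selected tkhd fields. Returns 0 on success, -1 on error. */
static int parse_tkhd(const uint8_t *p, size_t len, TkhdInfo *out)
{
    size_t var_len;          /* bytes used by the version-dependent fields */
    const uint8_t *tail;     /* start of the version-independent fields */

    assert(p != NULL && out != NULL);
    if (len < 4)
        return -1;
    out->version = p[0];
    out->flags = ((uint32_t)p[1] << 16) | ((uint32_t)p[2] << 8) | p[3];
    if (out->version == 0)
        var_len = 20;        /* 4 + 4 + 4 + 4 + 4 */
    else if (out->version == 1)
        var_len = 32;        /* 8 + 8 + 4 + 4 + 8 */
    else
        return -1;
    /* Fixed tail: reserved(8) layer(2) alternate_group(2) volume(2)
     * reserved(2) matrix(36) width(4) height(4) = 60 bytes. */
    if (len < 4 + var_len + 60)
        return -1;
    out->track_id = rd_be32(p + 4 + (out->version == 1 ? 16 : 8));
    tail = p + 4 + var_len;
    out->width  = rd_be32(tail + 52) / 65536.0;
    out->height = rd_be32(tail + 56) / 65536.0;
    return 0;
}
```

A caller could then test, for instance, `info.flags & TKHD_FLAG_ENABLED` to see whether the track is enabled.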
Multimedia Container Formats Explained: MP4 (码迷, mamicode.com)