WAVE PCM soundfile format



The WAVE file format is a subset of Microsoft's RIFF specification for the storage of multimedia files. A RIFF file starts with a file header followed by a sequence of data chunks. A WAVE file is often just a RIFF file with a single "WAVE" chunk consisting of two sub-chunks: a "fmt " chunk specifying the data format and a "data" chunk containing the actual sample data. Call this form the "canonical form". An almost complete description can be found at MSDN, but it is long, hard to digest, and mostly describes the non-PCM (registered proprietary) data formats.

I use the standard WAVE format as created by the sox program:

Offset  Size  Name             Description

The canonical WAVE format starts with the RIFF header:

0         4   ChunkID          Contains the letters "RIFF" in ASCII form
                               (0x52494646 big-endian form).
4         4   ChunkSize        36 + SubChunk2Size, or more precisely:
                               4 + (8 + SubChunk1Size) + (8 + SubChunk2Size)
                               This is the size of the rest of the chunk
                               following this number.  This is the size of the
                               entire file in bytes minus 8 bytes for the
                               two fields not included in this count:
                               ChunkID and ChunkSize.
8         4   Format           Contains the letters "WAVE"
                               (0x57415645 big-endian form).

The "WAVE" format consists of two subchunks: "fmt " and "data":
The "fmt " subchunk describes the sound data's format:

12        4   Subchunk1ID      Contains the letters "fmt "
                               (0x666d7420 big-endian form).
16        4   Subchunk1Size    16 for PCM.  This is the size of the
                               rest of the Subchunk which follows this number.
20        2   AudioFormat      PCM = 1 (i.e. Linear quantization)
                               Values other than 1 indicate some
                               form of compression.
22        2   NumChannels      Mono = 1, Stereo = 2, etc.
24        4   SampleRate       8000, 44100, etc.
28        4   ByteRate         == SampleRate * NumChannels * BitsPerSample/8
32        2   BlockAlign       == NumChannels * BitsPerSample/8
                               The number of bytes for one sample including
                               all channels. I wonder what happens when
                               this number isn't an integer?
34        2   BitsPerSample    8 bits = 8, 16 bits = 16, etc.
          2   ExtraParamSize   if PCM, then doesn't exist
          X   ExtraParams      space for extra parameters

The "data" subchunk contains the size of the data and the actual sound:

36        4   Subchunk2ID      Contains the letters "data"
                               (0x64617461 big-endian form).
40        4   Subchunk2Size    == NumSamples * NumChannels * BitsPerSample/8
                               This is the number of bytes in the data.
                               You can also think of this as the size
                               of the rest of the subchunk following this
                               number.
44        *   Data             The actual sound data.
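
The size and rate fields in the table above are all derived from a few basic parameters. As a minimal sketch (the helper name `build_canonical_header` is hypothetical, and only the 44-byte canonical PCM layout described here is assumed), the header can be assembled in Python with the standard `struct` module:

```python
import struct

def build_canonical_header(num_channels, sample_rate, bits_per_sample, num_samples):
    """Return the 44-byte canonical PCM WAVE header (hypothetical helper)."""
    block_align = num_channels * bits_per_sample // 8   # bytes per sample, all channels
    byte_rate = sample_rate * block_align               # bytes per second
    subchunk2_size = num_samples * block_align          # bytes of sample data
    chunk_size = 36 + subchunk2_size                    # file size minus 8
    # "<" selects little-endian with no padding; 4s = 4 raw bytes,
    # I = unsigned 32-bit, H = unsigned 16-bit.
    return struct.pack(
        "<4sI4s4sIHHIIHH4sI",
        b"RIFF", chunk_size, b"WAVE",
        b"fmt ", 16, 1, num_channels, sample_rate,
        byte_rate, block_align, bits_per_sample,
        b"data", subchunk2_size,
    )

# 512 stereo 16-bit samples at 22050 Hz
header = build_canonical_header(num_channels=2, sample_rate=22050,
                                bits_per_sample=16, num_samples=512)
print(len(header))
print(header.hex())
```

The four-character IDs are written as raw ASCII bytes, while every numeric field is little-endian. The sample data would follow immediately after these 44 bytes.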

As an example, here are the opening 72 bytes of a WAVE file with bytes shown as hexadecimal numbers:

52 49 46 46 24 08 00 00 57 41 56 45 66 6d 74 20 10 00 00 00 01 00 02 00 
22 56 00 00 88 58 01 00 04 00 10 00 64 61 74 61 00 08 00 00 00 00 00 00 
24 17 1e f3 3c 13 3c 14 16 f9 18 f9 34 e7 23 a6 3c f2 24 f2 11 ce 1a 0d 

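Here is the interpretation of these bytes as a WAVE soundfile:

Bytes (hex)    Field            Value

52 49 46 46    ChunkID          "RIFF"
24 08 00 00    ChunkSize        2084 (36 + 2048)
57 41 56 45    Format           "WAVE"
66 6d 74 20    Subchunk1ID      "fmt "
10 00 00 00    Subchunk1Size    16
01 00          AudioFormat      1 (PCM)
02 00          NumChannels      2
22 56 00 00    SampleRate       22050
88 58 01 00    ByteRate         88200
04 00          BlockAlign       4
10 00          BitsPerSample    16
64 61 74 61    Subchunk2ID      "data"
00 08 00 00    Subchunk2Size    2048
00 00 00 00    Sample 1         left 0, right 0
24 17 1e f3    Sample 2         left 5924, right -3298
3c 13 3c 14    Sample 3         left 4924, right 5180
16 f9 18 f9    Sample 4         left -1770, right -1768
34 e7 23 a6    Sample 5         left -6348, right -23005
3c f2 24 f2    Sample 6         left -3524, right -3548
11 ce 1a 0d    Sample 7         left -12783, right 3354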
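
The same 72 bytes can also be decoded programmatically. The following is a sketch rather than a general-purpose reader: it assumes the canonical 44-byte layout, and the names `FIELDS` and `parse_canonical_header` are made up for illustration.

```python
import struct

# The 72 example bytes from the hex dump above.
EXAMPLE = bytes.fromhex(
    "52 49 46 46 24 08 00 00 57 41 56 45 66 6d 74 20 10 00 00 00 01 00 02 00 "
    "22 56 00 00 88 58 01 00 04 00 10 00 64 61 74 61 00 08 00 00 00 00 00 00 "
    "24 17 1e f3 3c 13 3c 14 16 f9 18 f9 34 e7 23 a6 3c f2 24 f2 11 ce 1a 0d "
)

FIELDS = (
    "ChunkID", "ChunkSize", "Format",
    "Subchunk1ID", "Subchunk1Size", "AudioFormat", "NumChannels",
    "SampleRate", "ByteRate", "BlockAlign", "BitsPerSample",
    "Subchunk2ID", "Subchunk2Size",
)

def parse_canonical_header(data):
    """Unpack the first 44 bytes of a canonical PCM WAVE file into a dict."""
    values = struct.unpack_from("<4sI4s4sIHHIIHH4sI", data, 0)
    return dict(zip(FIELDS, values))

header = parse_canonical_header(EXAMPLE)
for name in FIELDS:
    print(name, header[name])

# The data starts at offset 44. 16-bit samples are little-endian signed
# integers ("h"), interleaved as left, right, left, right, ...
payload = EXAMPLE[44:]
samples = struct.unpack("<%dh" % (len(payload) // 2), payload)
frames = list(zip(samples[0::2], samples[1::2]))
print(frames)
```

Only the first 28 bytes of the 2048-byte data subchunk are present in the excerpt, so the script decodes seven stereo frames.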


Notes:

  • The default byte ordering assumed for WAVE data files is little-endian. Files written using the big-endian byte ordering scheme have the identifier RIFX instead of RIFF.
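  • The sample data must end on an even byte boundary: if the data subchunk contains an odd number of bytes, a single pad byte follows it so that the next subchunk starts at an even offset. The pad byte is not counted in Subchunk2Size.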
  • 8-bit samples are stored as unsigned bytes, ranging from 0 to 255. 16-bit samples are stored as 2's-complement signed integers, ranging from -32768 to 32767.
  • There may be additional subchunks in a WAVE data stream. If so, each will have a char[4] SubChunkID, an unsigned long (4-byte) SubChunkSize, and SubChunkSize bytes of data.
  • RIFF stands for Resource Interchange File Format.
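
Because extra subchunks may appear, a reader should not assume that "data" always sits at offset 36. The sketch below walks the subchunks of a little-endian RIFF/WAVE stream and skips the pad byte after odd-sized chunks. The function names and the "note" chunk ID are hypothetical, chosen only for this illustration.

```python
import io
import struct

def iter_chunks(stream):
    """Yield (chunk_id, payload) for each subchunk of a RIFF/WAVE stream."""
    riff_id, riff_size, form = struct.unpack("<4sI4s", stream.read(12))
    if riff_id != b"RIFF" or form != b"WAVE":
        raise ValueError("not a little-endian RIFF/WAVE stream")
    remaining = riff_size - 4            # bytes left after the "WAVE" tag
    while remaining >= 8:
        chunk_id, size = struct.unpack("<4sI", stream.read(8))
        payload = stream.read(size)
        pad = size & 1                   # odd-sized chunks carry one pad byte
        stream.read(pad)
        remaining -= 8 + size + pad
        yield chunk_id, payload

def make_chunk(chunk_id, payload):
    """Serialize one subchunk, padding it to an even length."""
    pad = b"\x00" * (len(payload) & 1)
    return chunk_id + struct.pack("<I", len(payload)) + payload + pad

# Build a tiny mono 8-bit file in memory with an extra, made-up "note" chunk.
fmt = struct.pack("<HHIIHH", 1, 1, 8000, 8000, 1, 8)
body = (b"WAVE"
        + make_chunk(b"fmt ", fmt)
        + make_chunk(b"note", b"abc")
        + make_chunk(b"data", bytes([128, 130, 126])))
wav = b"RIFF" + struct.pack("<I", len(body)) + body

found = [(cid, len(payload)) for cid, payload in iter_chunks(io.BytesIO(wav))]
print(found)
```

The three data bytes are 8-bit unsigned samples, so 128 is the midpoint (silence). A real reader would look for the "fmt " and "data" IDs in the yielded chunks and ignore IDs it does not recognize.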

General discussion of RIFF files:

Multimedia applications require the storage and management of a wide variety of data, including bitmaps, audio data, video data, and peripheral device control information. RIFF provides a way to store all these varied types of data. The type of data a RIFF file contains is indicated by the file extension. Examples of data that may be stored in RIFF files are:

  • Audio/visual interleaved data (.AVI)
  • Waveform data (.WAV)
  • Bitmapped data (.RDI)
  • MIDI information (.RMI)
  • Color palette (.PAL)
  • Multimedia movie (.RMN)
  • Animated cursor (.ANI)
  • A bundle of other RIFF files (.BND)

NOTE: At this point, AVI files are the only type of RIFF files that have been fully implemented using the current RIFF specification. Although WAV files have been implemented, these files are very simple, and their developers typically use an older specification in constructing them.

For more info see http://www.ora.com/centers/gff/formats/micriff/index.htm


References:

  1. http://www.ora.com/centers/gff/formats/micriff/index.htm (good).
  2. http://premium.microsoft.com/msdn/library/tools/dnmult/d1/newwave.htm
  3. http://www.lightlink.com/tjweber/StripWav/WAVE.html
