Huffman Compression and Decompression in Java

The complete code is available here:

http://download.csdn.net/download/u010485034/7847447

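I won't go over the principles of Huffman coding here. Instead, this post walks through the details of implementing compression and decompression with Huffman coding.

The project is implemented in Java.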

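Encoding

1. Flowchart

2. Data structures

CharacterWeight: records a character value and its weight (number of occurrences) in the file to be compressed.

class CharacterWeight {
    char c;       // character value
    int weight;   // weight in the file
    String code;  // its Huffman code
}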
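As an illustration of where the weights come from, here is a minimal sketch that counts character occurrences in a string. `WeightCounter` and `countWeights` are hypothetical names used only for this example; the original project presumably reads the characters from the input file instead.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper for illustration; not part of the original project.
final class WeightCounter {
    /** Counts how often each character occurs in the given text. */
    static Map<Character, Integer> countWeights(String text) {
        Map<Character, Integer> weights = new TreeMap<>();
        for (int i = 0; i < text.length(); i++) {
            // add 1 to the running count for this character
            weights.merge(text.charAt(i), 1, Integer::sum);
        }
        return weights;
    }

    public static void main(String[] args) {
        System.out.println(countWeights("abracadabra"));
    }
}
```

Each map entry corresponds to one CharacterWeight record: the key is the character and the value is its weight.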

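HuffmanNode: information about one node in the Huffman tree.

class HuffmanNode {
    int parent;  // index of the parent node
    int lChild;  // index of the left child
    int rChild;  // index of the right child
    int weight;  // weight
}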

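3. Key points of the program

3.1 Building the Huffman tree

Huffman tree variable: ArrayList<HuffmanNode> (referenced as tree in the code below; list appears to hold the CharacterWeight entries, one per distinct character).

Construction procedure: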

for (int i = 0; i < list.size() - 1; i++) {
    // w1: smallest weight, w2: second-smallest weight
    // i1: index of the smallest weight, i2: index of the second-smallest weight
    int w1 = Integer.MAX_VALUE, w2 = Integer.MAX_VALUE;
    int i1 = 0, i2 = 0;
    // find the two parentless nodes with the minimum weights
    for (int j = 0; j < tree.size(); j++) {
        HuffmanNode node = tree.get(j);
        if (node.getWeight() < w1 && node.getParent() == -1) {
            w2 = w1;
            w1 = node.getWeight();
            i2 = i1;
            i1 = j;
        } else if (node.getWeight() < w2 && node.getParent() == -1) {
            w2 = node.getWeight();
            i2 = j;
        }
    }
    // make the two nodes the children of a new node and append it to the tree
    HuffmanNode pNode = new HuffmanNode(w1 + w2);
    pNode.setlChild(i1);
    pNode.setrChild(i2);
    tree.add(pNode);
    // the new node was just appended, so its index is tree.size() - 1
    int pIndex = tree.size() - 1;
    tree.get(i1).setParent(pIndex);
    tree.get(i2).setParent(pIndex);
}
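To try the construction on its own, here is a self-contained sketch of the same idea: repeatedly pick the two parentless nodes with the smallest weights and merge them. `TreeBuildSketch`, `Node` and `buildTree` are hypothetical names for this example, and the nested `Node` class only stands in for the project's HuffmanNode.

```java
import java.util.ArrayList;
import java.util.List;

// Minimal sketch; the names here are assumptions, not the project's classes.
final class TreeBuildSketch {
    static final class Node {
        int weight, parent = -1, lChild = -1, rChild = -1;
        Node(int weight) { this.weight = weight; }
    }

    /** Builds the tree as a list: leaves come first, the root ends up last. */
    static List<Node> buildTree(int[] leafWeights) {
        List<Node> tree = new ArrayList<>();
        for (int w : leafWeights) {
            tree.add(new Node(w));
        }
        // n leaves need n - 1 merges
        for (int round = 0; round < leafWeights.length - 1; round++) {
            int i1 = -1, i2 = -1; // smallest and second-smallest parentless nodes
            for (int j = 0; j < tree.size(); j++) {
                Node n = tree.get(j);
                if (n.parent != -1) continue; // already merged
                if (i1 == -1 || n.weight < tree.get(i1).weight) {
                    i2 = i1;
                    i1 = j;
                } else if (i2 == -1 || n.weight < tree.get(i2).weight) {
                    i2 = j;
                }
            }
            Node p = new Node(tree.get(i1).weight + tree.get(i2).weight);
            p.lChild = i1;
            p.rChild = i2;
            tree.add(p);
            int pIndex = tree.size() - 1; // index of the node just appended
            tree.get(i1).parent = pIndex;
            tree.get(i2).parent = pIndex;
        }
        return tree;
    }
}
```

For n leaves the list ends up with 2n - 1 nodes, and the root's weight equals the sum of all leaf weights. The linear scan makes construction O(n^2), which is fine for an alphabet of a few hundred characters; a priority queue would be the usual alternative for larger inputs.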

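3.2 Deriving the Huffman codes from the Huffman tree

Starting from a leaf node, walk up the Huffman tree until the root is reached. At each step, whether the current node is the left or the right child of its parent determines whether the bit is 0 or 1.

The bits are collected from leaf to root, so the resulting string of 0s and 1s must be reversed to obtain the Huffman code. The code below achieves this by prepending each new bit to the string.

Code: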

for (int i = 0; i < list.size(); i++) {
    int idx = i;                           // index of the current node, starting at leaf i
    int pIdx = tree.get(idx).getParent();  // index of its parent, -1 at the root
    String code = "";
    // a tree with a single node has no parent to walk to, so the code stays empty
    while (pIdx != -1) {
        HuffmanNode pNode = tree.get(pIdx);
        if (pNode.getlChild() == idx) {
            code = "0" + code;  // left child: prepend 0
        } else if (pNode.getrChild() == idx) {
            code = "1" + code;  // right child: prepend 1
        } else {
            System.out.println("Tree Node Error!!!");
            return null;
        }
        // move one level up
        idx = pIdx;
        pIdx = pNode.getParent();
    }
    list.get(i).setCode(code);
}
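The "collect, then reverse" description can also be sketched literally. The example below assumes the tree is stored as three parallel index arrays, which is a simplification of the HuffmanNode list; `CodeSketch` and `codeFor` are hypothetical names.

```java
// Minimal sketch under assumed array-based storage; not the project's code.
final class CodeSketch {
    /** Walks from a leaf up to the root, then reverses the collected bits. */
    static String codeFor(int leaf, int[] parent, int[] lChild, int[] rChild) {
        StringBuilder bits = new StringBuilder();
        int idx = leaf;
        while (parent[idx] != -1) {
            int p = parent[idx];
            if (lChild[p] == idx) {
                bits.append('0');
            } else if (rChild[p] == idx) {
                bits.append('1');
            } else {
                throw new IllegalStateException("inconsistent tree at node " + idx);
            }
            idx = p;
        }
        // bits were collected leaf-to-root; the code reads root-to-leaf
        return bits.reverse().toString();
    }
}
```

With parent = {3, 3, 4, 4, -1}, lChild = {-1, -1, -1, 0, 2} and rChild = {-1, -1, -1, 1, 3}, leaf 0 gets the code 10, leaf 1 gets 11, and leaf 2 gets 0.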

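3.3 File header design

Field	Layout	Size
Total number of characters	int	4 bytes
Number of distinct characters	short	2 bytes
Leaf node	char (character), short (parent)	3 bytes
Non-leaf node	short (left child), short (right child), short (parent)	6 bytes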

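Header length (in bytes):

l = 4 + 2 + 3n + 6(n - 1) = 9n

where n is the number of distinct characters. A Huffman tree with n leaf nodes has n - 1 non-leaf nodes, which gives the 6(n - 1) term.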
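As a sanity check on the 9n figure, here is a sketch that writes a header in this layout to an in-memory buffer. The array-based tree and the names `HeaderSketch` and `writeHeader` are assumptions for illustration; the project's own writing code is not shown in this post.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Minimal sketch of the header layout; names and array storage are assumptions.
final class HeaderSketch {
    /** Writes the header for a tree whose leaves occupy indices 0..n-1. */
    static byte[] writeHeader(int totalChars, char[] leafChars,
                              int[] parent, int[] lChild, int[] rChild) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        int n = leafChars.length;
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(totalChars);               // 4 bytes
            out.writeShort(n);                      // 2 bytes
            for (int i = 0; i < n; i++) {           // 3 bytes per leaf
                out.writeByte(leafChars[i]);        // low 8 bits of the char only
                out.writeShort(parent[i]);
            }
            for (int i = n; i < 2 * n - 1; i++) {   // 6 bytes per non-leaf node
                out.writeShort(lChild[i]);
                out.writeShort(rChild[i]);
                out.writeShort(parent[i]);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }
}
```

For a tree with three leaves the result is 27 bytes long, matching l = 9n. Note that storing a character in one byte only works for characters in the range 0 to 255.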

3.4 Encoding and writing the file content


while ((temp = reader.read()) != -1) { // -1 signals end of file
    // look up the code for this character in the code table
    String code = codeTable.get((char) temp);
    // progress bar: print one "=" for every count/96 characters processed
    c++;
    if (c >= count / 96) {
        System.out.print("=");
        c = 0;
    }
    try {
        StringBuilder codeString = new StringBuilder(code);
        outputStringBuffer.append(codeString);
        // while more than 8 bits are buffered, write the first 8 as one byte
        while (outputStringBuffer.length() > 8) {
            out.write(Integer.parseInt(outputStringBuffer.substring(0, 8), 2));
            outputStringBuffer.delete(0, 8);
        }
    } catch (Exception e) {
        e.printStackTrace();
    }
}
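The loop above leaves some bits in the buffer when the input ends, and the excerpt does not show how they are flushed. Below is a sketch of the bit-packing step including one possible final flush. Zero-padding the last byte is an assumption on my part, and `BitPackSketch` and `pack` are hypothetical names.

```java
import java.io.ByteArrayOutputStream;

// Minimal sketch of bit packing; the zero-padding of the last byte is an assumption.
final class BitPackSketch {
    /** Packs a string of '0'/'1' characters into bytes, most significant bit first. */
    static byte[] pack(String bits) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        StringBuilder buffer = new StringBuilder(bits);
        // write every complete group of 8 bits as one byte
        while (buffer.length() >= 8) {
            out.write(Integer.parseInt(buffer.substring(0, 8), 2));
            buffer.delete(0, 8);
        }
        // pad the leftover bits with zeros on the right and write a final byte
        if (buffer.length() > 0) {
            while (buffer.length() < 8) {
                buffer.append('0');
            }
            out.write(Integer.parseInt(buffer.toString(), 2));
        }
        return out.toByteArray();
    }
}
```

Because the padding bits are indistinguishable from real code bits, the decoder needs the total character count from the header to know when to stop.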

Decoding

1. Flowchart


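2. Data structures

HuffmanNode: information about one node in the Huffman tree.

class HuffmanNode {
    int parent;  // index of the parent node
    int lChild;  // index of the left child
    int rChild;  // index of the right child
    int weight;  // weight
    char c;      // the corresponding character value
}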

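3. Key points of the program

3.1 Rebuilding the Huffman tree. What the file header stores is exactly the node information of the Huffman tree, so rebuilding the tree is fairly simple.

Code: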

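in = new DataInputStream(new FileInputStream(file));
count = in.readInt();        // total number of characters
charNum = in.readShort();    // number of distinct characters (leaf nodes)
nodeNum = 2 * charNum - 1;   // a tree with charNum leaves has 2 * charNum - 1 nodes
// rebuild the Huffman tree: leaf nodes first
for (int i = 0; i < charNum; i++) {
    // read the character as an unsigned byte so values >= 0x80 are not sign-extended
    HuffmanNode node = new HuffmanNode((char) in.readUnsignedByte());
    int parent = in.readShort();
    node.setParent(parent);
    tree.add(node);
}

// then the non-leaf nodes, which carry no character of their own
for (int i = charNum; i < nodeNum; i++) {
    HuffmanNode node = new HuffmanNode(' ');
    int l = in.readShort();
    int r = in.readShort();
    int p = in.readShort();
    node.setlChild(l);
    node.setrChild(r);
    node.setParent(p);
    tree.add(node);
}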
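One detail worth checking when a character is stored in a single header byte: in Java, casting a signed byte to char sign-extends values of 0x80 and above. The sketch below compares the two DataInputStream calls on the same byte; `ByteToCharSketch` and its method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Minimal sketch comparing signed and unsigned byte reads; names are assumptions.
final class ByteToCharSketch {
    /** Reads the first byte as a signed value and casts it to char. */
    static char signedRead(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            return (char) in.readByte();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads the first byte as an unsigned value (0..255) and casts it to char. */
    static char unsignedRead(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            return (char) in.readUnsignedByte();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

For the byte 0xE9, the signed read yields the char 0xFFE9 while the unsigned read yields 0x00E9. For plain ASCII input both calls give the same result.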

3.2 Decoding

Decoding procedure:

try {
    while (true) {
        // refill the bit buffer, one 32-bit int at a time, until it holds at least 32 bits
        while (buff.length() < 32) {
            temp = in.readInt();
            String codeString = Integer.toBinaryString(temp);
            // left-pad with zeros so every int contributes exactly 32 bits
            while (codeString.length() < 32) {
                codeString = '0' + codeString;
            }
            buff.append(codeString);
        }
        // walk from the root (the last node in the list) down to a leaf
        node = tree.get(tree.size() - 1);
        dep = 0;
        while (!(node.getlChild() == -1 && node.getrChild() == -1)) {
            if (dep >= buff.length()) {
                System.out.println("Buff overflow");
            }
            if (buff.charAt(dep) == '0') {
                node = tree.get(node.getlChild());
            } else if (buff.charAt(dep) == '1') {
                node = tree.get(node.getrChild());
            } else {
                System.out.println("Coding error");
            }
            dep++;
        }

        char c = node.getCH();
        // progress bar: print one "=" for every n/99 characters decoded
        num++;
        if (num >= n / 99) {
            System.out.print("=");
            num = 0;
        }
        count++;
        if (count >= n) {
            break;
        }
        // buffer decoded characters and write them out in chunks
        charBuff += c;
        if (charBuff.length() > 256) {
            writer.write(charBuff);
            charBuff = "";
        }
        // drop the bits consumed by this character
        buff.delete(0, dep);
    }
} catch (EOFException e) {
    // end of input reached; leftover bits are handled in the finally block
} catch (Exception e) {
    e.printStackTrace();
} finally {
    // bits may remain in buff and characters in charBuff, so process them here
    while (buff.length() > 0) {
        node = tree.get(tree.size() - 1);
        dep = 0;
        while (!(node.getlChild() == -1 && node.getrChild() == -1)) {
            if (dep >= buff.length()) {
                break;
            }
            if (buff.charAt(dep) == '0') {
                node = tree.get(node.getlChild());
            } else if (buff.charAt(dep) == '1') {
                node = tree.get(node.getrChild());
            } else {
                System.out.println("Coding error");
            }
            dep++;
        }
        char c = node.getCH();
        num++;
        if (num >= n / 99) {
            System.out.print("=");
            num = 0;
        }
        count++;
        if (count >= n) {
            break;
        }
        charBuff += c;
        if (charBuff.length() > 256) {
            try {
                writer.write(charBuff);
            } catch (IOException e1) {
                e1.printStackTrace();
            }
            charBuff = "";
        }
        buff.delete(0, dep);
    }

    // write whatever is still buffered, then close the writer
    try {
        writer.write(charBuff);
        writer.close();
    } catch (IOException e) {
        e.printStackTrace();
    }
}
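Stripped of buffering and progress output, the core of decoding is a repeated walk from the root to a leaf, one bit per step. Here is a self-contained sketch of that walk over a string of bits. The array-based tree and the names `DecodeSketch` and `decode` are assumptions for illustration, not the project's code.

```java
// Minimal sketch of tree-walk decoding; names and array storage are assumptions.
final class DecodeSketch {
    /**
     * Decodes exactly count characters from a string of '0'/'1' bits.
     * Leaves occupy indices 0..n-1 and the root is the last node.
     */
    static String decode(String bits, int count, char[] leafChars,
                         int[] lChild, int[] rChild) {
        int root = lChild.length - 1;
        StringBuilder out = new StringBuilder();
        int pos = 0;
        while (out.length() < count) {
            int node = root;
            // descend until a leaf (a node without children) is reached
            while (lChild[node] != -1 || rChild[node] != -1) {
                if (pos >= bits.length()) {
                    throw new IllegalArgumentException("ran out of bits");
                }
                node = bits.charAt(pos++) == '0' ? lChild[node] : rChild[node];
            }
            out.append(leafChars[node]);
        }
        return out.toString();
    }
}
```

With leafChars = {'a', 'b', 'c'}, lChild = {-1, -1, -1, 0, 2} and rChild = {-1, -1, -1, 1, 3}, decoding the bits 01011000 with a count of 3 gives cab; the three trailing zeros are padding and are never consumed because decoding stops once the count is reached.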

The complete project has not been published yet; this post will be updated later.

Time: 2024-10-21 22:29:04
