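On the cross-platform problem with crc32 in zlib

CRC checks

With the CRC basics taught at school, using zlib's crc32 function directly should pose no barrier at all. Here is the explanation and usage example that ships in zlib.h: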

     Update a running CRC-32 with the bytes buf[0..len-1] and return the
   updated CRC-32. If buf is NULL, this function returns the required initial
   value for the for the crc. Pre- and post-conditioning (one's complement) is
   performed within this function so it shouldn't be done by the application.
   Usage example:

     uLong crc = crc32(0L, Z_NULL, 0);

     while (read_buffer(buffer, length) != EOF) {
       crc = crc32(crc, buffer, length);
     }
     if (crc != original_crc) error();

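A CRC check can be done in two ways:

1. Pass the original buffer to crc32 to produce a checksum, then append the checksum to the end of the buffer before sending it or writing it to a file. On reading, parse the original buffer and the checksum separately, run crc32 over the buffer again, and compare the result with the received checksum to decide whether the buffer is intact.
2. Run crc32 over the original buffer, then append the bitwise complement of the checksum to the end of the buffer before sending it or writing it to a file. On reading, run crc32 over the whole content (buffer + crcReverseCode) and check whether the result is 0xFFFFFFFF.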

Method 1 is a bit clumsy, so we usually use method 2. Since the comparison target is 0xFFFFFFFF, crc32 must be producing a 32-bit value.
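The two methods above can be sketched as follows. This is a minimal sketch, not zlib itself: `crc32_bitwise` is a hypothetical stand-in for zlib's `crc32` (same reflected polynomial 0xEDB88320 and the same pre- and post-conditioning) so the block compiles without linking zlib, and `frame`, `verify_method1` and `verify_method2` are illustrative names. It assumes the checksum is stored as exactly four bytes, least significant byte first.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for zlib's crc32(): bit-by-bit CRC-32 with the
// reflected polynomial 0xEDB88320 and one's-complement pre/post conditioning.
std::uint32_t crc32_bitwise(std::uint32_t crc, const unsigned char* buf, std::size_t len) {
    crc = ~crc;
    for (std::size_t i = 0; i < len; ++i) {
        crc ^= buf[i];
        for (int bit = 0; bit < 8; ++bit)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return ~crc;
}

// Append the checksum (method 1) or its complement (method 2) as four
// little-endian bytes, independent of the host's integer widths.
std::vector<unsigned char> frame(const std::string& payload, bool complement) {
    std::vector<unsigned char> out(payload.begin(), payload.end());
    std::uint32_t code = crc32_bitwise(0, out.data(), out.size());
    if (complement) code = ~code;
    for (int i = 0; i < 4; ++i)
        out.push_back(static_cast<unsigned char>((code >> (8 * i)) & 0xFFu));
    return out;
}

// Method 1: split payload and stored checksum, recompute, compare.
bool verify_method1(const std::vector<unsigned char>& f) {
    if (f.size() < 4) return false;
    const std::size_t n = f.size() - 4;
    std::uint32_t stored = 0;
    for (int i = 0; i < 4; ++i)
        stored |= static_cast<std::uint32_t>(f[n + i]) << (8 * i);
    return crc32_bitwise(0, f.data(), n) == stored;
}

// Method 2: CRC over payload plus complemented checksum must be 0xFFFFFFFF.
bool verify_method2(const std::vector<unsigned char>& f) {
    return f.size() >= 4 && crc32_bitwise(0, f.data(), f.size()) == 0xFFFFFFFFu;
}
```

Typical use would be `verify_method2(frame(payload, true))` on the receiving side. Method 2 needs no parsing of the trailer, which is why it is the more convenient of the two.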

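My question

Looking at this code again today, a question came up: if the 32 in crc32 means 32 bits, why does it use uLong (defined as typedef unsigned long uLong)? unsigned long clearly does not have the same width on every platform (see: "The problem with unsigned long"), which looks like a trap waiting to be stepped in. With that question in mind, I ran the following test on 64-bit Linux: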

// g++ crc_test.cpp -o exec_crc_test  -L /usr/lib64/ -lz

#include <stdio.h>
#include <stdlib.h>
#include <zlib.h>
#include <string.h>

void CRC_Partial_Check_Test();
template<class Type> void CRC_Full_Check_Test();

// crc32 uses uLong, typedef unsigned long uLong;
typedef uLong ulong_t;	// 32 bits on some platforms, 64 bits on others (64 on this test machine).

int main()
{
	CRC_Partial_Check_Test();
	CRC_Full_Check_Test<unsigned int>();
	CRC_Full_Check_Test<ulong_t>();

	printf("any key pressed to exit...\n");
	getchar();

	return 0;
}

typedef union
{
	ulong_t val;
	unsigned char buf[sizeof(ulong_t)];
} trans_un_t;

void CRC_Partial_Check_Test()
{
	printf("=============================>>CRC_Partial_ChecK_Test\n");

	const char* buffer = "hello world, i am renyafei";	// string literals are const in C++
	int buffer_sz = strlen(buffer);

	{
		FILE* fp = fopen("crc.dat", "wb");
		fwrite(buffer, 1, buffer_sz, fp);
		ulong_t crc_code = crc32(0, (const Bytef*)buffer, buffer_sz);

		printf("crc_code : %lu\n", crc_code);

		fwrite(&crc_code, 1, sizeof(ulong_t), fp);

		fflush(fp);
		fclose(fp);
	}

	{
		FILE* fp = fopen("crc.dat", "rb");
		unsigned char content[1024] = {0};
		int read_bytes = fread(content, 1, buffer_sz+sizeof(ulong_t), fp);
		ulong_t crc_code = 0;
		{
			// get crc code
			unsigned char* pch = content + buffer_sz;

			trans_un_t trans;
			for(size_t k=0; k<sizeof(ulong_t); k++)
			{
				printf("%d ", pch[k]);
				trans.buf[k] = pch[k];
			} printf("\n");

			printf("crc_code : %lu\n", trans.val);
			crc_code = trans.val;
		}		

		if ( crc32(0, content, buffer_sz) != crc_code )
		{
			printf("ERROR content.\n");
		}
		else
		{
			printf("Good Content.\n");
		}

		fclose(fp);
	}
}

template<class Type>
void CRC_Full_Check_Test()
{
	printf("=============================>>CRC_Full_ChecK_Test\n");

	const char* buffer = "hello world, i am renyafei";	// string literals are const in C++
	int buffer_sz = strlen(buffer);

	typedef Type dest_t;

	{
		FILE* fp = fopen("crc.dat", "wb");
		fwrite(buffer, 1, buffer_sz, fp);

		ulong_t crc = crc32(0, (const Bytef*)buffer, buffer_sz);
		dest_t rever_crc = (dest_t)~crc;

		// rever_crc may be unsigned int, so cast it to match %lu
		printf("crc = %lu, reverse_crc_code : %lu\n", crc, (unsigned long)rever_crc);

		fwrite(&rever_crc, 1, sizeof(dest_t), fp);
		fflush(fp);
		fclose(fp);
	}

	{
		FILE* fp = fopen("crc.dat", "rb");
		unsigned char content[1024] = {0};
		int read_bytes = fread(content, 1, buffer_sz+sizeof(dest_t), fp);
		{
			// get crc code
			unsigned char* pch = content + buffer_sz;

			trans_un_t trans;
			trans.val = 0;	// clear first: the file holds only sizeof(dest_t) checksum bytes
			memcpy(trans.buf, pch, sizeof(dest_t));

			printf("reverse_crc_code : %lu\n", trans.val);

		}

		ulong_t res = crc32(0, content, read_bytes);	printf("res = %lu\n", res);

		if ( res != 0xFFFFFFFF && res != 0xFFFFFFFFFFFFFFFF )
		{
			printf("ERROR content.\n");
		}
		else
		{
			printf("Good Content.\n");
		}

		fclose(fp);
	}
}

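CRC_Partial_Check_Test is method 1 above, and CRC_Full_Check_Test is method 2.

In each function, the first block computes the checksum of a string and writes the string and the checksum to a binary file. The second block reads the string and the checksum back from that file and runs the CRC check to decide whether the string is intact.

Program output: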

=============================>>CRC_Partial_ChecK_Test
crc_code : 4232166920
8 190 65 252 0 0 0 0
crc_code : 4232166920
Good Content.
=============================>>CRC_Full_ChecK_Test
crc = 4232166920, reverse_crc_code : 62800375
reverse_crc_code : 62800375
res = 4294967295
Good Content.
=============================>>CRC_Full_ChecK_Test
crc = 4232166920, reverse_crc_code : 18446744069477384695
reverse_crc_code : 18446744069477384695
res = 558161692
ERROR content.

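CRC_Partial_Check_Test works even with uLong. Why? The checksum itself fits in 32 bits. Stored in a 64-bit unsigned long it is zero-extended, so the eight bytes written to the file, from low to high, are 8 190 65 252 0 0 0 0. Reading them back restores the same value, and both sides of the final comparison are uLong, so the result is unaffected. Note, though, that the file now carries an 8-byte trailer, whereas a build in which unsigned long is 32 bits would write only 4 bytes.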
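One way to keep the on-disk layout identical on every platform is to serialize the checksum byte by byte instead of writing the in-memory uLong. The sketch below assumes a 4-byte, little-endian trailer; `encode_u32_le` and `decode_u32_le` are hypothetical helper names, not part of zlib.

```cpp
#include <array>
#include <cstdint>

// Encode a 32-bit checksum as exactly four bytes, least significant first,
// regardless of sizeof(unsigned long) or the host's byte order.
std::array<unsigned char, 4> encode_u32_le(std::uint32_t v) {
    std::array<unsigned char, 4> out{};
    for (int i = 0; i < 4; ++i)
        out[i] = static_cast<unsigned char>((v >> (8 * i)) & 0xFFu);
    return out;
}

// Decode four little-endian bytes back into a 32-bit checksum.
std::uint32_t decode_u32_le(const unsigned char* p) {
    std::uint32_t v = 0;
    for (int i = 0; i < 4; ++i)
        v |= static_cast<std::uint32_t>(p[i]) << (8 * i);
    return v;
}
```

With the checksum from the test run, 4232166920, this yields the bytes 8 190 65 252 and nothing more, so the trailer is always 4 bytes long.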

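CRC_Full_Check_Test<unsigned int>() is the way we normally use it. The CRC_Full_Check_Test<uLong> call fails the check, and the problem lies in complementing the CRC. On this platform uLong is 64 bits, so ~crc also sets the upper 32 bits (18446744069477384695 is 0xFFFFFFFF03BE41F7), and sizeof(uLong) = 8 bytes are appended to the file instead of 4. Running crc32 over the whole content then also consumes the four extra 0xFF bytes, so the result is no longer 0xFFFFFFFF and the check fails.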
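The width dependence of the complement can be reproduced without zlib. In this sketch `std::uint64_t` stands in for a 64-bit unsigned long so that the behavior is the same on any host; both function names are hypothetical.

```cpp
#include <cstdint>

// Complement taken in a 64-bit type: the upper 32 bits become all ones.
std::uint64_t complement_as_64bit_ulong(std::uint64_t crc) {
    return ~crc;
}

// Complement narrowed to 32 bits: the value that method 2 needs to append.
std::uint32_t complement_as_u32(std::uint64_t crc) {
    return static_cast<std::uint32_t>(~crc);
}
```

For the checksum 4232166920 from the test run, the first function gives 18446744069477384695 and the second gives 62800375, matching the two values in the program output. Narrowing the result of crc32 to a fixed 32-bit type before complementing and writing it is one way to make method 2 behave the same everywhere.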

A weird shift result

During testing I ran into a pitfall with shifts:

ulong_t code = 0xfc << 24;
unsigned int ui_code = 0xfc << 24;
printf("ulong_res = %lu, uint_res = %u\n", code, ui_code);	// %u for unsigned int

code = 0xfc << 8;
ui_code = 0xfc << 8;
printf("ulong_res = %lu, uint_res = %u\n", code, ui_code);

The output is:

ulong_res = 18446744073642442752, uint_res = 4227858432
ulong_res = 64512, uint_res = 64512

The ulong_res value on the first line looks strange. The dump_val_hex function below prints a number in hexadecimal form:

template<class Type> void dump_val_hex_recur(Type val)
{
	if (val == 0)
	{
		return;
	}

	unsigned char byte = val & 0xFF;

	dump_val_hex_recur(val >> 8);

	printf("%02x ", byte);
}

template <class Type> void dump_val_hex(Type val) // 0xfc000000 => printf : fc 00 00 00
{
	if (val == 0)
	{
		printf("%02x\n", 0);
		return;
	}

	dump_val_hex_recur(val); printf("\n");
}

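18446744073642442752 in hexadecimal is ff ff ff ff fc 00 00 00. A weird shift result: the high bits are filled with 1s, yet for 0xfc << 8 the high bits are filled with 0s. The shift itself is not what fills in the 1s. 0xfc is an int literal, so 0xfc << 24 is evaluated as a 32-bit signed int. The result has the sign bit set, so here it comes out as a negative int, and converting a negative int to the 64-bit ulong_t sign-extends it. 0xfc << 8 stays positive, so its upper bits are 0. Shifting an unsigned operand, for example 0xfcUL << 24, avoids the problem.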
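The effect can be isolated from the shift itself. The sketch below uses fixed-width types so it behaves the same on any host: `std::int32_t` plays the role of int and `std::uint64_t` the role of a 64-bit ulong_t. Both function names are hypothetical.

```cpp
#include <cstdint>

// What happens after the shift: a negative 32-bit int converted to an
// unsigned 64-bit type is sign-extended, so the upper 32 bits become ones.
std::uint64_t widen_from_int(std::int32_t v) {
    return static_cast<std::uint64_t>(v);
}

// Doing the shift in an unsigned 64-bit type keeps the upper bits zero.
std::uint64_t shift_in_unsigned(unsigned byte, unsigned n) {
    return static_cast<std::uint64_t>(byte) << n;
}
```

The bit pattern 0xFC000000 read as a 32-bit signed int is -67108864, and `widen_from_int(-67108864)` gives 18446744073642442752, the value printed above. `shift_in_unsigned(0xfc, 24)` gives 4227858432, the expected result.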

Time: 2025-01-09 08:00:58
