[工作积累] 32bit to 64bit: array index underflow

先贴一段C++标准(ISO/IEC 14882:2003):

5.2.1 Subscripting:

1 A postfix expression followed by an expression in square brackets is a postfix expression. One of the
expressions shall have the type “pointer to T” and the other shall have enumeration or integral type. The
result is an lvalue of type “T.” The type “T” shall be a completely-defined object type.56) The expression
E1[E2] is identical (by definition) to *((E1)+(E2)). [Note: see 5.3 and 5.7 for details of * and + and
8.3.4 for details of arrays. ]

5.7 Additive operators:

5 When an expression that has integral type is added to or subtracted from a pointer, the result has the type of the pointer operand. If the pointer operand points to an element of an array object, and the array is large enough, the result points to an element offset from the original element such that the difference of the subscripts of the resulting and original array elements equals the integral expression. In other words, if the expression P points to the i-th element of an array object, the expressions (P)+N (equivalently, N+(P)) and (P)-N (where N has the value n) point to, respectively, the i+n-th and i–n-th elements of the array object, provided they exist. Moreover, if the expression P points to the last element of an array object, the expression (P)+1 points one past the last element of the array object, and if the expression Q points one past the last element of an array object, the expression (Q)-1 points to the last element of the array object. If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined.

最近工作中遇到的问题是在64位IOS上:

1 uint32 i;
2 ...
3 char* n = (char*)&array[i-1] + n;

当i为0的时候, 下标溢出.

对照标准, 可以看出, array[i-1]相当于*(array+uint32(i-1)), 当i-1计算溢出时, 下标的数值很大(0xFFFFFFFFU), 数组的寻址也溢出了, 结果是undefined behavior.

然而在32位上的结果正确, 为什么呢?

因为0xFFFFFFFF用来寻址的时候, 对于大多数32位CPU来说, 都是带符号的数值, 所以0xFFFFFFFF被CPU解释为-1.

这个时候array[0xFFFFFFFF]相当与array[-1].

然而对于C++来说, 这只是一个巧合. 根据标准定义, 他是undefined behavior. 所以在64位上出错也不奇怪.

对于64位, 当i为0时, 下标计算结果仍然是0xFFFFFFFF, 然而64位上的-1是0xFFFFFFFFFFFFFFFF, 显然0xFFFFFFFF是一个很大的正整数, 导致真正溢出.

解决办法是它转换为带符号数, 即: array[ int(i-1) ]. 这样的结果是-1, 而不是0xFFFFFFFFU.

回想blade的下标索引, 基本上都用的是size_t:

1 typedef size_t index_t;
2 index_t i;

因为size_t在64位系统上是64字节, 所以i-1得到的结果是0xFFFFFFFFFFFFFFFF, 这样也能保证基本上不出错.

然而这么做仍然是undefined behavior, 因为0xFFFFFFFFFFFFFFFF在用作寻址的时候, 是作为有符号数还是无符号数, 这个是由CPU来定义的, 虽然大部分CPU为了灵活都是用的是带符号的.

但是这是CPU的细节. 假如现在有一个公司, 如BMD或者UNTEL公司生产了新型的CPU, 其对与类似

mov eax, [ebx + esi*4]
lea eax, [ebx + esi]

的指令, 把索引寄存器(例如esi)解释为无符号数, 而在地址溢出时做了处理, 比如CPU异常信号或者地址回卷等等, 那么结果又很难说了.

这已经设计到CPU的细节, 超出了C++可控的范围, 所以C++标准才认为这是undefined behavior.

最好的方式是用带符号的类型, 比如int/long/long long/ptrdiff_t来作为下标, 当然有时候不能控制下标变量i的类型(来源), 那么在使用时注意转换.

或者, 将&array[i-1]改为 array + i - 1, 注意不是array + (i-1)

这样能够避免i为0时的溢出.

时间： 2024-12-07 04:13:14

[工作积累] 32bit to 64bit: array index underflow的相关文章

工作积累之NDK编译STL (zhuan)

方法: 1.在jni目录下新建Application.mk; 加入 APP_STL := stlport_static 右边的值还可以换成下面几个: system - 使用默认最小的C++运行库,这样生成的应用体积小,内存占用小,但部分功能将无法支持 stlport_static - 使用STLport作为静态库,这项是Android开发网极力推荐的 stlport_shared - STLport 作为动态库,这个可能产生兼容性和部分低版本的Android固件,目前不推荐使用. gnust

工作积累（五）——使用[email protected]注解实现常量功能

之前的博客中提到过如何通过 java.util.ResourceBundle 和 java.util.Properties类通过读取 key-value 文件的形式实现常量功能.其实 spring 已经通过@Value 注解实现,下面看看如何使用. 1.创建.properties文件: 在如下目录创建 keyvalue.properties文件src/main/resources/META-INF/spring/keyvalue.properties ,写入如下内容: test.value=il

[工作积累] NDK通过Java获取package name 和version

////////////////////////////////////////////////////////////////////////// //Java code snippet //get APK's versionCode in AndroidManifest.xml public int getVersionCode() { int versionCode = 1; try{ PackageInfo packageInfo = this.getPackageManager().g

HAL层Camera模块Dump图片--工作积累

导出YUN数据进行调试,分析问题: 1 // dump图像数据事列 2 void dump(const int width, const int height, void *yBuf, void *uvBuf) 3 { 4 char buf[256] = {'\0'}; 5 FILE* file_fd = fopen(buf, "wb"); 6 snprintf(buf, sizeof(buf), "/data/Effect/%dx%dvideodenoiser%d.yuv

negative array index read

Memory - illegal accesses:negative array index read 负的数组索引读取 This is only valid if arr is a pointer that points to the second element in an array or a later element. Otherwise, it is not valid, because you would be accessing memory outside the bounds

[工作积累] Google/Amazon平台的各种坑

所谓坑, 就是文档中没有标明的特别需要处理的细节, 工作中会被无故的卡住各种令人恼火的问题. 包括系统级的bug和没有文档化的限制. 继Android的各种坑后, 现在做Amazon平台, 遇到的坑很多, 这里记录一下备忘: 先汇总下Android Native下的各种问题, 当然有些限制有明确文档说明,不算坑,但是限制太多还是很不爽: android平台下的某些限制: android下的各种坑 (我的C/C++/汇编/计算机原理博客) OBB的各种bug: OBB的解决方案 arm gcc t

[转载]Memory Limits for 32-bit and 64-bit processes

During our recent blog chat, there were a number of topics that were asked about and I am going to expand on some of them. The first one is the memory limits for different processes. This really depends on a few different things. The architecture o

MySQL-python 1.2.3 for Windows and Python 2.7, 32bit and 64bit versions -(亲测可用)

MySQL-python 1.2.3 for Windows and Python 2.7, 32bit and 64bit versions - See more at: http://www.codegood.com/archives/129#sthash.dc3d3aib.dpuf http://www.codegood.com/archives/129 http://www.codegood.com/archives/4 MySQL-python 1.2.3 for Windows an

Linux Kernel sys_call_table、Kernel Symbols Export Table Generation Principle、Difference Between System Calls Entrance In 32bit、64bit Linux(undone)

目录 1. sys_call_table:系统调用表 2. 内核符号导出表.kallsyms_lookup_name 3. Linux 32bit.64bit下系统调用入口的异同 1. sys_call_table:系统调用表 Relevant Link: 2. 内核符号导出表.kallsyms_lookup_name Relevant Link: 3. Linux 32bit.64bit下系统调用入口的异同以sys_execve.sys_socketcall.sys_init_module这