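The previous post gave a rough introduction to Python's object structure. This one looks at a very important part: memory allocation.
It seems that every C codebase I read puts a lot of work into memory handling. nginx, for example, implements its own pool, and Python is no exception.
Python splits memory allocation into four layers: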
    _____   ______   ______       ________
   [ int ] [ dict ] [ list ] ... [ string ]       Python core         |
+3 | <----- Object-specific memory -----> | <-- Non-object memory --> |
    _______________________________       |                           |
   [   Python's object allocator   ]      |                           |
+2 | ####### Object memory ####### | <------ Internal buffers ------> |
    ______________________________________________________________    |
   [          Python's raw memory allocator (PyMem_ API)          ]   |
+1 | <----- Python memory (under PyMem manager's control) ------> |   |
    __________________________________________________________________
   [    Underlying general-purpose allocator (ex: C library malloc)   ]
 0 | <------ Virtual memory allocated for the python process -------> |
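The diagram above is a comment copied straight from the source code.
(0) This is the lowest layer: memory operations at the C level, that is, malloc and free.
(1) This is a thin wrapper that Python puts over the C-level operations, for example PyMem_MALLOC and PyMem_FREE, which are very simple wrappers around malloc and free: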
//Simple macro wrappers around malloc, realloc and free; a malloc request for 0 bytes allocates 1 byte instead
#define PyMem_MALLOC(n)         ((size_t)(n) > (size_t)PY_SSIZE_T_MAX ? NULL \
                                : malloc((n) ? (n) : 1))
#define PyMem_REALLOC(p, n)     ((size_t)(n) > (size_t)PY_SSIZE_T_MAX ? NULL \
                                : realloc((p), (n) ? (n) : 1))
#define PyMem_FREE              free

#endif  /* PYMALLOC_DEBUG */

/*
 * Type-oriented memory interface
 * ==============================
 *
 * Allocate memory for n objects of the given type.  Returns a new pointer
 * or NULL if the request was too large or memory allocation failed.  Use
 * these macros rather than doing the multiplication yourself so that proper
 * overflow checking is always done.
 */
//Size the allocation from the size of the type
#define PyMem_New(type, n) \
  ( ((size_t)(n) > PY_SSIZE_T_MAX / sizeof(type)) ? NULL : \
        ( (type *) PyMem_Malloc((n) * sizeof(type)) ) )
#define PyMem_NEW(type, n) \
  ( ((size_t)(n) > PY_SSIZE_T_MAX / sizeof(type)) ? NULL : \
        ( (type *) PyMem_MALLOC((n) * sizeof(type)) ) )
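The New variants above simply add awareness of the element type's size, and check that the multiplication n * sizeof(type) cannot overflow before allocating.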
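To make the overflow check concrete, here is a minimal sketch of the same idea as a plain function. The names sketch_alloc_array and SKETCH_SSIZE_MAX are my own (hypothetical), standing in for PyMem_New and PY_SSIZE_T_MAX; this is not the CPython code, only the pattern it follows.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for PY_SSIZE_T_MAX: the largest signed size. */
#define SKETCH_SSIZE_MAX ((size_t)(SIZE_MAX >> 1))

/* Allocate room for n elements of elem_size bytes each.
 * Returns NULL when n * elem_size would exceed SKETCH_SSIZE_MAX
 * or when malloc itself fails. */
static void *sketch_alloc_array(size_t n, size_t elem_size)
{
    assert(elem_size > 0);
    if (n > SKETCH_SSIZE_MAX / elem_size)
        return NULL;                     /* the multiplication would overflow */
    size_t nbytes = n * elem_size;
    return malloc(nbytes ? nbytes : 1);  /* never ask malloc for 0 bytes */
}
```

The division happens before the multiplication, so the check itself can never overflow.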
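(2) This layer is the key part: PyObject_Malloc and PyObject_Free belong here. This is where Python implements its memory pools for efficient allocation (the source is concentrated in obmalloc.c). Layer (3) in the diagram, the object-specific allocators for types such as int, dict and list, sits on top of it.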
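A few basics about Python memory allocation:
(1) Allocations fall into two broad categories: small requests of 512 bytes or less, and requests larger than 512 bytes.
(2) Allocations are aligned to 8 bytes. A request for 12 bytes, for example, ends up occupying 16 bytes.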
 * Request in bytes     Size of allocated block      Size class idx
 * ----------------------------------------------------------------
 *        1-8                     8                       0
 *        9-16                   16                       1
 *       17-24                   24                       2
 *       25-32                   32                       3
 *       33-40                   40                       4
 *       41-48                   48                       5
 *       49-56                   56                       6
 *       57-64                   64                       7
 *       65-72                   72                       8
 *        ...                   ...                     ...
 *      497-504                 504                      62
 *      505-512                 512                      63
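The table above is also a comment copied straight from the source, and it shows Python's alignment strategy very clearly.
Next, a few very important macro definitions: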
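The table can be reproduced with two small helpers. This is a sketch under my own naming (size_class_index and index_to_size are hypothetical names); index_to_size mirrors the INDEX2SIZE macro from the source, and the index computation is my reading of how the table is derived.

```c
#include <assert.h>
#include <stddef.h>

#define ALIGNMENT               8    /* must be 2^N */
#define ALIGNMENT_SHIFT         3
#define SMALL_REQUEST_THRESHOLD 512

/* Size class index for a small request of nbytes (1..512). */
static unsigned int size_class_index(size_t nbytes)
{
    assert(nbytes >= 1 && nbytes <= SMALL_REQUEST_THRESHOLD);
    return (unsigned int)((nbytes - 1) >> ALIGNMENT_SHIFT);
}

/* Block size in bytes for a size class index (same as INDEX2SIZE). */
static unsigned int index_to_size(unsigned int idx)
{
    return (idx + 1) << ALIGNMENT_SHIFT;
}
```

For the 12-byte example, size_class_index(12) gives class 1, and index_to_size(1) gives the 16 bytes that are actually used.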
#define ALIGNMENT               8               /* must be 2^N */
#define ALIGNMENT_SHIFT         3
#define ALIGNMENT_MASK          (ALIGNMENT - 1)

/* Return the number of bytes in size class I, as a uint. */
#define INDEX2SIZE(I) (((uint)(I) + 1) << ALIGNMENT_SHIFT)

#define SMALL_REQUEST_THRESHOLD 512
#define NB_SMALL_SIZE_CLASSES   (SMALL_REQUEST_THRESHOLD / ALIGNMENT)   //==64

//Page size is 4 KB
#define SYSTEM_PAGE_SIZE        (4 * 1024)
#define SYSTEM_PAGE_SIZE_MASK   (SYSTEM_PAGE_SIZE - 1)

#define ARENA_SIZE              (256 << 10)     /* 256KB */

#ifdef WITH_MEMORY_LIMITS
#define MAX_ARENAS              (SMALL_MEMORY_LIMIT / ARENA_SIZE)
#endif

//The pool size is defined as 4 KB here
#define POOL_SIZE               SYSTEM_PAGE_SIZE        /* must be 2^N, 4 KB */
#define POOL_SIZE_MASK          SYSTEM_PAGE_SIZE_MASK   //4*1024-1
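I won't go through the macros above one by one. The main things they establish are:
(1) An arena is 256 KB (it is used to manage pools)
(2) A pool is 4 KB
Now for the fairly important pool header definition: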
//Pool header
//The szidx value tells you which block size this pool hands out
struct pool_header {
    union { block *_padding;
            uint count; } ref;          /* number of allocated blocks    */  //number of blocks currently allocated from this pool
    block *freeblock;                   /* pool's free list head         */  //points to the next available block; this forms a linked list whose nodes are scattered through the pool, which is quite interesting
    struct pool_header *nextpool;       /* next pool of this size class  */  //these two pointers link pools into a doubly-linked list
    struct pool_header *prevpool;       /* previous pool       ""        */
    uint arenaindex;                    /* index into arenas of base adr */  //index of the owning arena in the arenas array
    uint szidx;                         /* block size class index        */  //size class of the blocks: 8 bytes, 16 bytes, and so on
    uint nextoffset;                    /* bytes to virgin block         */  //offset of the next never-used block
    uint maxnextoffset;                 /* largest valid nextoffset      */  //offset of the last block from the start of the pool
};
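The comments above explain what each field is for. Note the szidx field: it corresponds to the size class index in the alignment table shown earlier. Each pool only ever hands out blocks of one fixed size. A pool with szidx 0 allocates 8-byte blocks, a pool with szidx 1 allocates 16-byte blocks, and so on. This will become clear once we read the code.
Having each pool serve a single block size makes things much simpler, especially when computing memory offsets.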
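To illustrate why a single block size makes the offset arithmetic easy, here is a small sketch. The helper names and the header_bytes parameter are my own assumptions (the real header overhead depends on the platform's pointer size), so treat the numbers as an example rather than what obmalloc.c computes.

```c
#include <assert.h>

#define POOL_SIZE (4 * 1024)

/* How many blocks of block_size bytes fit in one pool after the header. */
static unsigned int blocks_per_pool(unsigned int header_bytes,
                                    unsigned int block_size)
{
    assert(block_size > 0 && header_bytes <= POOL_SIZE);
    return (POOL_SIZE - header_bytes) / block_size;
}

/* Offset of block number i from the start of the pool. */
static unsigned int block_offset(unsigned int header_bytes,
                                 unsigned int block_size,
                                 unsigned int i)
{
    return header_bytes + i * block_size;
}
```

Assuming a 48-byte header, a pool of 8-byte blocks holds 506 blocks, and block i is always at a fixed, directly computable offset.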
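When Python allocates small memory, it does so in units of blocks. In a pool with szidx 0, for example, each block is 8 bytes. The freeblock pointer threads the free blocks into a singly-linked list whose nodes are scattered through the pool (this implementation is quite a trick, and it took me a long time to understand it).
(In fact the memory pool implementation contains many more tricks, and reading them really burns brain cells. I can only blame my own shallow knowledge in this area for needing so long to understand them.)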
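The free-list trick can be sketched in a few lines: a free block has no other use, so its first bytes can hold the pointer to the next free block, and the list needs no extra memory. The struct sketch_pool and both functions below are hypothetical simplifications of mine, not the code in obmalloc.c.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned char block;

/* Hypothetical cut-down pool: only the free-list head is kept. */
struct sketch_pool {
    block *freeblock;   /* head of the free list, or NULL when empty */
};

/* Give a block back: store the old head inside the block itself.
 * The block must be large and aligned enough to hold a pointer. */
static void sketch_push(struct sketch_pool *pool, block *p)
{
    assert(p != NULL);
    *(block **)p = pool->freeblock;
    pool->freeblock = p;
}

/* Take the most recently freed block, or NULL if none is free. */
static block *sketch_pop(struct sketch_pool *pool)
{
    block *p = pool->freeblock;
    if (p != NULL)
        pool->freeblock = *(block **)p;  /* next pointer lives in the block */
    return p;
}
```

Blocks come back in last-freed, first-reused order, and the list can jump around the pool in any order, which is why it looks scattered.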
//Think of this as the structure that manages pools
struct arena_object {
    /* The address of the arena, as returned by malloc.  Note that 0
     * will never be returned by a successful malloc, and is used
     * here to mark an arena_object that doesn't correspond to an
     * allocated arena.
     */
    uptr address;   //start address of the allocated 256 KB; 0 means nothing has been allocated yet

    /* Pool-aligned pointer to the next pool to be carved off. */
    block* pool_address;

    /* The number of available pools in the arena:  free pools + never-
     * allocated pools.
     */
    uint nfreepools;   //number of available pools

    /* The total number of pools in the arena, whether or not available. */
    uint ntotalpools;   //total number of pools in this arena

    /* Singly-linked list of available pools. */
    struct pool_header* freepools;   //head of the pool list

    /* Whenever this arena_object is not associated with an allocated
     * arena, the nextarena member is used to link all unassociated
     * arena_objects in the singly-linked `unused_arena_objects` list.
     * The prevarena member is unused in this case.
     *
     * When this arena_object is associated with an allocated arena
     * with at least one available pool, both members are used in the
     * doubly-linked `usable_arenas` list, which is maintained in
     * increasing order of `nfreepools` values.
     *
     * Else this arena_object is associated with an allocated arena
     * all of whose pools are in use.  `nextarena` and `prevarena`
     * are both meaningless in this case.
     */
    struct arena_object* nextarena;
    struct arena_object* prevarena;
};
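This is the other very important structure. Think of it as the manager of pools: its address field points to an allocated 256 KB region, and pools are carved out of that region.
Next, let's look at how an arena is created: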
//Allocate an arena_object; these are in fact cached as well
static struct arena_object*
new_arena(void)
{
    struct arena_object* arenaobj;   //pointer to the arena object we will return
    uint excess;        /* number of bytes above pool alignment */
    void *address;      //for a newly created arena, points to the requested 256 KB
    int err;

#ifdef PYMALLOC_DEBUG
    if (Py_GETENV("PYTHONMALLOCSTATS"))
        _PyObject_DebugMallocStats();
#endif
    if (unused_arena_objects == NULL) {   //no unused arena object is available, so create some
        uint i;
        uint numarenas;
        size_t nbytes;

        /* Double the number of arena objects on each allocation.
         * Note that it's possible for `numarenas` to overflow.
         */
        //maxarenas starts at 0, so the first call creates 16 arena objects at once; after that the count doubles
        numarenas = maxarenas ? maxarenas << 1 : INITIAL_ARENA_OBJECTS;   //INITIAL_ARENA_OBJECTS=16
        if (numarenas <= maxarenas)   //only reachable if the doubling overflowed the integer
            return NULL;                /* overflow */
#if SIZEOF_SIZE_T <= SIZEOF_INT
        if (numarenas > PY_SIZE_MAX / sizeof(*arenas))
            return NULL;                /* overflow */
#endif
        nbytes = numarenas * sizeof(*arenas);   //memory needed for the array of arena_object structs
        arenaobj = (struct arena_object *)realloc(arenas, nbytes);   //grow the array
        if (arenaobj == NULL)
            return NULL;
        arenas = arenaobj;

        /* We might need to fix pointers that were copied.  However,
         * new_arena only gets called when all the pages in the
         * previous arenas are full.  Thus, there are *no* pointers
         * into the old array. Thus, we don't have to worry about
         * invalid pointers.  Just to be sure, some asserts:
         */
        assert(usable_arenas == NULL);
        assert(unused_arena_objects == NULL);

        /* Put the new arenas on the unused_arena_objects list. */
        //this initializes the arena_object structs that were just created
        for (i = maxarenas; i < numarenas; ++i) {
            arenas[i].address = 0;   //address 0 marks an arena object with no memory attached
            arenas[i].nextarena = i < numarenas - 1 ?
                                   &arenas[i+1] : NULL;
        }

        unused_arena_objects = &arenas[maxarenas];   //point unused_arena_objects at the first new entry
        maxarenas = numarenas;
    }

    /* Take the next available arena object off the head of the list. */
    assert(unused_arena_objects != NULL);
    arenaobj = unused_arena_objects;
    unused_arena_objects = arenaobj->nextarena;   //advance unused_arena_objects to the next arena object
    assert(arenaobj->address == 0);
    //now allocate the data memory
#ifdef ARENAS_USE_MMAP
    address = mmap(NULL, ARENA_SIZE, PROT_READ|PROT_WRITE,
                   MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
    err = (address == MAP_FAILED);
#else
    address = malloc(ARENA_SIZE);   //allocate ARENA_SIZE, i.e. 256 KB
    err = (address == 0);
#endif
    if (err) {
        /* The allocation failed: return NULL after putting the
         * arenaobj back.
         */
        arenaobj->nextarena = unused_arena_objects;
        unused_arena_objects = arenaobj;
        return NULL;
    }
    //store the allocated address in the arena object
    arenaobj->address = (uptr)address;

    //update the counters
    ++narenas_currently_allocated;
#ifdef PYMALLOC_DEBUG
    ++ntimes_arena_allocated;
    if (narenas_currently_allocated > narenas_highwater)
        narenas_highwater = narenas_currently_allocated;
#endif
    arenaobj->freepools = NULL;   //the free pool list starts out empty
    /* pool_address <- first pool-aligned address in the arena
       nfreepools <- number of whole pools that fit after alignment */
    arenaobj->pool_address = (block*)arenaobj->address;
    arenaobj->nfreepools = ARENA_SIZE / POOL_SIZE;   //64 pools; this happens to equal the number of small size classes, but pools are not tied one-to-one to size classes
    assert(POOL_SIZE * arenaobj->nfreepools == ARENA_SIZE);
    //align so that pool_address ends up a multiple of 4 KB, which simplifies later address arithmetic
    excess = (uint)(arenaobj->address & POOL_SIZE_MASK);
    if (excess != 0) {   //the address is not a multiple of 4 KB, so the pool address must be aligned
        --arenaobj->nfreepools;   //one fewer usable pool
        arenaobj->pool_address += POOL_SIZE - excess;
    }
    arenaobj->ntotalpools = arenaobj->nfreepools;   //total number of pools in this arena

    return arenaobj;
}
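The comments in the code should make this fairly clear: take an arena object, allocate the memory, and point the address field at it.
Then initialize the number of pools that can be carved out, along with the pool-aligned starting address.
That's it for today. It's late, and going to work really is more tiring than being at school.
Tomorrow I'll analyze the two most important functions, PyObject_Malloc and PyObject_Free.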