Article 1: Loading Win32/64 DLLs "manually" without LoadLibrary()
The most important steps of DLL loading are:
- Mapping or loading the DLL into memory.
- Relocating offsets in the DLL using the relocation table of the DLL (if present).
- Resolving the dependencies of the DLL: loading the other DLLs it needs and resolving the addresses of the imported functions.
- Calling its entry point (if present) with the DLL_PROCESS_ATTACH parameter.
I wrote code that performs these steps, but quickly found a problem: a DLL loaded this way doesn't have a valid HMODULE/HINSTANCE handle, and many Windows functions expect you to specify one (for example GetProcAddress(), CreateDialog(), and so on). The HINSTANCE handle of a module is nothing more than the address of the DOS/PE header of the loaded DLL in memory.
I tried to pass this address to those functions, but it didn't work, because Windows checks whether the value is really a module handle instead of just looking at the contents of memory. This makes manually loaded DLLs a bit harder to use.
I had to write my own GetProcAddress() because the Windows version didn't work with my DLLs.
Later I wanted to use dialog resources in the DLL, and CreateDialog() also requires a module handle to get the dialog resources from the DLL. For this reason I wrote a custom FindResource() function that works with manually loaded DLLs; it can find dialog resources that can then be passed to CreateDialogIndirect(). You can use other types of resources in manually loaded DLLs as well, if you find a function for that resource type that cooperates with FindResource(). This tip contains the code for the manual DLL loader and GetProcAddress(); the resource-related functions are posted in another tip.
Limitations
- The loaded DLL doesn't have an HMODULE, which makes life harder, especially when it comes to resources.
- DllMain() doesn't receive DLL_THREAD_ATTACH and DLL_THREAD_DETACH notifications. You could simulate this by creating a small DLL that you load with the normal LoadLibrary(); from the DllMain() of this normally loaded DLL you could call the entry point of your manually loaded DLLs on DLL_THREAD_ATTACH/DLL_THREAD_DETACH.
- If your DLL imports other DLLs, those DLLs are loaded with the WinAPI LoadLibrary(). This is not really a limitation, it is mentioned only for your information. It would be pointless to load, for example, kernel32.dll manually; most system DLLs would probably malfunction or crash.
- DLLs that make use of SEH *may* fail. SEH-related code in the DLL is not a problem by itself, but the __try blocks in the loaded DLL won't be able to catch exceptions, because ntdll.dll!RtlIsValidHandler() doesn't accept exception handler routines from the memory area of our manually loaded DLL (this memory area isn't mapped from a PE file). This is a problem only if an exception is raised inside a __try block of the DLL: Windows can't run the DLL's exception handler and raises another exception that escapes it, and the result is usually a crash.
- Whether the CRT works with manual DLL loading depends on several things: the actual CRT version you are using and the functions you call from it. If you use just a few simple functions (like printf), the CRT may work. I wrote my DLLs with the /NODEFAULTLIB linker option, which means you can't reach CRT functions and which reduces the DLL size considerably (as with 4K intros). But then you have to go with pure WinAPI! This can be quite inconvenient, but you can overcome it by writing your own mini CRT. I've provided one such mini CRT in my C++ example; it makes no attempt to be comprehensive, but it at least lets you use the most basic C++ features: automatically initialized static variables and the new/delete operators. If you are about to use this code, you should understand most of these problems, and you should appreciate that writing a C/C++ DLL without the CRT is still much more convenient than writing something like an offset-independent or relocatable assembly patch.
Source code walkthrough:
- TestDLL
It defines a struct type DLLInterface that holds two function pointers, AddNumbers and MyMessageBox:
typedef struct DLLInterface
{
    int (*AddNumbers)(int a, int b);
    void (*MyMessageBox)(const char* message);
} DLLInterface;
The corresponding .cpp file defines the two functions:
int AddNumbers(int a, int b)
{
    printf("DLL: AddNumbers(%d, %d)\n", a, b);
    return a + b;
}

void MyMessageBox(const char* message)
{
    printf("DLL: MyMessageBox(\"%s\")\n", message);
    MessageBoxA(NULL, message, "DLL MessageBox!", MB_OK);
}
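The two functions are then stored as pointers in an instance of the struct defined above:
DLLInterface g_Interface =
{
    AddNumbers,
    MyMessageBox
};
The accessor function is placed in the DLL's export table; __declspec(dllexport) marks it as callable from programs outside the module that contains it:
__declspec(dllexport) const DLLInterface* GetDLLInterface()
{
    return &g_Interface;
}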
DllMain:
BOOL WINAPI DllMain(HINSTANCE hinstDLL, DWORD fdwReason, LPVOID lpvReserved)
{
    // With manual DLL loading you can not use DLL_THREAD_ATTACH and DLL_THREAD_DETACH.
    switch (fdwReason)
    {
    case DLL_PROCESS_ATTACH:
        printf("DLL: DLL_PROCESS_ATTACH\n");
        // TODO
        break;
    case DLL_PROCESS_DETACH:
        printf("DLL: DLL_PROCESS_DETACH\n");
        // TODO
        break;
    default:
        break;
    }
    return TRUE;
}
- LoadDLL:
This emulates loading a DLL (LoadLibrary):
MODULE_HANDLE LoadModule(const char* dll_path)
{
    LOAD_DLL_INFO* p = new LOAD_DLL_INFO;
    DWORD res = LoadDLLFromFileName(dll_path, 0, p);
    if (res != ELoadDLLResult_OK)
    {
        delete p;
        return NULL;
    }
    return p;
}
LoadDLLFromFileName calls:
ELoadDLLResult LoadDLLFromFileNameOffset(const char* filename, size_t dll_offset, size_t dll_size, int flags, LOAD_DLL_INFO* info)
{
    ELoadDLLResult res;
    FILE* f = fopen(filename, "rb"); // filename is the path of the test DLL
    if (!f)
        return ELoadDLLResult_DLLFileNotFound;
    res = LoadDLLFromCFile(f, dll_offset, dll_size, flags, info);
    fclose(f);
    return res;
}
Now look at LoadDLLFromCFile, which deserves some explanation:
ELoadDLLResult LoadDLLFromCFile(FILE* f, size_t dll_offset, size_t dll_size, int flags, LOAD_DLL_INFO* info)
{
    LOAD_DLL_FROM_FILE_STRUCT ldffs = { f, dll_offset, dll_size };
    return LoadDLL((LOAD_DLL_READPROC)&LoadDLLFromFileCallback, &ldffs, flags, info);
}
LOAD_DLL_READPROC is a function pointer type, defined as follows:
// LOAD_DLL_READPROC is a function pointer type.
typedef BOOL (*LOAD_DLL_READPROC)(void* buff, size_t position, size_t size, void* param);
LoadDLLFromFileCallback is passed to LoadDLL as an argument. The name "LoadDLLFromFileCallback" makes the intent obvious: this is a callback function.
The source of this callback is:
static BOOL LoadDLLFromFileCallback(void* buff, size_t position, size_t size, LOAD_DLL_FROM_FILE_STRUCT* param)
{
    if (!size)
        return TRUE;
    if ((position + size) > param->dll_size)
        return FALSE;
    fseek(param->f, param->dll_offset + position, SEEK_SET);
    // Read the data into memory with fread; this is what performs the "load".
    return fread(buff, 1, size, param->f) == size;
}
Put simply, a callback is called a callback because it exists to be handed over as an argument and called by other code, not to call other functions itself. Here LoadDLL takes a function pointer of type LOAD_DLL_READPROC as a parameter; LoadDLLFromFileCallback is passed in (after a cast to that type) so that LoadDLL can use it. This is not the only callback; there is also the following one:
static BOOL LoadDLLFromMemoryCallback(void* buff, size_t position, size_t size, LOAD_DLL_FROM_MEMORY_STRUCT* param)
{
    if (!size)
        return TRUE;
    if ((position + size) > param->dll_size)
        return FALSE;
    // Copy `size` bytes starting at `position` into buff.
    memcpy(buff, (char*)param->dll_data + position, size);
    return TRUE;
}
LoadDLLFromMemoryCallback and LoadDLLFromFileCallback are both callbacks with the same shape, and both exist to be called by other functions; LoadDLL simply calls whichever one it was given, depending on where the DLL data comes from. Compared with LoadDLLFromCFile, LoadDLLFromMemory only differs in the callback it passes to LoadDLL:
DWORD LoadDLLFromMemory(const void* dll_data, size_t dll_size, int flags, LOAD_DLL_INFO* info)
{
    LOAD_DLL_FROM_MEMORY_STRUCT ldfms = { dll_data, dll_size };
    return LoadDLL((LOAD_DLL_READPROC)&LoadDLLFromMemoryCallback, &ldfms, flags, info);
}
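Both callbacks above test `(position + size) > param->dll_size`, an addition that can wrap around for very large arguments, and both are cast to LOAD_DLL_READPROC even though their last parameter has a different pointer type. The following is a minimal, portable sketch of how a memory-backed read callback could avoid both issues. It is not the article's code: `MemSource`, `ReadProc` and `read_from_memory` are hypothetical names, and `int` stands in for `BOOL` so the sketch compiles without windows.h.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for LOAD_DLL_FROM_MEMORY_STRUCT. */
typedef struct {
    const void *data;
    size_t      size;
} MemSource;

/* Same shape as LOAD_DLL_READPROC, with int instead of BOOL. */
typedef int (*ReadProc)(void *buff, size_t position, size_t size, void *param);

/* Copies `size` bytes starting at `position` into `buff`.
   Returns 1 on success, 0 if the request falls outside the source.
   The range check is written so that position + size is never computed
   and therefore cannot wrap around. */
int read_from_memory(void *buff, size_t position, size_t size, void *param)
{
    const MemSource *src = (const MemSource *)param;
    if (size == 0)
        return 1;
    assert(buff != NULL && src != NULL);
    if (position > src->size || size > src->size - position)
        return 0;
    memcpy(buff, (const char *)src->data + position, size);
    return 1;
}
```

Because the callback takes `void *param` directly, it can be assigned to a `ReadProc` variable without a function pointer cast.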
Source of the LoadDLL function:
This is probably the key part, and there is a lot in it worth studying.
ELoadDLLResult LoadDLL(LOAD_DLL_READPROC read_proc, void* read_proc_param, int flags, LOAD_DLL_INFO* info)
{
    LOAD_DLL_CONTEXT ctx;
    ELoadDLLResult res;
    BOOL finished_successfully = FALSE;
    unsigned i;

    if (!read_proc)
        return ELoadDLLResult_WrongFunctionParameters;

    ctx.sect = NULL;
    ctx.loaded_import_modules_array = NULL;
    ctx.import_modules_array_capacity = 0;
    ctx.num_import_modules = 0;
    ctx.dll_main = NULL;

    __try
    {
        __try
        {
            // Load the DOS header, the PE header and the section table.
            res = LoadDLL_LoadHeaders(&ctx, read_proc, read_proc_param);
            if (res != ELoadDLLResult_OK)
                return res;

            // Allocate a block of virtual memory large enough to hold the sections.
            res = LoadDLL_AllocateMemory(&ctx, flags);
            if (res != ELoadDLLResult_OK)
                return res;

            __try
            {
                // Load the sections and the content that precedes them.
                res = LoadDLL_LoadSections(&ctx, read_proc, read_proc_param, flags);
                if (res != ELoadDLLResult_OK)
                    return res;

                res = LoadDLL_PerformRelocation(&ctx);
                if (res != ELoadDLLResult_OK)
                    return res;

                // Fill in the IAT.
                res = LoadDLL_ResolveImports(&ctx);
                if (res != ELoadDLLResult_OK)
                    return res;

                res = LoadDLL_SetSectionMemoryProtection(&ctx);
                if (res != ELoadDLLResult_OK)
                    return res;

                res = LoadDLL_CallDLLEntryPoint(&ctx, flags);
                if (res != ELoadDLLResult_OK)
                    return res;

                /* We finished!!! :) Filling in the callers info structure... */
                if (info)
                {
                    __try
                    {
                        info->size = sizeof(*info);
                        info->flags = flags;
                        info->image_base = ctx.image_base;
                        info->mem_block = ctx.image;
                        info->dll_main = ctx.dll_main;
                        info->export_dir_rva = ctx.hdr.OptionalHeader.DataDirectory[IMAGE_DIRECTORY_ENTRY_EXPORT].VirtualAddress;
                        info->loaded_import_modules_array = ctx.loaded_import_modules_array;
                        info->num_import_modules = ctx.num_import_modules;
                    }
                    __except (EXCEPTION_EXECUTE_HANDLER)
                    {
                        return ELoadDLLResult_WrongFunctionParameters;
                    }
                }

                finished_successfully = TRUE;
                return ELoadDLLResult_OK;
            }
            __finally
            {
                if (!finished_successfully)
                    VirtualFree(ctx.image, 0, MEM_RELEASE);
                if (!finished_successfully || !info)
                {
                    if (ctx.loaded_import_modules_array)
                    {
                        for (i=0; i<ctx.num_import_modules; ++i)
                            FreeLibrary(ctx.loaded_import_modules_array[i]);
                        free(ctx.loaded_import_modules_array);
                    }
                }
            }
        }
        __finally
        {
            if (ctx.sect)
                free(ctx.sect);
        }
    }
    __except (EXCEPTION_EXECUTE_HANDLER)
    {
        return ELoadDLLResult_UnknownError;
    }
}
The emulated FreeLibrary:
bool UnloadModule(MODULE_HANDLE handle)
{
    bool res = FALSE != UnloadDLL(handle);
    delete handle;
    return res;
}

BOOL UnloadDLL(LOAD_DLL_INFO* info)
{
    unsigned i;
    BOOL res = TRUE;
    __try
    {
        if (!info || info->size!=sizeof(*info) || !info->image_base || !info->mem_block)
            return FALSE;

        if (info->loaded_import_modules_array) // holds the HMODULEs of the imported DLLs
        {
            for (i=0; i<info->num_import_modules; ++i)
                FreeLibrary(info->loaded_import_modules_array[i]);
            free(info->loaded_import_modules_array);
        }

        if (!(info->flags & ELoadDLLFlag_NoEntryCall) && info->dll_main)
        {
            __try
            {
                // Run the detach notification of DllMain.
                res = info->dll_main(info->image_base, DLL_PROCESS_DETACH, NULL);
            }
            __except (EXCEPTION_EXECUTE_HANDLER)
            {
                res = FALSE;
            }
        }

        VirtualFree(info->mem_block, 0, MEM_RELEASE);
        return res;
    }
    __except (EXCEPTION_EXECUTE_HANDLER)
    {
        return FALSE;
    }
}
The emulated GetProcAddress:
FARPROC MyGetProcAddress(HMODULE module, const char* func_name)
{
    IMAGE_NT_HEADERS* hdr;
    __try
    {
        if (((IMAGE_DOS_HEADER*)module)->e_magic != IMAGE_DOS_SIGNATURE)
            return NULL;
        hdr = (IMAGE_NT_HEADERS*)((DWORD_PTR)module + ((IMAGE_DOS_HEADER*)module)->e_lfanew);
        if (hdr->Signature != IMAGE_NT_SIGNATURE || hdr->OptionalHeader.Magic != IMAGE_NT_OPTIONAL_HDR_MAGIC)
            return NULL;
        return MyGetProcAddress_ExportDir(
            hdr->OptionalHeader.DataDirectory[IMAGE_DIRECTORY_ENTRY_EXPORT].VirtualAddress,
            (DWORD_PTR)module,
            func_name
            );
    }
    __except (EXCEPTION_EXECUTE_HANDLER)
    {
        return NULL;
    }
}
It calls MyGetProcAddress_ExportDir, which simply looks up the address of the function in the export directory:
FARPROC MyGetProcAddress_ExportDir(DWORD export_dir_rva, DWORD_PTR image_base, const char* func_name)
{
    IMAGE_EXPORT_DIRECTORY* exp;
    DWORD_PTR ord;
    DWORD i;

    if (!export_dir_rva)
        return NULL;
    exp = (IMAGE_EXPORT_DIRECTORY*)(image_base + export_dir_rva);
    ord = (DWORD_PTR)func_name;

    __try
    {
        if (ord < 0x10000)
        {
            /* Search for ordinal. */
            if (ord < exp->Base)
                return NULL;
            ord -= exp->Base;
        }
        else
        {
            /* Search for name. */
            for (i=0; i<exp->NumberOfNames; ++i)
            {
                if ( !strcmp( (char*)(((DWORD*)(exp->AddressOfNames + image_base))[i] + image_base), func_name) )
                {
                    ord = ((WORD*)(exp->AddressOfNameOrdinals + image_base))[i];
                    break;
                }
            }
        }

        if (ord >= exp->NumberOfFunctions)
            return NULL;
        return (FARPROC)(((DWORD*)(exp->AddressOfFunctions + image_base))[ord] + image_base);
    }
    __except (EXCEPTION_EXECUTE_HANDLER)
    {
        return NULL;
    }
}
Source code download and description:
http://www.codeproject.com/Tips/430684/Loading-Win-DLLs-manually-without-LoadLibrary
Below is "Loading a DLL from memory", written by a German author
This tutorial describes a technique for loading a dynamic link library (DLL) from memory without storing it on the hard disk first.
To emulate the PE loader, we must first understand which steps are necessary to load the file into memory and prepare the structures so they can be called from other programs.
When issuing the API call LoadLibrary, Windows basically performs these tasks:
- Open the given file and check the DOS and PE headers.
- Try to allocate a memory block of PEHeader.OptionalHeader.SizeOfImage bytes at position PEHeader.OptionalHeader.ImageBase.
- Parse section headers and copy sections to their addresses. The destination address for each section, relative to the base of the allocated memory block, is stored in the VirtualAddress attribute of the IMAGE_SECTION_HEADER structure.
- If the allocated memory block differs from ImageBase, various references in the code and/or data sections must be adjusted. This is called Base relocation.
- The required imports for the library must be resolved by loading the corresponding libraries.
- The memory regions of the different sections must be protected depending on the section's characteristics. Some sections are marked as discardable and therefore can be safely freed at this point. These sections normally contain temporary data that is only needed during the import, like the information for the base relocation.
- Now the library is loaded completely. It must be notified about this by calling the entry point using the flag DLL_PROCESS_ATTACH.
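The first step in the list, checking the DOS and PE headers, can be sketched in portable C on a raw byte buffer. This is an illustration under stated assumptions, not the loader's actual code: the function and macro names are hypothetical, while the constant values mirror IMAGE_DOS_SIGNATURE, IMAGE_NT_SIGNATURE and the offset of e_lfanew in IMAGE_DOS_HEADER from winnt.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DOS_SIGNATURE   0x5A4Du      /* "MZ", same value as IMAGE_DOS_SIGNATURE */
#define NT_SIGNATURE    0x00004550u  /* "PE\0\0", same value as IMAGE_NT_SIGNATURE */
#define E_LFANEW_OFFSET 0x3C         /* offset of e_lfanew in IMAGE_DOS_HEADER */

/* Little-endian reads that do not depend on host alignment or byte order. */
static uint16_t hdr_u16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t hdr_u32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns the file offset of the PE header, or 0 if the buffer does not
   start with a plausible DOS header that points at a PE signature. */
size_t find_pe_header(const uint8_t *file, size_t file_size)
{
    uint32_t e_lfanew;
    assert(file != NULL);
    if (file_size < E_LFANEW_OFFSET + 4)
        return 0;
    if (hdr_u16(file) != DOS_SIGNATURE)
        return 0;
    e_lfanew = hdr_u32(file + E_LFANEW_OFFSET);
    if (e_lfanew > file_size || file_size - e_lfanew < 4)
        return 0;
    if (hdr_u32(file + e_lfanew) != NT_SIGNATURE)
        return 0;
    return e_lfanew;
}
```

A real loader would go on to validate the optional header (for example its Magic field) before trusting SizeOfImage or ImageBase.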
Allocate memory
All memory required for the library must be reserved / allocated using VirtualAlloc, as Windows provides functions to protect these memory blocks. This is required to restrict access to the memory, like blocking write access to the code or constant data.
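The OptionalHeader structure defines the size of the required memory block for the library. It must be reserved at the address specified by ImageBase if possible, by passing ImageBase as the requested address and MEM_RESERVE as the allocation type to VirtualAlloc.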
If the reserved memory differs from the address given in ImageBase, base relocation as described below must be done.
Copy sections
Once the memory has been reserved, the file contents can be copied into it. The section headers must be evaluated in order to determine the position in the file and the target area in memory.
Before copying the data, the memory block must be committed (VirtualAlloc with MEM_COMMIT).
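Leaving the Windows-specific commit call aside, the copy step itself can be sketched in portable C. The sketch below is an assumption-laden illustration: `SectionInfo` is a hypothetical, trimmed-down struct carrying only the IMAGE_SECTION_HEADER fields this step needs (it is not the winnt.h layout), and copying at most VirtualSize bytes is a design choice of the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical subset of IMAGE_SECTION_HEADER. */
typedef struct {
    uint32_t VirtualAddress;    /* destination offset inside the image */
    uint32_t VirtualSize;       /* size of the section in memory */
    uint32_t PointerToRawData;  /* source offset inside the file */
    uint32_t SizeOfRawData;     /* number of bytes stored in the file */
} SectionInfo;

/* Copies every section into `image`, which is assumed to be a zero-filled
   block of SizeOfImage bytes. Returns 1 on success, 0 if a section points
   outside the file or outside the image. */
int copy_sections(uint8_t *image, size_t image_size,
                  const uint8_t *file, size_t file_size,
                  const SectionInfo *sect, size_t count)
{
    size_t i;
    assert(image != NULL && file != NULL && (sect != NULL || count == 0));
    for (i = 0; i < count; ++i) {
        /* Raw data beyond VirtualSize is only file-alignment padding. */
        uint32_t n = sect[i].SizeOfRawData;
        if (sect[i].VirtualSize != 0 && sect[i].VirtualSize < n)
            n = sect[i].VirtualSize;
        if (n == 0)
            continue;  /* e.g. uninitialized data: stays zero-filled */
        if (sect[i].PointerToRawData > file_size ||
            n > file_size - sect[i].PointerToRawData)
            return 0;
        if (sect[i].VirtualAddress > image_size ||
            n > image_size - sect[i].VirtualAddress)
            return 0;
        memcpy(image + sect[i].VirtualAddress,
               file + sect[i].PointerToRawData, n);
    }
    return 1;
}
```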
Base relocation
All memory addresses in the code / data sections of a library are stored relative to the address defined by ImageBase in the OptionalHeader. If the library can't be loaded at this memory address, the references must be adjusted, i.e. relocated. The file format helps with this by storing information about all these references in the base relocation table, which can be found in directory entry 5 of the DataDirectory in the OptionalHeader.
This table consists of a series of IMAGE_BASE_RELOCATION structures, each a header with the fields VirtualAddress and SizeOfBlock followed by the entries of that block.
Each block contains (SizeOfBlock - IMAGE_SIZEOF_BASE_RELOCATION) / 2 entries of 16 bits each. The upper 4 bits define the type of relocation, the lower 12 bits define the offset relative to the VirtualAddress.
The only types that seem to be used in DLLs are
- IMAGE_REL_BASED_ABSOLUTE: No-operation relocation. Used for padding.
- IMAGE_REL_BASED_HIGHLOW: Add the delta between the ImageBase and the allocated memory block to the 32 bits found at the offset.
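To make the entry format concrete, here is a small portable sketch that applies a single relocation block to an image held in a byte buffer. It is an illustration, not the tutorial's code: the function name and parameters are hypothetical, the block is parsed byte by byte instead of through the winnt.h struct, and only the two types listed above are handled. 64-bit images use IMAGE_REL_BASED_DIR64 instead, which this sketch rejects.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Type values mirror IMAGE_REL_BASED_ABSOLUTE / IMAGE_REL_BASED_HIGHLOW. */
#define REL_BASED_ABSOLUTE 0
#define REL_BASED_HIGHLOW  3

static uint32_t rel_get_u32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

static void rel_put_u32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)v;
    p[1] = (uint8_t)(v >> 8);
    p[2] = (uint8_t)(v >> 16);
    p[3] = (uint8_t)(v >> 24);
}

/* `block` points at the 8-byte header (VirtualAddress, SizeOfBlock) followed
   by the 16-bit entries; `avail` is the number of bytes left in the table.
   `delta` is (actual base - ImageBase), computed modulo 2^32.
   Returns 1 on success, 0 on a malformed or unsupported entry. */
int apply_reloc_block(uint8_t *image, size_t image_size,
                      const uint8_t *block, size_t avail, uint32_t delta)
{
    uint32_t page_rva, block_size, count, i;
    assert(image != NULL && block != NULL);
    if (avail < 8)
        return 0;
    page_rva   = rel_get_u32(block);
    block_size = rel_get_u32(block + 4);
    if (block_size < 8 || block_size > avail || page_rva > image_size)
        return 0;
    count = (block_size - 8) / 2;
    for (i = 0; i < count; ++i) {
        const uint8_t *e = block + 8 + 2 * (size_t)i;
        uint16_t entry  = (uint16_t)(e[0] | (e[1] << 8));
        unsigned type   = entry >> 12;      /* upper 4 bits */
        uint32_t offset = entry & 0x0FFFu;  /* lower 12 bits */
        uint8_t *target;
        if (type == REL_BASED_ABSOLUTE)
            continue;                       /* padding entry */
        if (type != REL_BASED_HIGHLOW)
            return 0;
        if (image_size - page_rva < (size_t)offset + 4)
            return 0;
        target = image + page_rva + offset;
        rel_put_u32(target, rel_get_u32(target) + delta);
    }
    return 1;
}
```

A full implementation would call this in a loop, advancing by SizeOfBlock until the end of the relocation directory is reached.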
Resolve imports
Directory entry 1 of the DataDirectory in the OptionalHeader specifies a list of libraries to import symbols from. Each entry in this list is an IMAGE_IMPORT_DESCRIPTOR structure.
The Name entry describes the offset to the NULL-terminated string of the library name (e.g. KERNEL32.DLL). The OriginalFirstThunk entry points to a list of references to the function names to import from the external library. FirstThunk points to a list of addresses that gets filled with pointers to the imported symbols.
When we resolve the imports, we walk both lists in parallel, import the function defined by the name in the first list and store the pointer to the symbol in the second list.
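The parallel walk can be sketched in portable C as follows. This is a simplified model for a 32-bit (PE32) image, not the tutorial's code: `ImportDescriptor` mirrors the field order of IMAGE_IMPORT_DESCRIPTOR, while `ResolveProc`, `resolve_thunks` and `toy_resolve` are hypothetical names. The resolver callback stands in for GetProcAddress on a library that has already been loaded, and the sketch does no bounds checking.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ORDINAL_FLAG32 0x80000000u  /* same value as IMAGE_ORDINAL_FLAG32 */

/* Field order mirrors IMAGE_IMPORT_DESCRIPTOR; all fields are RVAs or plain
   32-bit values. The descriptor list ends with an all-zero entry. */
typedef struct {
    uint32_t OriginalFirstThunk;  /* RVA of the name list */
    uint32_t TimeDateStamp;
    uint32_t ForwarderChain;
    uint32_t Name;                /* RVA of the library name */
    uint32_t FirstThunk;          /* RVA of the address list */
} ImportDescriptor;

/* Hypothetical resolver: `name` is NULL when importing by ordinal.
   Returns the address of the symbol, or 0 on failure. */
typedef uint32_t (*ResolveProc)(const char *name, uint16_t ordinal, void *user);

/* Walks the name list and the address list of one imported library in
   parallel. Both are arrays of 32-bit values terminated by 0.
   Returns 1 on success, 0 if a symbol could not be resolved. */
int resolve_thunks(const uint8_t *image, const uint32_t *names,
                   uint32_t *addrs, ResolveProc resolve, void *user)
{
    assert(image != NULL && names != NULL && addrs != NULL && resolve != NULL);
    for (; *names != 0; ++names, ++addrs) {
        uint32_t addr;
        if (*names & ORDINAL_FLAG32) {
            addr = resolve(NULL, (uint16_t)(*names & 0xFFFFu), user);
        } else {
            /* RVA of IMAGE_IMPORT_BY_NAME: a 16-bit hint, then the name. */
            addr = resolve((const char *)image + *names + 2, 0, user);
        }
        if (addr == 0)
            return 0;
        *addrs = addr;
    }
    return 1;
}

/* Toy resolver for demonstration: knows one name and one ordinal. */
uint32_t toy_resolve(const char *name, uint16_t ordinal, void *user)
{
    (void)user;
    if (name != NULL)
        return strcmp(name, "AddNumbers") == 0 ? 0x1000u : 0;
    return ordinal == 7 ? 0x2000u : 0;
}
```

In a real loader the outer loop iterates over the descriptors, loads each library named by the Name field, and passes its handle to the resolver through `user`.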
Protect memory
Every section specifies permission flags in its Characteristics entry. These flags can be one or a combination of
- IMAGE_SCN_MEM_EXECUTE: The section contains data that can be executed.
- IMAGE_SCN_MEM_READ: The section contains data that is readable.
- IMAGE_SCN_MEM_WRITE: The section contains data that is writeable.
These flags must get mapped to the protection flags
- PAGE_NOACCESS
- PAGE_WRITECOPY
- PAGE_READONLY
- PAGE_READWRITE
- PAGE_EXECUTE
- PAGE_EXECUTE_WRITECOPY
- PAGE_EXECUTE_READ
- PAGE_EXECUTE_READWRITE
Now, the function VirtualProtect can be used to limit access to the memory. If the program tries to access it in an unauthorized way, an exception gets raised by Windows.
In addition to the section flags above, the following can be added:
- IMAGE_SCN_MEM_DISCARDABLE: The data in this section can be freed after the import. Usually this is specified for relocation data.
- IMAGE_SCN_MEM_NOT_CACHED: The data in this section must not get cached by Windows. Add the bit flag PAGE_NOCACHE to the protection flags above.
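One way to express the mapping from section flags to protection flags is a small lookup table indexed by the three permission bits. The sketch below is portable C written for illustration: the macro values mirror the winnt.h constants named above, but the macros carry shortened, hypothetical names (SCN_* and PG_*) so that they cannot clash with the real ones, and `section_protection` is a hypothetical helper.

```c
#include <assert.h>
#include <stdint.h>

/* Same values as the IMAGE_SCN_MEM_* constants. */
#define SCN_MEM_NOT_CACHED 0x04000000u
#define SCN_MEM_EXECUTE    0x20000000u
#define SCN_MEM_READ       0x40000000u
#define SCN_MEM_WRITE      0x80000000u

/* Same values as the PAGE_* constants. */
#define PG_NOACCESS          0x01u
#define PG_READONLY          0x02u
#define PG_READWRITE         0x04u
#define PG_WRITECOPY         0x08u
#define PG_EXECUTE           0x10u
#define PG_EXECUTE_READ      0x20u
#define PG_EXECUTE_READWRITE 0x40u
#define PG_EXECUTE_WRITECOPY 0x80u
#define PG_NOCACHE           0x200u

/* Returns the protection value to pass to VirtualProtect for a section
   with the given Characteristics. */
uint32_t section_protection(uint32_t characteristics)
{
    /* Indexed as [executable][readable][writeable]. */
    static const uint32_t table[2][2][2] = {
        { { PG_NOACCESS, PG_WRITECOPY },
          { PG_READONLY, PG_READWRITE } },
        { { PG_EXECUTE,      PG_EXECUTE_WRITECOPY },
          { PG_EXECUTE_READ, PG_EXECUTE_READWRITE } },
    };
    uint32_t protect = table[(characteristics & SCN_MEM_EXECUTE) != 0]
                            [(characteristics & SCN_MEM_READ) != 0]
                            [(characteristics & SCN_MEM_WRITE) != 0];
    /* The table holds only base protections, never the modifier bit. */
    assert((protect & PG_NOCACHE) == 0);
    if (characteristics & SCN_MEM_NOT_CACHED)
        protect |= PG_NOCACHE;
    return protect;
}
```

Discardable sections are not handled here, since freeing them is a separate decision from choosing their protection.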
Notify library
The last thing to do is to call the DLL entry point (defined by AddressOfEntryPoint), thereby notifying the library that it has been attached to a process.
The function at the entry point is defined as
typedef BOOL (WINAPI *DllEntryProc)(HINSTANCE hinstDLL, DWORD fdwReason, LPVOID lpReserved);
DllEntryProc entry = (DllEntryProc)(baseAddress + PEHeader->OptionalHeader.AddressOfEntryPoint);
(*entry)((HINSTANCE)baseAddress, DLL_PROCESS_ATTACH, 0);
Afterwards we can use the exported functions as with any normal library.
This completes the tasks performed by LoadLibrary.
Exported functions
If you want to access the functions that are exported by the library, you need to find the entry point to a symbol, i.e. the name of the function to call.
Directory entry 0 of the DataDirectory in the OptionalHeader contains information about the exported functions. It is an IMAGE_EXPORT_DIRECTORY structure.
The first thing to do is to map the name of the function to the ordinal number of the exported symbol. To do so, walk the arrays defined by AddressOfNames and AddressOfNameOrdinals in parallel until you find the required name.
Now you can use the ordinal number to read the address by evaluating the n-th element of the AddressOfFunctions array.
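The two steps just described can be sketched in portable C against an image held in a byte buffer. This complements the Windows-specific MyGetProcAddress_ExportDir from the first article and is only an illustration: `ExportDirectory` mirrors the field order of IMAGE_EXPORT_DIRECTORY, `find_export_rva` is a hypothetical name, and the sketch assumes a little-endian host and a well-formed image (no bounds checks, no forwarded exports).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Field order mirrors IMAGE_EXPORT_DIRECTORY from winnt.h. */
typedef struct {
    uint32_t Characteristics;
    uint32_t TimeDateStamp;
    uint16_t MajorVersion;
    uint16_t MinorVersion;
    uint32_t Name;
    uint32_t Base;
    uint32_t NumberOfFunctions;
    uint32_t NumberOfNames;
    uint32_t AddressOfFunctions;     /* RVA of uint32_t[NumberOfFunctions] */
    uint32_t AddressOfNames;         /* RVA of uint32_t[NumberOfNames] */
    uint32_t AddressOfNameOrdinals;  /* RVA of uint16_t[NumberOfNames] */
} ExportDirectory;

/* Returns the RVA of the exported function `name`, or 0 if it is not found.
   Adding the image base to the result gives the callable address. */
uint32_t find_export_rva(const uint8_t *image, uint32_t export_dir_rva,
                         const char *name)
{
    ExportDirectory exp;
    uint32_t i;
    assert(image != NULL && name != NULL);
    if (export_dir_rva == 0)
        return 0;
    memcpy(&exp, image + export_dir_rva, sizeof exp);
    for (i = 0; i < exp.NumberOfNames; ++i) {
        uint32_t name_rva, func_rva;
        uint16_t index;
        /* Step 1: walk the name and ordinal arrays in parallel. */
        memcpy(&name_rva, image + exp.AddressOfNames + 4 * (size_t)i, 4);
        if (strcmp((const char *)image + name_rva, name) != 0)
            continue;
        memcpy(&index, image + exp.AddressOfNameOrdinals + 2 * (size_t)i, 2);
        if (index >= exp.NumberOfFunctions)
            return 0;
        /* Step 2: use the index to read the function's RVA. */
        memcpy(&func_rva, image + exp.AddressOfFunctions + 4 * (size_t)index, 4);
        return func_rva;
    }
    return 0;
}
```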
Freeing the library
To free the custom loaded library, perform the steps
- Call the entry point to notify the library that it is being detached:
DllEntryProc entry = (DllEntryProc)(baseAddress + PEHeader->OptionalHeader.AddressOfEntryPoint);
(*entry)((HINSTANCE)baseAddress, DLL_PROCESS_DETACH, 0);
- Free external libraries used to resolve imports.
- Free allocated memory.
MemoryModule
MemoryModule is a C-library that can be used to load a DLL from memory.
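The interface is very similar to the standard methods for loading libraries: MemoryLoadLibrary, MemoryGetProcAddress and MemoryFreeLibrary take the place of LoadLibrary, GetProcAddress and FreeLibrary.
Original article: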
http://www.joachim-bauch.de/tutorials/loading-a-dll-from-memory/
Source code:
https://github.com/fancycode/MemoryModule