Cache coherence
On machines or bus configurations in which the hardware does not ensure cache coherence for DMA operations (certain Intel Itanium systems, for example), the standard Windows DMA implementation does the processor-specific work that is necessary to ensure such coherency when the driver calls WdfDmaTransactionExecute.
The framework flushes this cache when the driver calls WdfDmaTransactionDmaCompleted.
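Map registers
Map registers translate between the addresses a device uses on its bus and system physical addresses, much as page table entries and page directory entries translate between virtual and physical addresses. Each map register can map up to 4 KB (one page). Map registers are a shared resource. The system reserves map registers according to the maximum transfer length specified when the DMA enabler object is created.
The framework allocates map registers immediately before it executes a transaction and frees them after the transaction completes. Whenever the driver for a DMA device moves data to or from system memory, the framework dynamically allocates and programs the map registers.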
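Number of Required Map Registers = (Transfer Size / PAGE_SIZE) + 1
The extra register covers a transfer whose start address is not page-aligned. Map registers are allocated as one contiguous block. Windows hands the framework a block of map registers (a base address and a length); the KMDF driver cannot see these values.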
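The arithmetic behind the formula can be checked with a small user-mode sketch. It is only an illustration: the 4 KB page size and the names SKETCH_PAGE_SIZE, map_registers_estimate and pages_spanned are assumptions, not WDK identifiers. The estimate rounds the transfer size up to whole pages before adding one, as the PLX9x5x code does with ROUND_TO_PAGES, so it also holds for sizes that are not a multiple of the page size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed page size for this sketch; a driver would use the WDK's PAGE_SIZE. */
#define SKETCH_PAGE_SIZE ((size_t)4096)

/* Upper bound: whole pages needed for the data, plus one page in case
 * the buffer does not start on a page boundary. */
static size_t map_registers_estimate(size_t transfer_size)
{
    return (transfer_size + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE + 1;
}

/* Exact number of pages touched by the byte range [start, start + size). */
static size_t pages_spanned(uintptr_t start, size_t size)
{
    uintptr_t first_page;
    uintptr_t last_page;

    if (size == 0) {
        return 0;
    }
    assert(size - 1 <= UINTPTR_MAX - start);   /* range must not wrap around */

    first_page = start / SKETCH_PAGE_SIZE;
    last_page = (start + size - 1) / SKETCH_PAGE_SIZE;
    return (size_t)(last_page - first_page + 1);
}
```

An 8 KB transfer that starts on a page boundary touches two pages. The same transfer starting 2 KB into a page touches three, which is the case the extra register is reserved for.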
For example:
The PLX9x5x driver contains the following code where it initializes the device extension:
NTSTATUS
PLxInitializeDeviceExtension(
    IN PDEVICE_EXTENSION DevExt
    )
/*++
Routine Description:
    This routine is called by EvtDeviceAdd. Here the device context is
    initialized and all the software resources required by the device are
    allocated.
Arguments:
    DevExt     Pointer to the Device Extension
Return Value:
    NTSTATUS
--*/
{
    NTSTATUS    status;
    ULONG       dteCount;
    WDF_IO_QUEUE_CONFIG  queueConfig;

    PAGED_CODE();

    //
    // Set Maximum Transfer Length (which must be less than the SRAM size).
    //
    DevExt->MaximumTransferLength = PCI9656_MAXIMUM_TRANSFER_LENGTH;
    if (DevExt->MaximumTransferLength > PCI9656_SRAM_SIZE) {
        DevExt->MaximumTransferLength = PCI9656_SRAM_SIZE;
    }
    KdPrint(("MaximumTransferLength %d\n", DevExt->MaximumTransferLength));

    //
    // Calculate the number of DMA_TRANSFER_ELEMENTS + 1 needed to
    // support the MaximumTransferLength.
    //
    dteCount = BYTES_TO_PAGES((ULONG) ROUND_TO_PAGES(
        DevExt->MaximumTransferLength) + PAGE_SIZE);
    KdPrint(("Number of DTEs %d\n", dteCount));

    //
    // Set the number of DMA_TRANSFER_ELEMENTs (DTE) to be available.
    //
    DevExt->WriteTransferElements = dteCount;
    DevExt->ReadTransferElements  = dteCount;

    // (The rest of the routine is omitted from this excerpt.)
}
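Map registers may be implemented in hardware or in software.
When are map registers used? When either of the following holds:
1 The device does not support scatter/gather (S/G).
2 The buffer lies beyond the device's addressing capability. For example, if the device supports only 32-bit addressing but the transfer targets physical memory above 4 GB, the system uses map registers.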
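The two conditions can be written as a small predicate. This is a simplified user-mode sketch, not how Windows makes the decision internally: struct dma_caps and needs_map_registers are hypothetical names, and the buffer is treated as a single physically contiguous range.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of a device's DMA capabilities (not a WDK type). */
struct dma_caps {
    bool     supports_scatter_gather;  /* hardware S/G available?           */
    unsigned address_bits;             /* width of DMA addresses, e.g. 32   */
};

/* Returns true when either condition from the list applies to a buffer
 * occupying physical addresses [buf_phys, buf_phys + length). */
static bool needs_map_registers(const struct dma_caps *caps,
                                uint64_t buf_phys, uint64_t length)
{
    uint64_t limit;
    uint64_t last;

    assert(caps != NULL);

    /* Condition 1: no hardware scatter/gather. */
    if (!caps->supports_scatter_gather) {
        return true;
    }

    /* A device with full 64-bit addressing can reach any buffer. */
    if (caps->address_bits >= 64 || length == 0) {
        return false;
    }

    /* Condition 2: some byte of the buffer is above what the device can address. */
    limit = (uint64_t)1 << caps->address_bits;   /* first unreachable address */
    last = buf_phys + length - 1;                /* assumes no wrap-around    */
    return last >= limit;
}
```

With these assumptions, a 32-bit S/G-capable device needs map registers for a buffer at physical address 0x100000000 but not for one that ends at 0xFFFFFFFF.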
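S/G implementation
If the driver indicates during initialization that the device does not support hardware S/G, Windows DMA uses map registers:
1 Windows DMA allocates enough contiguous map registers, that is, a low-memory buffer, to hold the data for the entire transfer.
2 For a write operation (system memory to device), the data is copied from the original data buffer into the map register buffer.
3 Windows DMA gives the framework the physical memory address of the map registers as the buffer's address in the device bus address space.
The framework then passes the address of the S/G list to the driver when it calls the driver's EvtProgramDma callback: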
BOOLEAN
PLxEvtProgramWriteDma(
    IN  WDFDMATRANSACTION       Transaction,
    IN  WDFDEVICE               Device,
    IN  PVOID                   Context,
    IN  WDF_DMA_DIRECTION       Direction,
    IN  PSCATTER_GATHER_LIST    SgList
    )
/*++
Routine Description:
Arguments:
Return Value:
--*/
{
    PDEVICE_EXTENSION        devExt;
    size_t                   offset;
    PDMA_TRANSFER_ELEMENT    dteVA;
    ULONG_PTR                dteLA;
    BOOLEAN                  errors;
    ULONG                    i;

    UNREFERENCED_PARAMETER( Context );
    UNREFERENCED_PARAMETER( Direction );

    KdPrint(("--> PLxEvtProgramWriteDma\n"));

    //
    // Initialize locals
    //
    devExt = PLxGetDeviceContext(Device);
    errors = FALSE;

    //
    // Get the number of bytes as the offset to the beginning of this
    // Dma operations transfer location in the buffer.
    //
    offset = WdfDmaTransactionGetBytesTransferred(Transaction);
    KdPrint(("offset is %d\n", offset));

    //
    // Setup the pointer to the next DMA_TRANSFER_ELEMENT
    // for both virtual and physical address references.
    //
    dteVA = (PDMA_TRANSFER_ELEMENT) devExt->WriteCommonBufferBase;
    dteLA = (devExt->WriteCommonBufferBaseLA.LowPart +
             sizeof(DMA_TRANSFER_ELEMENT));

    //
    // Translate the System's SCATTER_GATHER_LIST elements
    // into the device's DMA_TRANSFER_ELEMENT elements.
    //
    for (i = 0; i < SgList->NumberOfElements; i++) {

        //
        // Construct this DTE.
        //
        // NOTE: The LocalAddress is the offset into the SRAM from
        //       where this Write will start.
        //
        dteVA->PciAddressLow  = SgList->Elements[i].Address.LowPart;
        dteVA->PciAddressHigh = SgList->Elements[i].Address.HighPart;
        dteVA->TransferSize   = SgList->Elements[i].Length;
        dteVA->LocalAddress   = (ULONG) offset;

        dteVA->DescPtr.DescLocation  = DESC_PTR_DESC_LOCATION__PCI;
        dteVA->DescPtr.TermCountInt  = FALSE;
        dteVA->DescPtr.LastElement   = FALSE;
        dteVA->DescPtr.DirOfTransfer = DESC_PTR_DIRECTION__TO_DEVICE;
        dteVA->DescPtr.Address       = DESC_PTR_ADDR( dteLA );

        //
        // Increment the DmaTransaction length by this element length
        //
        offset += SgList->Elements[i].Length;

        //
        // If at end of SgList, then set LastElement bit in final DTE.
        //
        if (i == SgList->NumberOfElements - 1) {
            dteVA->DescPtr.LastElement = TRUE;
            break;
        }

        //
        // Otherwise advance to the next DMA_TRANSFER_ELEMENT so that the
        // following S/G element does not overwrite this one.
        //
        dteVA++;
        dteLA += sizeof(DMA_TRANSFER_ELEMENT);
    }

    // (The rest of the callback is omitted from this excerpt.)
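Because the map register buffer is physically contiguous, the S/G list contains only one element: a base address and a length.
The driver programs the DMA engine with that address and length, which are not the addresses of the fragments of the real data buffer. Because the map register buffers lie in the low 4 GB of physical address space, any bus-master DMA device can address them.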
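The effect of this double buffering on a write can be imitated in user mode. The sketch below is a simulation under stated assumptions, not the Windows implementation: struct fragment, struct sg_element and stage_write are hypothetical names, an ordinary array stands in for the map register buffer, and its pointer value stands in for the bus address. It copies the scattered fragments of a data buffer into one contiguous staging buffer and reports a single-element list.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for this sketch; not the WDK's SCATTER_GATHER_LIST types. */
struct fragment   { const unsigned char *data; size_t length; };
struct sg_element { uintptr_t address; size_t length; };

/* Stage a write: copy every fragment into the contiguous staging buffer
 * and describe the result with one element (base address and length).
 * Returns 0 on success, -1 if the staging buffer is too small. */
static int stage_write(const struct fragment *frags, size_t count,
                       unsigned char *staging, size_t staging_size,
                       struct sg_element *out)
{
    size_t used = 0;
    size_t i;

    assert(frags != NULL && staging != NULL && out != NULL);

    for (i = 0; i < count; i++) {
        if (frags[i].length > staging_size - used) {
            return -1;                       /* transfer does not fit */
        }
        memcpy(staging + used, frags[i].data, frags[i].length);
        used += frags[i].length;
    }

    /* The device is given the staging buffer, not the original fragments. */
    out->address = (uintptr_t)staging;
    out->length = used;
    return 0;
}
```

However many fragments the original buffer has, the device-facing description collapses to one element, which is why a driver's EvtProgramDma loop runs only once in this case.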
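When the driver notifies the framework that the DMA transfer has completed, the framework in turn notifies the standard Windows DMA implementation.
Map registers let a device with limited addressing capability reach any memory location.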