Linux mips64r2 PCI Interrupt Routing Mechanism Analysis

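This article analyzes how interrupts from PCI devices are routed on mips64r2 and how irq numbers are assigned, and tries to answer the following questions:

When a PCI device driver registers an interrupt (request_irq), where does the irq# come from? Is it determined by hardware or by software?
When an interrupt is reported, how does the CPU obtain this irq#?

The focus is on how the PIC (programmable interrupt controller) works. The PIC is usually integrated into the CPU, and its implementation differs across architectures and CPU vendors. The analysis here is based on kernel 3.10 and a mips64r2 XXX CPU.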

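How mips64r2 PCI device interrupt routing works

As shown in the figure above, in hardware PCI interrupt routing involves three devices: the PCI device, the PIC, and the CPU.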

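The PIC is the central component. Its main features are:

160 32-bit IRT entries, one for each of the 160 hardware interrupt lines;
64 interrupt vectors;
8 128-bit ITEs (Interrupt Thread Enable), holding the enable bits for the 128 hardware threads of 4 nodes;
local and global Round-Robin policies for distributing interrupts to hardware threads;
8 system timers and 2 watchdog timers (the watchdogs can be configured as NMI watchdog timers, using different IRT entries);
IPI support.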

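Three main interrupt signals:

interrupt pin: the interrupt output signal of the PCI device. A PCI device provides up to four interrupt outputs (INTA#, INTB#, INTC#, INTD#); the one in use is given by the interrupt pin field in the device's PCI configuration space;
interrupt line: the input signal of the PIC, wired to the interrupt output of the PCI device. It is given by the interrupt line field in the device's PCI configuration space;
interrupt vector: the output signal of the PIC, wired to the interrupt input of the CPU. This interrupt vector is the irq# that a PCI device driver uses when registering its interrupt, and it can be read from the CPU's EIRR (extended interrupt request register). Because of a limit of the mips64r2 architecture, EIRR is 64 bits wide with one bit per vector, so there are at most 64 vectors.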
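
To make the first two signals concrete, here is a minimal sketch in C that decodes the interrupt pin and interrupt line fields from a raw image of a device's PCI configuration space. The offsets 0x3c and 0x3d are the standard PCI header offsets; the function names and the idea of working on a byte buffer are assumptions made for illustration, not code from the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Standard PCI configuration-space header offsets. */
#define PCI_INTERRUPT_LINE 0x3c /* which interrupt controller input is used */
#define PCI_INTERRUPT_PIN  0x3d /* 0 = none, 1..4 = INTA#..INTD# */

/* Hypothetical helper: name of the pin recorded in a config-space image. */
static const char *pci_pin_name(const uint8_t *cfg, size_t len)
{
    static const char *const names[] = {
        "none", "INTA#", "INTB#", "INTC#", "INTD#"
    };

    assert(cfg != NULL && len > PCI_INTERRUPT_PIN);
    uint8_t pin = cfg[PCI_INTERRUPT_PIN];
    return pin <= 4 ? names[pin] : "invalid";
}

/* Hypothetical helper: interrupt line number, i.e. the index into the IRT. */
static unsigned pci_irq_line(const uint8_t *cfg, size_t len)
{
    assert(cfg != NULL && len > PCI_INTERRUPT_LINE);
    return cfg[PCI_INTERRUPT_LINE];
}
```

Given a 64-byte header image with byte 0x3d set to 1 and byte 0x3c set to 78, pci_pin_name reports INTA# and pci_irq_line returns 78. Neither value is the irq# yet; the vector is produced by the PIC.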

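In software, two table objects are abstracted to manage and handle interrupt routing:

IRT: interrupt redirection table. A hardware table indexed by interrupt line#, with 160 entries. It maintains the mapping from interrupt line to interrupt vector, per-line controls such as mask/unmask and enable/disable, and the CPU affinity of each interrupt;
IDT: interrupt description table. A software table indexed by interrupt vector#, with 64 entries, used for interrupt handling.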
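
The relationship between the two tables can be sketched as a small userspace model in C. The entry counts come from the text; the struct layouts, the model_ names, and the idea of representing the hardware IRT as an array are simplifying assumptions for illustration only.

```c
#include <assert.h>
#include <stddef.h>

#define IRT_ENTRIES 160 /* one per interrupt line (from the text) */
#define IDT_ENTRIES 64  /* one per interrupt vector (from the text) */

typedef void (*model_handler_t)(unsigned vector);

/* Hypothetical, simplified IRT entry; the real one is a hardware register. */
struct model_irt_entry {
    unsigned char enabled; /* EN */
    unsigned char vector;  /* RVEC: vector raised for this line */
};

/* Hypothetical, simplified IDT entry standing in for struct irq_desc. */
struct model_idt_entry {
    model_handler_t handler;
};

static struct model_irt_entry model_irt[IRT_ENTRIES];
static struct model_idt_entry model_idt[IDT_ENTRIES];

/* Hardware side: map a line to its vector, or -1 if the line is disabled. */
static int model_line_to_vector(unsigned line)
{
    assert(line < IRT_ENTRIES);
    if (!model_irt[line].enabled)
        return -1;
    return model_irt[line].vector;
}

/* Software side: look up the handler registered for a vector. */
static model_handler_t model_vector_to_handler(unsigned vector)
{
    assert(vector < IDT_ENTRIES);
    return model_idt[vector].handler;
}
```

The point of the split is that the line number never reaches the driver: the IRT turns a line into a vector in hardware, and only the vector is used as the index on the software side.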

Interrupt handling flow

As shown in the figure above, the full path of an interrupt from generation to completion is:

handle_int

-> plat_irq_dispatch

-> do_nlm_common_IRQ

-> do_IRQ

-> generic_handle_irq

-> generic_handle_irq_desc

-> __do_IRQ

-> handle_IRQ_event

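1) A hardware device raises an interrupt (request & pending);
2) Requests from different devices may arrive at the same time; the PIC picks one of them through its arbitration rules, such as Round-Robin or priority (arbiter);
3) The PIC sets the ACK bit associated with this interrupt (assert) and delivers the request to the target hardware thread, as configured in the IRT;
4) The CPU reads the EIRR register to get the requested irq#, then clears it by writing the register;
5) The irq# is used to look up the IDT and obtain the desc for this interrupt;
6) For an edge-triggered interrupt, to avoid losing interrupts, desc->chip->ack is called immediately to write the relevant register (INT_ACK) and clear this interrupt source (de-assert), so that the interrupt line can respond to interrupts again;
7) The desc->action->handler chain is walked to handle the interrupt request;
8) For a level-triggered interrupt, once handling is finished, desc->chip->end is called to write the relevant register (INT_ACK) and clear this interrupt source (de-assert);
9) Go to 2) and continue with the next interrupt request.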
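
The CPU-side part of this flow (reading EIRR, looking up the descriptor, ack before the handler for edge interrupts, end after the handler for level interrupts) can be sketched as a userspace simulation in C. Everything here is a model: EIRR is a plain variable, the model_ and sample_ names are hypothetical, and scanning from the lowest set bit is an assumption, since the text does not state the order in which pending vectors are served.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NR_VECTORS 64

enum trigger { TRIGGER_EDGE, TRIGGER_LEVEL };

struct model_desc {
    enum trigger trigger;
    void (*ack)(unsigned irq);     /* stands in for desc->chip->ack */
    void (*end)(unsigned irq);     /* stands in for desc->chip->end */
    void (*handler)(unsigned irq); /* stands in for the desc->action chain */
};

static struct model_desc model_idt[NR_VECTORS];
static uint64_t model_eirr; /* simulated EIRR: bit n set = vector n pending */

/* Tiny trace log so the call order can be inspected. */
static char model_log[32];
static int model_log_len;

static void model_log_put(char c)
{
    if (model_log_len < (int)sizeof(model_log) - 1)
        model_log[model_log_len++] = c;
}

static void sample_ack(unsigned irq)     { (void)irq; model_log_put('a'); }
static void sample_end(unsigned irq)     { (void)irq; model_log_put('e'); }
static void sample_handler(unsigned irq) { (void)irq; model_log_put('h'); }

/* Serve every pending vector once; return how many were handled. */
static int model_dispatch(void)
{
    int handled = 0;

    while (model_eirr != 0) {
        unsigned irq = 0;

        while (!(model_eirr & (UINT64_C(1) << irq)))
            irq++;                            /* assumed: lowest bit first */
        model_eirr &= ~(UINT64_C(1) << irq);  /* clear the request bit */

        struct model_desc *desc = &model_idt[irq]; /* descriptor lookup */
        assert(desc->handler != NULL);

        if (desc->trigger == TRIGGER_EDGE && desc->ack)
            desc->ack(irq);                   /* edge: ack before handling */
        desc->handler(irq);
        if (desc->trigger == TRIGGER_LEVEL && desc->end)
            desc->end(irq);                   /* level: end after handling */
        handled++;
    }
    return handled;
}
```

With vector 3 configured as edge-triggered and vector 5 as level-triggered, and both bits set in model_eirr, model_dispatch records the sequence ack, handler for vector 3 followed by handler, end for vector 5.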

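IRT configuration

The IRT is a hardware table of the PIC controller. It is configured mainly in pic_init and when an interrupt is registered with request_irq. The fields of an IRT entry are:

EN: enable field of the IRT entry. Set when the interrupt is registered with request_irq;
NMI: NMI configuration field. Set to non-NMI in pic_init;
SCH: interrupt scheduling policy field. Set to local scheduling in pic_init;
RVEC: irq# field. Set in pic_init according to irt_irq_table;
DT/DB/DTE: CPU affinity fields. Configured when the interrupt is registered with request_irq; they can also be changed through /proc/irq/N/smp_affinity, which is implemented by desc->chip->set_affinity/pic_set_affinity.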
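
As a rough illustration of how such an entry could be assembled, the sketch below packs the fields named above into a 32-bit word. The field names come from the text, but every bit position and width here is an assumption chosen for the example (only the 6-bit RVEC width follows from the 64-vector limit); the real layout must be taken from the PIC documentation of the CPU in question.

```c
#include <assert.h>
#include <stdint.h>

/* ASSUMED bit layout, for illustration only; not taken from the article. */
#define IRT_EN_SHIFT   31
#define IRT_NMI_SHIFT  29
#define IRT_SCH_SHIFT  28
#define IRT_RVEC_SHIFT 20      /* 6 bits: 64 vectors */
#define IRT_DT_SHIFT   19
#define IRT_DB_SHIFT   16      /* assumed 3 bits */
#define IRT_DTE_MASK   0xffffu /* assumed low 16 bits */

/* Hypothetical helper: build one IRT entry from its fields. */
static uint32_t irt_pack(unsigned en, unsigned nmi, unsigned sch,
                         unsigned rvec, unsigned dt, unsigned db,
                         unsigned dte)
{
    assert(rvec < 64 && db < 8 && dte <= IRT_DTE_MASK);
    return ((uint32_t)(en  & 1u) << IRT_EN_SHIFT)
         | ((uint32_t)(nmi & 1u) << IRT_NMI_SHIFT)
         | ((uint32_t)(sch & 1u) << IRT_SCH_SHIFT)
         | ((uint32_t)rvec << IRT_RVEC_SHIFT)
         | ((uint32_t)(dt  & 1u) << IRT_DT_SHIFT)
         | ((uint32_t)db << IRT_DB_SHIFT)
         | (uint32_t)dte;
}

/* Hypothetical helper: extract the vector (irq#) from an entry. */
static unsigned irt_rvec(uint32_t entry)
{
    return (entry >> IRT_RVEC_SHIFT) & 0x3fu;
}

/* Hypothetical helper: set or clear EN without touching other fields. */
static uint32_t irt_set_enable(uint32_t entry, unsigned en)
{
    entry &= ~(UINT32_C(1) << IRT_EN_SHIFT);
    return entry | ((uint32_t)(en & 1u) << IRT_EN_SHIFT);
}
```

In this model, the pic_init step corresponds to calling irt_pack with en = 0, nmi = 0 and rvec taken from the line-to-vector table, and the request_irq step corresponds to irt_set_enable(entry, 1). Which SCH value selects local scheduling is not given in the text, so the sketch leaves it to the caller.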

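IDT configuration

The IDT is a software table. It is configured mainly in init_IRQ and when an interrupt is registered with request_irq: the fields of the entry selected by irq# are filled in, such as irq, the irqaction handler, and the irq_chip handlers. The two configuration steps differ as follows:

init_IRQ: sets desc->chip for every entry and sets desc->status to IRQ_NOPROBE; also initializes the IDT entries of interrupts that are not related to the PIC, such as IPIs (inter-processor interrupts).
request_irq: initializes the IDT entries of PIC-related interrupts.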

#define NR_IRQS 64
struct irq_desc irq_desc[NR_IRQS] __cacheline_aligned_in_smp = {
    [0 ... NR_IRQS-1] = {
        .status = IRQ_DISABLED,
        .chip = &no_irq_chip,
        .handle_irq = handle_bad_irq,
        .depth = 1,
        .lock = __SPIN_LOCK_UNLOCKED(irq_desc->lock),
    }
};

/**
 * struct irq_desc - interrupt descriptor
 * @irq:        interrupt number for this descriptor
 * @timer_rand_state:    pointer to timer rand state struct
 * @kstat_irqs:        irq stats per cpu
 * @irq_2_iommu:    iommu with this irq
 * @handle_irq:        highlevel irq-events handler [if NULL, __do_IRQ()]
 * @chip:        low level interrupt hardware access
 * @msi_desc:        MSI descriptor
 * @handler_data:    per-IRQ data for the irq_chip methods
 * @chip_data:        platform-specific per-chip private data for the chip
 *            methods, to allow shared chip implementations
 * @action:        the irq action chain
 * @status:        status information
 * @depth:        disable-depth, for nested irq_disable() calls
 * @wake_depth:        enable depth, for multiple set_irq_wake() callers
 * @irq_count:        stats field to detect stalled irqs
 * @last_unhandled:    aging timer for unhandled count
 * @irqs_unhandled:    stats field for spurious unhandled interrupts
 * @lock:        locking for SMP
 * @affinity:        IRQ affinity on SMP
 * @node:        node index useful for balancing
 * @pending_mask:    pending rebalanced interrupts
 * @threads_active:    number of irqaction threads currently running
 * @wait_for_threads:    wait queue for sync_irq to wait for threaded handlers
 * @dir:        /proc/irq/ procfs entry
 * @name:        flow handler name for /proc/interrupts output
 */
struct irq_desc {
    unsigned int        irq;
    struct timer_rand_state *timer_rand_state;
    unsigned int            *kstat_irqs;
#ifdef CONFIG_INTR_REMAP
    struct irq_2_iommu      *irq_2_iommu;
#endif
    irq_flow_handler_t    handle_irq;
    struct irq_chip        *chip;
    struct msi_desc        *msi_desc;
    void            *handler_data;
    void            *chip_data;
    struct irqaction    *action;    /* IRQ action list */
    unsigned int        status;        /* IRQ status */

    unsigned int        depth;        /* nested irq disables */
    unsigned int        wake_depth;    /* nested wake enables */
    unsigned int        irq_count;    /* For detecting broken IRQs */
    unsigned long        last_unhandled;    /* Aging timer for unhandled count */
    unsigned int        irqs_unhandled;
    spinlock_t        lock;
#ifdef CONFIG_SMP
    cpumask_var_t        affinity;
    unsigned int        node;
#ifdef CONFIG_GENERIC_PENDING_IRQ
    cpumask_var_t        pending_mask;
#endif
#endif
    atomic_t        threads_active;
#ifdef CONFIG_PREEMPT_HARDIRQS
    unsigned long        forced_threads_active;
#endif
    wait_queue_head_t       wait_for_threads;
#ifdef CONFIG_PROC_FS
    struct proc_dir_entry    *dir;
#endif
    const char        *name;
} ____cacheline_internodealigned_in_smp;

struct irq_chip {
    const char    *name;
    unsigned int    (*startup)(unsigned int irq);
    void        (*shutdown)(unsigned int irq);
    void        (*enable)(unsigned int irq);
    void        (*disable)(unsigned int irq);

    void        (*ack)(unsigned int irq);
    void        (*mask)(unsigned int irq);
    void        (*mask_ack)(unsigned int irq);
    void        (*unmask)(unsigned int irq);
    void        (*eoi)(unsigned int irq);

    void        (*end)(unsigned int irq);
    int        (*set_affinity)(unsigned int irq,
                    const struct cpumask *dest);
    int        (*retrigger)(unsigned int irq);
    int        (*set_type)(unsigned int irq, unsigned int flow_type);
    int        (*set_wake)(unsigned int irq, unsigned int on);

    void        (*bus_lock)(unsigned int irq);
    void        (*bus_sync_unlock)(unsigned int irq);

    /* Currently used only by UML, might disappear one day.*/
#ifdef CONFIG_IRQ_RELEASE_METHOD
    void        (*release)(unsigned int irq, void *dev_id);
#endif
    /*
     * For compatibility, ->typename is copied into ->name.
     * Will disappear.
     */
    const char    *typename;
};

static struct irq_chip nlm_common_pic = {
    .unmask = pic_unmask,
    .mask = pic_shutdown,
    .ack = pic_ack,
    .end = pic_end,
    .set_affinity = pic_set_affinity
};
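
The split between the two configuration steps can be sketched with a small userspace model in C. This is not kernel code: the model_ names, the reduced descriptor, and the return values -1 and -2 are assumptions for illustration, and interrupt sharing is deliberately not modeled.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_NR_IRQS 64

struct model_chip { const char *name; };
typedef int (*model_handler_t)(int irq, void *dev_id);

/* Hypothetical, reduced stand-in for struct irq_desc. */
struct model_desc {
    const struct model_chip *chip;
    model_handler_t handler;
    void *dev_id;
    int enabled;
};

static const struct model_chip model_pic_chip = { "pic" };
static struct model_desc model_descs[MODEL_NR_IRQS];

/* Boot-time step, analogous to init_IRQ: attach the chip to every entry. */
static void model_init_irq(void)
{
    for (int i = 0; i < MODEL_NR_IRQS; i++) {
        model_descs[i].chip = &model_pic_chip;
        model_descs[i].handler = NULL;
        model_descs[i].dev_id = NULL;
        model_descs[i].enabled = 0;
    }
}

/* Driver-time step, analogous to request_irq: install handler and enable. */
static int model_request_irq(int irq, model_handler_t handler, void *dev_id)
{
    if (irq < 0 || irq >= MODEL_NR_IRQS || handler == NULL)
        return -1; /* invalid argument */

    struct model_desc *d = &model_descs[irq];
    assert(d->chip != NULL); /* model_init_irq must have run first */
    if (d->handler != NULL)
        return -2; /* already claimed; sharing is not modeled */

    d->handler = handler;
    d->dev_id = dev_id;
    d->enabled = 1; /* the point where the IRT EN bit would be set */
    return 0;
}

/* Sample handler used only to exercise the model. */
static int model_sample_handler(int irq, void *dev_id)
{
    (void)irq;
    (void)dev_id;
    return 1; /* "handled" */
}
```

After model_init_irq(), a call such as model_request_irq(34, model_sample_handler, NULL) succeeds once and is rejected on a second attempt for the same irq#, which mirrors the idea that the chip is attached at boot while the handler only appears when a driver registers.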


Date: 2024-10-13 11:57:02
