netdev - Irq architecture for multi-core network driver.

Message-ID: <4AE0D14B.1070307@caviumnetworks.com>
Date:	Thu, 22 Oct 2009 14:40:27 -0700
From:	David Daney <ddaney@...iumnetworks.com>
To:	netdev@...r.kernel.org,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
CC:	linux-mips <linux-mips@...ux-mips.org>
Subject: Irq architecture for multi-core network driver. 

My network controller is part of a multicore SOC family[1] with up to 32 
cpu cores.

The packets-ready signal from the network controller can trigger
an interrupt on any or all cpus and is configurable on a per cpu basis.

If more than one cpu has the interrupt enabled, they would all get the
interrupt, so if a single packet were to be ready, all cpus could be
interrupted and try to process it.  The kernel interrupt management
functions don't seem to give me a good way to manage the interrupts.
More on this later.

My current approach is to add a NAPI instance for each cpu.  I start
with the interrupt enabled on a single cpu.  When the interrupt
triggers, I mask the interrupt on that cpu and schedule the
napi_poll.  When the napi_poll function is entered, I look at the
packet backlog and if it is above a threshold, I enable the interrupt
on an additional cpu.  The process then iterates until the number of
cpus running the napi_poll function can keep the backlog under the
threshold.  This all seems to work fairly well.
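
The scheme above can be modeled in plain user-space C as a rough
sketch.  Nothing here is taken from the actual driver: the struct, the
function names, the threshold value and the "pick the lowest idle cpu"
policy are all assumptions made for illustration, and no kernel
interfaces are used.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS 32            /* up to 32 cores, per the message */
#define BACKLOG_THRESHOLD 64  /* hypothetical value, not from the driver */

/* Bit n set means "cpu n".  All names here are illustrative only. */
struct rx_state {
	uint32_t irq_enabled;  /* cpus with the packets-ready irq unmasked */
	uint32_t polling;      /* cpus that have a poll scheduled/running */
};

/* Interrupt fired on 'cpu': mask it there and schedule that cpu's poll. */
static void rx_interrupt(struct rx_state *s, unsigned int cpu)
{
	assert(cpu < NR_CPUS);
	s->irq_enabled &= ~(UINT32_C(1) << cpu);
	s->polling |= UINT32_C(1) << cpu;
}

/*
 * Poll function entered with the current backlog.  If the backlog is
 * above the threshold, unmask the interrupt on one additional cpu that
 * is neither polling nor already unmasked.  Returns that cpu, or -1 if
 * nothing was changed.
 */
static int rx_poll_enter(struct rx_state *s, unsigned int backlog)
{
	uint32_t busy = s->irq_enabled | s->polling;
	unsigned int cpu;

	if (backlog <= BACKLOG_THRESHOLD)
		return -1;

	for (cpu = 0; cpu < NR_CPUS; cpu++) {
		uint32_t bit = UINT32_C(1) << cpu;

		if (!(busy & bit)) {
			s->irq_enabled |= bit;
			return (int)cpu;
		}
	}
	return -1;  /* every cpu is already in use */
}
```

Starting with only cpu 0 unmasked, a call to rx_interrupt(&s, 0)
followed by rx_poll_enter(&s, 100) would unmask cpu 1 in this model,
while a backlog at or below the threshold leaves the state unchanged.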

The main problem I have encountered is how to fit the interrupt
management into the kernel framework.  Currently the interrupt source
is connected to a single irq number.  I call request_irq once, and then manage
the masking and unmasking on a per cpu basis by directly manipulating
the interrupt controller's affinity/routing registers.  This goes
behind the back of all the kernel's standard interrupt management
routines.  I am looking for a better approach.

One thing that comes to mind is that I could assign a different
interrupt number per cpu to the interrupt signal.  So instead of
having one irq I would have 32 of them.  The driver would then do
request_irq for all 32 irqs, and could call enable_irq and disable_irq
to enable and disable them.  The problem with this is that there isn't
really a single packets-ready signal, but instead 16 of them.  So if I
go this route I would have 16(lines) x 32(cpus) = 512 interrupt
numbers just for the networking hardware, which seems a bit excessive.
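
To make the size of that irq number space concrete, here is a small
sketch of one way a (line, cpu) pair could be mapped to its own irq
number.  The base value and the row-major layout are assumptions for
illustration only; the message does not say how the numbers would be
assigned.

```c
#include <assert.h>

#define RX_LINES    16  /* packets-ready signals, per the message */
#define RX_CPUS     32  /* cpus, per the message */
#define RX_IRQ_BASE 0   /* hypothetical first irq number */

/* Hypothetical layout: one contiguous block of RX_CPUS irqs per line. */
static unsigned int rx_irq_number(unsigned int line, unsigned int cpu)
{
	assert(line < RX_LINES && cpu < RX_CPUS);
	return RX_IRQ_BASE + line * RX_CPUS + cpu;
}

/* Total number of irq numbers consumed by the networking hardware. */
static unsigned int rx_irq_count(void)
{
	return RX_LINES * RX_CPUS;
}
```

With this layout the last pair, line 15 on cpu 31, lands on offset 511,
so the block spans 512 irq numbers.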

A second possibility is to add something like:

int irq_add_affinity(unsigned int irq, cpumask_t cpumask);

int irq_remove_affinity(unsigned int irq, cpumask_t cpumask);

These would atomically add and remove cpus from an irq's affinity.
This is essentially what my current driver does, but it would be with
a new officially blessed kernel interface.
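
The intended semantics of "atomically add and remove" can be sketched
in user-space C11 with a 32-bit mask standing in for cpumask_t.  This
is only a model of the proposed behavior: struct irq_model and the
model_* functions are hypothetical names, they return the resulting
mask rather than an int status, and a real implementation would also
have to reprogram the interrupt controller.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for an irq's affinity; bit n means cpu n. */
struct irq_model {
	_Atomic uint32_t affinity;
};

/* Atomically add the cpus in 'mask'; returns the resulting affinity. */
static uint32_t model_irq_add_affinity(struct irq_model *irq, uint32_t mask)
{
	return atomic_fetch_or(&irq->affinity, mask) | mask;
}

/* Atomically remove the cpus in 'mask'; returns the resulting affinity. */
static uint32_t model_irq_remove_affinity(struct irq_model *irq, uint32_t mask)
{
	return atomic_fetch_and(&irq->affinity, ~mask) & ~mask;
}

/* Usage example: start on cpu 0, add cpus 1 and 2, then drop cpu 0. */
void affinity_example(void)
{
	struct irq_model irq;

	atomic_init(&irq.affinity, UINT32_C(0x1));
	assert(model_irq_add_affinity(&irq, UINT32_C(0x6)) == UINT32_C(0x7));
	assert(model_irq_remove_affinity(&irq, UINT32_C(0x1)) == UINT32_C(0x6));
}
```

Because each update is a single atomic read-modify-write, two cpus
changing different bits at the same time cannot lose each other's
update, which is the property a plain "read mask, edit, write back"
sequence would not give.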

Any opinions about the best way forward are most welcome.

Thanks,
David Daney

[1]: See: arch/mips/cavium-octeon and drivers/staging/octeon.  Yes, the
staging driver is ugly; I am working to improve it.