Date: Sat, 16 Apr 2011 07:55:34 -0400
From: Neil Horman <nhorman@...driver.com>
To: Eric Dumazet <eric.dumazet@...il.com>
Cc: Stephen Hemminger <stephen.hemminger@...tta.com>, netdev@...r.kernel.org,
	davem@...emloft.net, Dimitris Michailidis <dm@...lsio.com>,
	Thomas Gleixner <tglx@...utronix.de>, David Howells <dhowells@...hat.com>,
	Tom Herbert <therbert@...gle.com>, Ben Hutchings <bhutchings@...arflare.com>
Subject: Re: [PATCH 2/3] net: Add net device irq siloing feature

On Sat, Apr 16, 2011 at 08:21:37AM +0200, Eric Dumazet wrote:
> On Friday, 15 April 2011 at 21:52 -0700, Stephen Hemminger wrote:
>
> > > On Fri, Apr 15, 2011 at 11:49:03PM +0100, Ben Hutchings wrote:
> > > > On Fri, 2011-04-15 at 16:17 -0400, Neil Horman wrote:
> > > > > Using the irq affinity infrastructure, we can now allow net devices
> > > > > to call request_irq using a new wrapper function (request_net_irq),
> > > > > which will attach a common affinity_update handler to each requested
> > > > > irq. This affinity update mechanism correlates each tracked irq to
> > > > > the flow(s) that said irq processes most frequently. The highest
> > > > > traffic flow is noted, marked and exported to user space via the
> > > > > affinity_hint proc file for each irq. In this way, utilities like
> > > > > irqbalance are able to determine which cpu is receiving the most
> > > > > data from each rx queue on a given NIC, and set irq affinity
> > > > > accordingly.
> > > > [...]
> > > >
> > > > Is irqbalance expected to poll the affinity hints? How often?
> > > >
> > > Yes, it's done just that for quite some time. Intel added that ability
> > > at the same time they added the affinity_hint proc file. Irqbalance
> > > polls the affinity_hint file at the same time it rebalances all irqs
> > > (every 10 seconds). If the affinity_hint is non-zero, irqbalance just
> > > copies it to smp_affinity for the same irq. Up until now that's been
> > > just about dead code, because only ixgbe sets affinity_hint. That's
> > > why I added the affinity_alg file, so irqbalance could do something
> > > more intelligent than just a blind copy. With the patch that I
> > > referenced I added code to irqbalance to allow it to perform different
> > > balancing methods based on the output of affinity_alg.
> > > Neil
> >
> > I hate the way more and more interfaces are becoming device driver
> > specific. It makes it impossible to build sane management infrastructure
> > and causes lots of customer and service complaints.
> >
>
> For me, the whole problem is the paradigm that we adapt the IRQ to the CPU
> where applications _were_ running in the last few seconds, while the
> process scheduler might make other choices, i.e. migrate the task to the
> cpu where the IRQ was happening (the cpu calling wakeups).
>
> We can add logic to each layer, and yet not gain perfect behavior.
>
> Some kind of cooperation is needed.
>
> Irqbalance for example is of no use in the case of a network flood
> happening on your machine, because we enter NAPI mode for several minutes
> on a single cpu. We'll need to add special logic in the NAPI loop to force
> an exit, to reschedule the IRQ (so that another cpu can take it).
>
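The affinity_hint handling described above amounts to a poll-and-copy loop:
read each IRQ's hint and, if the hinted cpumask is non-zero, write it straight
into smp_affinity. A minimal illustrative sketch of that "blind copy" follows;
it is not irqbalance's actual source, the /proc paths are the standard per-IRQ
files, and error handling is kept to a bare minimum.

/*
 * Illustrative sketch only -- not irqbalance's actual code.  Read an IRQ's
 * affinity_hint and, if the hinted cpumask is non-zero, copy it verbatim
 * into smp_affinity for the same IRQ.
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

static int copy_hint_to_affinity(int irq)
{
	char path[64], hint[256];
	FILE *f;

	snprintf(path, sizeof(path), "/proc/irq/%d/affinity_hint", irq);
	f = fopen(path, "r");
	if (!f)
		return -1;		/* no hint exported for this IRQ */
	if (!fgets(hint, sizeof(hint), f)) {
		fclose(f);
		return -1;
	}
	fclose(f);

	/* The hint is a hex cpumask; skip it if every digit is zero. */
	if (strspn(hint, "0,\n") == strlen(hint))
		return 0;

	snprintf(path, sizeof(path), "/proc/irq/%d/smp_affinity", irq);
	f = fopen(path, "w");
	if (!f)
		return -1;
	fputs(hint, f);			/* the blind copy */
	fclose(f);
	return 0;
}

int main(int argc, char **argv)
{
	/* e.g. ./copy_hint 40 41 42, run as root so smp_affinity is writable */
	for (int i = 1; i < argc; i++)
		copy_hint_to_affinity(atoi(argv[i]));
	return 0;
}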
Would you consider an approach whereby we, instead of updating irq affinity
to match the process that consumes data from a given irq, bias the scheduler
so that processes which consume data from a given irq are not moved away from
the core/l2 cache being fed by that flow?

Do you have a suggestion for how best to communicate that to the scheduler?
It would seem that interrogating the RFS table from the scheduler might not
be well received.

Best
Neil
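For context on that last question: the RFS table maps a flow hash to the CPU
on which the consuming application last ran, so that receive processing can be
steered toward the application. The bias suggested above runs the other way:
remember which CPU a flow's RX processing lands on, and have the scheduler
avoid migrating the consumer away from that core/L2. The sketch below is
hypothetical and heavily simplified; none of the names or structures are real
kernel interfaces.

/*
 * Hypothetical sketch only -- not kernel code, and not the actual RFS data
 * structure.  RFS records, per flow hash, the CPU on which the consuming
 * application last ran so that RX processing can follow the application.
 * The idea discussed above is the inverse: record which CPU a flow's RX
 * processing lands on, and let the scheduler avoid migrating the consuming
 * task away from that core/L2.  Every name below is made up for clarity.
 */
#include <stdint.h>

#define FLOW_TABLE_SIZE	4096u		/* assumed power of two */
#define NO_CPU		0xffffu

struct flow_delivery_table {
	/* flow hash -> CPU doing RX work; entries start out as NO_CPU */
	uint16_t rx_cpu[FLOW_TABLE_SIZE];
};

/* RX path (conceptual): note which CPU is processing this flow's packets. */
static inline void flow_note_rx_cpu(struct flow_delivery_table *t,
				    uint32_t flow_hash, uint16_t cpu)
{
	t->rx_cpu[flow_hash & (FLOW_TABLE_SIZE - 1)] = cpu;
}

/*
 * Hypothetical scheduler-side bias: when choosing where to run a task known
 * to consume flow_hash, prefer the CPU feeding that flow (or one sharing its
 * L2/LLC) instead of migrating the task somewhere else.
 */
static inline uint16_t flow_preferred_cpu(const struct flow_delivery_table *t,
					  uint32_t flow_hash,
					  uint16_t fallback_cpu)
{
	uint16_t cpu = t->rx_cpu[flow_hash & (FLOW_TABLE_SIZE - 1)];

	return (cpu == NO_CPU) ? fallback_cpu : cpu;
}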