Date:	Mon, 1 Mar 2010 12:46:26 -0800
From:	Tom Herbert <therbert@...gle.com>
To:	Ben Hutchings <bhutchings@...arflare.com>
Cc:	netdev <netdev@...r.kernel.org>,
	Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@...el.com>,
	Peter Zijlstra <peterz@...radead.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	Stephen Hemminger <shemminger@...tta.com>,
	sf-linux-drivers <linux-net-drivers@...arflare.com>
Subject: Re: [RFC] Setting processor affinity for network queues

On Mon, Mar 1, 2010 at 9:21 AM, Ben Hutchings <bhutchings@...arflare.com> wrote:
> With multiqueue network hardware or Receive/Transmit Packet Steering
> (RPS/XPS) we can spread out network processing across multiple
> processors.  The administrator should be able to control the number of
> channels and the processor affinity of each.
>
> By 'channel' I mean a bundle of:
> - a wakeup (IRQ or IPI)
> - a receive queue whose completions trigger the wakeup
> - a transmit queue whose completions trigger the wakeup
> - a NAPI instance scheduled by the wakeup, which handles the completions
>

Yes.  Also, on the receive side it is really cumbersome to do per-NAPI
RPS settings when the receiving NAPI instance is not exposed in
netif_rx.  Maybe a reference to the NAPI structure could be added to
the skb?  This could clean up RPS a lot.

Tom

> Numbers of RX and TX queues used on a device do not have to match, but
> ideally they should.  For generality, you can substitute 'a receive
> and/or a transmit queue' above.  At the hardware level the numbers of
> queues could be different e.g. in the sfc driver a channel would be
> associated with 1 hardware RX queue, 2 hardware TX queues (with and
> without checksum offload) and 1 hardware event queue.
>
> Currently we have a userspace interface for setting affinity of IRQs and
> a convention for naming each channel's IRQ handler, but no such
> interface for memory allocation.  For RX buffers this should not be a
> problem since they are normally allocated as older buffers are
> completed, in the NAPI context.  However, the DMA descriptor rings and
> driver structures for a channel should also be allocated on the NUMA
> node where NAPI processing is done.  Currently this allocation takes
> place when a net device is created or when it is opened, before an
> administrator has any opportunity to configure affinity.  Reallocation
> will normally require a complete stop to network traffic (at least on
> the affected queues) so it should not be done automatically when the
> driver detects a change in IRQ affinity.  There needs to be an explicit
> mechanism for changing it.
>
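For reference, the existing userspace IRQ affinity interface mentioned
above takes a hexadecimal CPU bitmask written to
/proc/irq/<irq>/smp_affinity.  The following is a minimal sketch of
building and writing such a mask; the helper names
cpus_to_affinity_mask and set_irq_affinity and the proc_root parameter
are made up for illustration, and the comma-separated 32-bit grouping
is assumed to be accepted on write.

```python
def cpus_to_affinity_mask(cpus):
    """Return a hex bitmask string with one bit set per CPU number.

    Masks wider than 32 bits are split into comma-separated 32-bit
    groups, most significant group first.
    """
    mask = 0
    for cpu in cpus:
        if cpu < 0:
            raise ValueError("CPU numbers must be non-negative")
        mask |= 1 << cpu

    # Peel off 32 bits at a time, least significant group first.
    groups = []
    while True:
        groups.append(mask & 0xFFFFFFFF)
        mask >>= 32
        if mask == 0:
            break
    groups.reverse()

    # Leading group unpadded, remaining groups zero-padded to 8 digits.
    head = format(groups[0], "x")
    tail = [format(g, "08x") for g in groups[1:]]
    return ",".join([head] + tail)


def set_irq_affinity(irq, cpus, proc_root="/proc"):
    """Write the mask for `cpus` to <proc_root>/irq/<irq>/smp_affinity.

    Needs root on a real system; proc_root is a hypothetical knob so
    the function can be pointed at a scratch directory instead.
    """
    path = f"{proc_root}/irq/{irq}/smp_affinity"
    with open(path, "w") as f:
        f.write(cpus_to_affinity_mask(cpus))
    return path
```

Note that this only moves the interrupt.  It does nothing about where
the descriptor rings and driver structures were allocated, which is
the gap described above.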
> Devices using RPS will not generally be able to implement NUMA affinity
> for RX buffer allocation, but there will be a similar issue of processor
> selection for IPIs and NUMA node affinity for driver structures.  The
> proposed interface for setting processor affinity should cover this, but
> it is completely different from the IRQ affinity mechanism for hardware
> multiqueue devices.  That seems undesirable.
>
> Therefore I propose that:
>
> 1. Channels (or NAPI instances) should be exposed in sysfs.
> 2. Channels will have processor affinity, exposed read/write in sysfs.
> Changing this triggers the networking core and driver to reallocate
> associated structures if the processor affinity moved between NUMA
> nodes, and triggers the driver to set IRQ affinity.
> 3. The networking core will set the initial affinity for each channel.
> There may be global settings to control this.
> 4. Drivers should not set IRQ affinity.
> 5. irqbalanced should not set IRQ affinity for multiqueue network
> devices.
>
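Points 1 and 2 of the proposal can be sketched as a small userspace
model.  This is an illustration only, not kernel code: the Channel
class, its set_affinity method, the channel name and the CPU-to-node
table are all hypothetical, and the log entries stand in for work the
networking core and driver would actually do.

```python
from dataclasses import dataclass, field


@dataclass
class Channel:
    """Hypothetical model of one channel: wakeup, RX/TX queues, NAPI."""
    name: str
    cpu: int
    node: int
    log: list = field(default_factory=list)

    def set_affinity(self, cpu, cpu_to_node):
        """Move the channel to `cpu`; reallocate only on a node change."""
        new_node = cpu_to_node[cpu]
        if new_node != self.node:
            # Rings and driver structures sit on the old node, so the
            # affected queues would be stopped, reallocated, restarted.
            self.log.append(f"reallocate on node {new_node}")
            self.node = new_node
        # The IRQ is retargeted in every case, as a consequence of the
        # channel affinity change rather than as a separate setting.
        self.log.append(f"set IRQ affinity to cpu {cpu}")
        self.cpu = cpu


# Assumed topology: CPUs 0-3 on node 0, CPUs 4-7 on node 1.
CPU_TO_NODE = {cpu: 0 if cpu < 4 else 1 for cpu in range(8)}

ch = Channel(name="eth0-0", cpu=0, node=0)
ch.set_affinity(2, CPU_TO_NODE)  # same node: IRQ retarget only
ch.set_affinity(5, CPU_TO_NODE)  # crosses nodes: reallocate first
```

The point of the model is that reallocation is driven by an explicit
affinity write and happens only when the NUMA node changes, so a move
within a node never stops traffic.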
> (Most of this has been proposed already, but I'm trying to bring it all
> together.)
>
> Ben.
>
> --
> Ben Hutchings, Senior Software Engineer, Solarflare Communications
> Not speaking for my employer; that's the marketing department's job.
> They asked us to note that Solarflare product names are trademarked.
>
>