netdev - Re: RFC: issues concerning the next NAPI interface

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20070824085203.42f4305c@freepuppy.rosehill.hemminger.net>
Date:	Fri, 24 Aug 2007 08:52:03 -0700
From:	Stephen Hemminger <shemminger@...ux-foundation.org>
To:	Jan-Bernd Themann <ossthema@...ibm.com>
Cc:	akepner@....com, netdev <netdev@...r.kernel.org>,
	Christoph Raisch <raisch@...ibm.com>,
	Jan-Bernd Themann <themann@...ibm.com>,
	linux-kernel <linux-kernel@...r.kernel.org>,
	linux-ppc <linuxppc-dev@...abs.org>,
	Marcus Eder <meder@...ibm.com>,
	Thomas Klein <tklein@...ibm.com>,
	Stefan Roscher <stefan.roscher@...ibm.com>
Subject: Re: RFC: issues concerning the next NAPI interface

On Fri, 24 Aug 2007 17:47:15 +0200
Jan-Bernd Themann <ossthema@...ibm.com> wrote:

> Hi,
> 
> On Friday 24 August 2007 17:37, akepner@....com wrote:
> > On Fri, Aug 24, 2007 at 03:59:16PM +0200, Jan-Bernd Themann wrote:
> > > .......
> > > 3) On modern systems the incoming packets are processed very fast. Especially
> > >    on SMP systems when we use multiple queues we process only a few packets
> > >    per napi poll cycle. So NAPI does not work very well here and the interrupt 
> > >    rate is still high. What we need would be some sort of timer polling mode 
> > >    which will schedule a device after a certain amount of time for high load 
> > >    situations. With high precision timers this could work well. Current
> > >    usual timers are too slow. A finer granularity would be needed to keep the
> > >    latency down (and queue length moderate).
> > > 
> > 
> > We found the same on ia64-sn systems with tg3 a couple of years 
> > ago. Using simple interrupt coalescing ("don't interrupt until 
> > you've received N packets or M usecs have elapsed") worked 
> > reasonably well in practice. If your h/w supports that (and I'd 
> > guess it does, since it's such a simple thing), you might try 
> > it.
> > 
> 
> I don't see how this should work. Our latest machines are fast enough that they
> simply empty the queue during the first poll iteration (in most cases).
> Even if you wait until X packets have been received, it does not help for
> the next poll cycle. The average number of packets we process per poll queue
> is low. So a timer would be preferable that periodically polls the 
> queue, without the need of generating a HW interrupt. This would allow us
> to wait until a reasonable amount of packets have been received in the meantime
> to keep the poll overhead low. This would also be useful in combination
> with LRO.
> 

You need hardware support for deferred interrupts. Most devices have it (e1000, sky2, tg3)
and it interacts well with NAPI. It is not a generic thing you want done by the stack,
you want the hardware to hold off interrupts until X packets or Y usecs have expired.

The parameters for controlling it are already in ethtool, the issue is finding a good
default set of values for a wide range of applications and architectures. Maybe some
heuristic based on processor speed would be a good starting point. The dynamic irq
moderation stuff is not widely used because it is too hard to get right.

-- 
Stephen Hemminger <shemminger@...ux-foundation.org>
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html