Message-ID: <m1hbvu45jq.fsf@fess.ebiederm.org>
Date:	Wed, 26 Aug 2009 14:40:57 -0700
From:	ebiederm@...ssion.com (Eric W. Biederman)
To:	Francois Romieu <romieu@...zoreil.com>
Cc:	David Dillow <dave@...dillows.org>,
	Michael Riepe <michael.riepe@...glemail.com>,
	Michael Buesch <mb@...sch.de>,
	Rui Santos <rsantos@...popie.com>,
	Michael Büker <m.bueker@...lin.de>,
	linux-kernel@...r.kernel.org, netdev@...r.kernel.org
Subject: Re: [PATCH] r8169: Reduce looping in the interrupt handler.

Francois Romieu <romieu@...zoreil.com> writes:

> Eric W. Biederman <ebiederm@...ssion.com> :
> [...]
>> It is a bit weird but it also means we aren't playing silly games
>> with status inside the loop.  So if we go through the loop we ack
>> everything in status.
>
> I fear we have some longstanding problem anyway :
>
> 1. quiescent state
> 2. packets are received
> 3. rtl8169_interrupt schedules napi, clears IntrStatus and exits
> 4. packets are received and some non-napi event happens
> 5. rtl8169_interrupt wakes up, reads IntrStatus and goes on...
> 6. rtl8169_poll wakes up, processes Rx and Tx napi events and goes on...
> 7. tp->intr_mask still equals ~tp->napi_event : rtl8169_interrupt
>    handler does not even try to schedule napi.
> 8. more packets are received
> 9. rtl8169_interrupt clears IntrStatus
> a. rtl8169_poll reenables napi scheduling, updates IntrMask and exits
> b. rtl8169_interrupt reads a perfectly clean IntrStatus and exits

That would not surprise me.
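
The quoted sequence can be replayed in a small user-space model to
see where the last packet gets stranded.  This is only a sketch:
struct race_model, the EV_* bit values and every helper below are
made up for illustration and are not the driver's code.  It assumes
IntrStatus is write-1-to-clear and that the handler acks exactly the
value it read earlier.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Event bits for this model only; the values are illustrative. */
#define EV_RX       0x0001u
#define EV_TX       0x0004u
#define EV_LINK     0x0020u   /* stands in for "some non-napi event" */
#define NAPI_EVENTS (EV_RX | EV_TX)

/* Hypothetical stand-in for the chip and driver state in the quoted steps. */
struct race_model {
	uint16_t intr_status;   /* models IntrStatus, write-1-to-clear */
	uint16_t sw_intr_mask;  /* plays the role of tp->intr_mask */
	bool napi_scheduled;
	unsigned rx_pending;    /* packets the poll routine has not seen yet */
};

/* The chip latches an event and optionally queues received packets. */
static void hw_event(struct race_model *m, uint16_t ev, unsigned pkts)
{
	m->intr_status |= ev;
	m->rx_pending += pkts;
}

/* Handler: schedule napi only if a napi event is visible through the mask. */
static void irq_maybe_schedule(struct race_model *m, uint16_t status)
{
	if (status & m->sw_intr_mask & NAPI_EVENTS) {
		m->sw_intr_mask = (uint16_t)~NAPI_EVENTS;
		m->napi_scheduled = true;
	}
}

/* Handler: ack the bits that were read earlier. */
static void irq_ack(struct race_model *m, uint16_t status)
{
	m->intr_status &= (uint16_t)~status;
}

/* Poll: consume everything queued so far. */
static void poll_work(struct race_model *m)
{
	m->rx_pending = 0;
}

/* Poll: re-enable napi scheduling and exit. */
static void poll_complete(struct race_model *m)
{
	m->napi_scheduled = false;
	m->sw_intr_mask = 0xffff;
}

/* Replays steps 1 through b in the quoted order and returns the end state. */
struct race_model replay_quoted_sequence(void)
{
	struct race_model m = { 0, 0xffff, false, 0 };  /* 1 */
	uint16_t seen;

	hw_event(&m, EV_RX, 2);             /* 2 */
	seen = m.intr_status;               /* 3 */
	irq_maybe_schedule(&m, seen);
	irq_ack(&m, seen);
	hw_event(&m, EV_RX | EV_LINK, 1);   /* 4 */
	seen = m.intr_status;               /* 5 */
	poll_work(&m);                      /* 6 */
	irq_maybe_schedule(&m, seen);       /* 7: mask hides EV_RX */
	hw_event(&m, EV_RX, 1);             /* 8 */
	irq_ack(&m, seen);                  /* 9: EV_RX goes away with EV_LINK */
	poll_complete(&m);                  /* a */
	seen = m.intr_status;               /* b */
	assert(seen == 0);                  /* "perfectly clean IntrStatus" */
	return m;
}
```

In this model the run ends with one packet still pending, a clear
status register and napi not scheduled, so nothing picks the packet
up until the next event arrives.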

Right now I really don't have much more test bandwidth.  So I tried
for something simple that would address my problem without
fundamentally changing the already tested logic.  I am not seeing any
of the weird corner cases where we get confused.  The changes to fix
that problem are totally killing my ability to use the NIC, because it
loops way too much.

Perhaps we should unconditionally ack everything after changing the 
interrupt mask?  If that would prevent races it sounds like a simple fix.
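
As an untested sketch of that idea, against a made-up register model
rather than the real driver: struct nic_regs and
set_mask_then_ack_all() are hypothetical names, and the status
register is assumed to be write-1-to-clear.  Whether this ordering
actually closes the race is exactly the open question.

```c
#include <stdint.h>

/* Hypothetical register model; not the driver's real accessors. */
struct nic_regs {
	uint16_t intr_status;  /* write-1-to-clear in this model */
	uint16_t intr_mask;
};

/*
 * Change the mask first, then ack whatever is latched at that point.
 * Returns the acked bits so the caller can decide whether there is
 * still work to schedule.
 */
uint16_t set_mask_then_ack_all(struct nic_regs *regs, uint16_t new_mask)
{
	uint16_t pending;

	regs->intr_mask = new_mask;
	pending = regs->intr_status;
	regs->intr_status &= (uint16_t)~pending;  /* models writing pending back */
	return pending;
}
```

Returning the acked bits matters here: an unconditional ack on its
own would throw away the fact that an event was latched, so the
caller has to look at the return value if it cares.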

Eric