lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Mon, 21 Sep 2015 21:52:30 +0100
From:	David Woodhouse <dwmw2@...radead.org>
To:	Francois Romieu <romieu@...zoreil.com>
Cc:	netdev@...r.kernel.org
Subject: Re: [PATCH 1/7] 8139cp: Improve accuracy of cp_interrupt() return,
 to survive IRQ storms

On Mon, 2015-09-21 at 22:25 +0200, Francois Romieu wrote:
> David Woodhouse <dwmw2@...radead.org> :
> > From: David Woodhouse <David.Woodhouse@...el.com>
> > 
> > The TX timeout handling has been observed to trigger RX IRQ storms. And
> > since cp_interrupt() just keeps saying that it handled the interrupt,
> > the machine then dies. Fix the return value from cp_interrupt(), and
> > the offending IRQ gets disabled and the machine survives.
> 
> I am not fond of the way it dissociates the hardware status word and the
> software "handled" variable.

Oh, I like that part very much :)

The practice of returning a 'handled' status only when you've actually
*done* something you expect to mitigate the interrupt, is a useful way
of also protecting against both hardware misbehaviour and software
bugs.

> What you are describing - RX IRQ storms - looks like a problem between
> the irq and poll handlers. That's where I expect the problem to be solved.

I already fixed that, in the next patch in the series. But the failure
mode *should* have been 'IRQ disabled' and the device continuing to
work via polling. Not a complete death of the machine. That's the
difference this patch makes.

> Sprinkling "handled" operations does not make me terribly confortable,
> especially as I'd happily trade the old-style part irq, part napi
> processing for a plain napi processing (I can get over it though :o) ).

The existing cp_rx_poll() function mostly runs without taking cp->lock.
But cp_tx() *does* need cp->lock for the tx_head/tx_tail manipulations.
I'm guessing that's why it's still being called directly from the hard
IRQ handler? I suppose we could probably find a way to move that out.

But I was mostly just trying to fix stuff that was actually broken...
:)

-- 
David Woodhouse                            Open Source Technology Centre
David.Woodhouse@...el.com                              Intel Corporation


Download attachment "smime.p7s" of type "application/x-pkcs7-signature" (5691 bytes)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ