Date:   Mon, 27 Feb 2017 21:08:10 -0500 (EST)
From:   David Miller <davem@...emloft.net>
To:     eric.dumazet@...il.com
Cc:     netdev@...r.kernel.org, tariqt@...lanox.com, saeedm@...lanox.com
Subject: Re: [PATCH v2 net] net: solve a NAPI race

From: Eric Dumazet <eric.dumazet@...il.com>
Date: Mon, 27 Feb 2017 08:44:14 -0800

> Any point doing a napi_schedule() not from device hard irq handler
> is subject to the race for NIC using some kind of edge trigger
> interrupts.
> 
> Since we do not provide a ndo to disable device interrupts, the
> following can happen.

Ok, now I understand.

I think even without considering the race you are trying to solve,
this situation is really dangerous.

I am sure that every ->poll() handler out there was written by an
author who completely assumed that if they are executing then the
device's interrupts for that NAPI instance are disabled.  And this is
with very few, if any, exceptions.

So if we saw a driver doing something like:

	reg->irq_enable ^= value;

after napi_complete_done(), it would be quite understandable.
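
To make the hazard concrete, here is a small user-space sketch, not
driver code.  The names fake_nic, RX_IRQ, rearm_toggle(), rearm_set()
and rx_irq_enabled_after() are all made up for illustration and model
no real device.  It only shows that an XOR-style re-arm is correct
solely when the bit is known to be clear, which is the assumption
described above, while an OR-style re-arm does not depend on it.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a NIC's interrupt-enable register. */
struct fake_nic {
	uint32_t irq_enable;
};

/* Hypothetical bit covering one NAPI instance's RX interrupt. */
#define RX_IRQ 0x1u

/* Re-arm by toggling: only correct if the bit is known to be clear. */
void rearm_toggle(struct fake_nic *reg, uint32_t value)
{
	assert(value != 0);
	reg->irq_enable ^= value;
}

/* Re-arm by setting: same result whether or not the bit was already set. */
void rearm_set(struct fake_nic *reg, uint32_t value)
{
	assert(value != 0);
	reg->irq_enable |= value;
}

/*
 * Start from the given register contents, apply one re-arm strategy,
 * and return 1 if the RX interrupt ends up enabled, 0 otherwise.
 */
int rx_irq_enabled_after(void (*rearm)(struct fake_nic *, uint32_t),
			 uint32_t initial)
{
	struct fake_nic nic = { .irq_enable = initial };

	rearm(&nic, RX_IRQ);
	return (nic.irq_enable & RX_IRQ) != 0;
}
```

Calling rx_irq_enabled_after(rearm_toggle, RX_IRQ) models a ->poll()
that runs while the interrupt is already enabled: the toggle clears the
bit and leaves the interrupt masked.  With rearm_set the bit stays set
in both starting states.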

We really made a mistake taking the napi_schedule() call out of
the domain of the driver, where the driver could manage the interrupt
state properly.

I'm not against your missed bit fix as a short-term cure for now.  It's
just that somewhere down the road we need to manage the interrupt
state properly.
