lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20200312214159.1c41209d@kicinski-fedora-PC1C0HJN>
Date:   Thu, 12 Mar 2020 21:41:59 -0700
From:   Jakub Kicinski <kuba@...nel.org>
To:     Shannon Nelson <snelson@...sando.io>
Cc:     David Miller <davem@...emloft.net>, netdev@...r.kernel.org
Subject: Re: [PATCH net-next 1/7] ionic: tx and rx queues state follows link
 state

On Thu, 12 Mar 2020 17:12:45 -0700 Shannon Nelson wrote:
> On 3/12/20 3:41 PM, David Miller wrote:
> > From: Shannon Nelson <snelson@...sando.io>
> > Date: Thu, 12 Mar 2020 14:50:09 -0700
> >  
> >> +		if (!test_bit(IONIC_LIF_F_UP, lif->state) &&
> >> +		    netif_running(netdev)) {
> >> +			rtnl_lock();
> >> +			ionic_open(netdev);
> >> +			rtnl_unlock();
> >>   		}  
> > You're running into a major problem area here.
> >
> > ionic_open() can fail, particularly because it allocates resources.
> >
> > Yet you are doing this in an operational path that doesn't handle
> > and unwind from errors.
> >
> > You must find a way to do this properly, because the current approach
> > can result in an inoperable interface.  
> 
> I don't see this as much different from how we use it in 
> ionic_reset_queues(), which was modeled after some other drivers' uses 
> of the open call.  In the fw reset case, though, the time between the 
> close and the open is many seconds.

Precedent does not make it any better. You're really pushing the
re-open hack to new levels here :(

Why don't you just unregister the netdev?  30 sec is orders of
magnitude past the point where "no impact to the user" could be
claimed.

FWIW please take a look at NFP, nfp_net_ring_reconfig() and its uses,
to see a better way of handling shutdown and reconfiguration.

> Yes, ionic_open() can fail, and it unwinds its own work.  There isn't 
> anything here in ionic_link_status_check() to unwind, and no one to 
> report the error to, so we don't catch the error here. However, it would 
> be better if I move the addition of the IONIC_LIF_F_TRANS flag and a 
> couple other bits from patch 7 into this patch - I can do that for a v2 
> patchset.

Those layers of "how open is the device _really_" state within drivers
are just a breeding ground for bugs. Already you're doing this:

+		if (!test_bit(IONIC_LIF_F_UP, lif->state) &&
+		    netif_running(netdev)) {
+			rtnl_lock();
+			ionic_open(netdev);
+			rtnl_unlock();

A lot may happen right before rtnl_lock().

You got the UP bit, the RESET bit, and now TRANS bit..

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ