Message-Id: <20170808.181649.1810988621520678130.davem@davemloft.net>
Date:   Tue, 08 Aug 2017 18:16:49 -0700 (PDT)
From:   David Miller <davem@...emloft.net>
To:     jacob.e.keller@...el.com
Cc:     netdev@...r.kernel.org
Subject: Re: [RFC PATCH] net: don't set __LINK_STATE_START until after
 dev->open() call

From: Jacob Keller <jacob.e.keller@...el.com>
Date: Mon,  7 Aug 2017 15:24:21 -0700

> Fix an issue with relying on netif_running(), which could be true while
> the dev->open() handler is being called, even if the handler would exit
> with a failure. This ensures the state is not set and then cleared, which
> leaves a narrow window in which other callers can read the device as open
> when in fact it never finished opening.
> 
> Signed-off-by: Jacob Keller <jacob.e.keller@...el.com>
> ---
> I found this as a result of debugging a race condition in the i40evf
> driver, in which we assumed that netif_running() would not be true until
> after dev->open() had been called and succeeded. Unfortunately we can't
> hold the rtnl_lock() while checking netif_running() because it would
> cause a deadlock between our reset task and our ndo_open handler.
> 
> I am wondering whether the proposed change is acceptable here, or
> whether some ndo_open handlers rely on __LINK_STATE_START being true
> prior to their being called?
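
The ordering described in the quoted text can be sketched with a small userspace model. This is only an illustration under stated assumptions, not the kernel code: `struct sim_dev`, `sim_netif_running()`, `sim_failing_open()`, `sim_open_flag_first()` and `sim_open_flag_after()` are hypothetical names, a plain bool stands in for the __LINK_STATE_START bit, and the concurrent reader is modeled by sampling the flag from inside the open handler.

```c
#include <stdbool.h>

/* Hypothetical stand-in for struct net_device; not the kernel type. */
struct sim_dev {
    bool link_state_start;    /* models the __LINK_STATE_START bit */
    bool reader_saw_running;  /* what a reader sees while open() runs */
    int (*open)(struct sim_dev *dev);
};

/* Models netif_running(): true whenever the start flag is set. */
static bool sim_netif_running(const struct sim_dev *dev)
{
    return dev->link_state_start;
}

/* An open handler that always fails. It records what a reader without
 * the rtnl lock would observe if it checked during the call. */
static int sim_failing_open(struct sim_dev *dev)
{
    dev->reader_saw_running = sim_netif_running(dev);
    return -1;
}

/* Ordering described as the existing behavior: set the flag, call
 * open(), clear the flag again on failure. */
static int sim_open_flag_first(struct sim_dev *dev)
{
    dev->link_state_start = true;
    int ret = dev->open(dev);
    if (ret)
        dev->link_state_start = false;
    return ret;
}

/* Ordering proposed in the RFC: set the flag only after open()
 * has succeeded. */
static int sim_open_flag_after(struct sim_dev *dev)
{
    int ret = dev->open(dev);
    if (!ret)
        dev->link_state_start = true;
    return ret;
}
```

With `sim_failing_open()` installed, `sim_open_flag_first()` leaves `reader_saw_running` true even though the open failed, while `sim_open_flag_after()` leaves it false. In both cases the flag is clear once the call returns.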

I think this has the potential to break a bunch of drivers, but I
cannot prove this.

A lot of drivers have several pieces of state that they set up when
they bring the device up.  These routines are also invoked from other
code paths, such as suspend/resume and PCI-E error recovery, and they
probably make netif_running() calls here and there.
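
The kind of breakage this paragraph worries about can be sketched as follows. This is a hypothetical userspace model, not taken from any real driver: `struct model_dev`, `model_hw_up()`, `model_ndo_open()` and the `queues_armed` field are invented for illustration, and the assumption is that a bring-up helper shared between open, resume, and error recovery guards its work with a netif_running()-style check.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a driver's device state. */
struct model_dev {
    bool link_state_start;  /* models the __LINK_STATE_START bit */
    bool queues_armed;      /* invented piece of driver state */
};

/* Models netif_running(). */
static bool model_netif_running(const struct model_dev *dev)
{
    return dev->link_state_start;
}

/* Shared bring-up helper, assumed to be reused by the open handler,
 * resume, and error recovery. It skips the hardware setup when the
 * interface is not marked as running. */
static void model_hw_up(struct model_dev *dev)
{
    if (!model_netif_running(dev))
        return;
    dev->queues_armed = true;
}

/* Open handler that relies on the shared helper. */
static int model_ndo_open(struct model_dev *dev)
{
    model_hw_up(dev);
    return 0;
}

/* Existing ordering: flag set before the handler runs. */
static int model_open_flag_first(struct model_dev *dev)
{
    dev->link_state_start = true;
    int ret = model_ndo_open(dev);
    if (ret)
        dev->link_state_start = false;
    return ret;
}

/* Proposed ordering: flag set only after the handler succeeds. */
static int model_open_flag_after(struct model_dev *dev)
{
    int ret = model_ndo_open(dev);
    if (!ret)
        dev->link_state_start = true;
    return ret;
}
```

In this model, `model_open_flag_first()` ends with the queues armed, whereas `model_open_flag_after()` reports success and marks the device as running but never arms the queues, because the helper saw the flag still clear. Whether real drivers follow this pattern is exactly the unproven part.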

It has behaved this way for a very long time, so I think the risk is
quite high.
