netdev - Re: [PATCH] Revert "net: linkwatch: add check for netdevice being present to linkwatch_do

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-Id: <20200923.172125.1341776337290371000.davem@davemloft.net>
Date:   Wed, 23 Sep 2020 17:21:25 -0700 (PDT)
From:   David Miller <davem@...emloft.net>
To:     saeed@...nel.org
Cc:     hkallweit1@...il.com, geert+renesas@...der.be,
        f.fainelli@...il.com, andrew@...n.ch, kuba@...nel.org,
        gaku.inami.xh@...esas.com, yoshihiro.shimoda.uh@...esas.com,
        netdev@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] Revert "net: linkwatch: add check for netdevice being
 present to linkwatch_do_dev"

From: Saeed Mahameed <saeed@...nel.org>
Date: Wed, 23 Sep 2020 15:42:17 -0700

> Maybe we need to clear IFF_UP before calling ops->ndo_stop(dev),
> instead of after on __dev_close_many(). Assuming no driver is checking
> IFF_UP state on its own ndo_stop(), other than this, the order
> shouldn't really matter, since clearing the flag and calling ndo_stop()
> should be considered as one atomic operation.

This is my biggest concern, that some ndo_stop, or some helper called
by ndo_stop, checks IFF_UP or similar.

There is also something else.  We have both synchronous and async code
that checks state like IFF_UP and 'present' and makes a decision based
upon that.

If an async code path tests 'present', gets true, and then the RTNL
holding synchronous code path puts the device into D3hot immediately
afterwards, the async code path will still continue and access the
chips registers and fault.

I'm saying all of this because the only way this bug makes sense
is if the ->ndo_stop() sequence that marks the device !present and
then clears IFF_UP runs with the RTNL mutex held, and the code path
that tests this state in the linkwatch bits in question do not.