Message-ID: <a7ff1afd2e1fc2232103ceb9aa763064daf90212.camel@kernel.org>
Date: Wed, 23 Sep 2020 15:42:17 -0700
From: Saeed Mahameed <saeed@...nel.org>
To: Heiner Kallweit <hkallweit1@...il.com>,
David Miller <davem@...emloft.net>
Cc: geert+renesas@...der.be, f.fainelli@...il.com, andrew@...n.ch,
kuba@...nel.org, gaku.inami.xh@...esas.com,
yoshihiro.shimoda.uh@...esas.com, netdev@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH] Revert "net: linkwatch: add check for netdevice being
present to linkwatch_do_dev"
On Wed, 2020-09-23 at 22:44 +0200, Heiner Kallweit wrote:
> On 23.09.2020 22:15, David Miller wrote:
> > From: Heiner Kallweit <hkallweit1@...il.com>
> > Date: Wed, 23 Sep 2020 21:58:59 +0200
> >
> > > On 23.09.2020 20:35, Saeed Mahameed wrote:
> > > > Why would a driver detach the device on ndo_stop() ?
> > > > seems like this is the bug you need to be chasing ..
> > > > which driver is doing this ?
> > > >
> > > Some drivers set the device to PCI D3hot at the end of ndo_stop()
> > > to save power (using e.g. Runtime PM). Marking the device as detached
> > > makes clear to the net core that the device isn't accessible any
> > > longer.
> >
> > That being the case, the problem is that IFF_UP+!present is not a
> > valid netdev state.
> >
> If this combination is invalid, then netif_device_detach() should
> clear IFF_UP? At a first glance this should be sufficient to avoid
> the issue I was dealing with.
>
This feels like a workaround, and it would conflict with the assumption
that netif_device_detach() should only be called when !IFF_UP.

Maybe we need to clear IFF_UP before calling ops->ndo_stop(dev) in
__dev_close_many(), instead of after. Assuming no driver checks IFF_UP
in its own ndo_stop(), the order shouldn't really matter, since clearing
the flag and calling ndo_stop() should be considered one atomic
operation.
> > Is it simply the issue that, upon resume, IFF_UP is marked true before
> > the device is brought out from D3hot state and thus marked as present
> > again?
> >
> I can't really comment on that. The issue I was dealing with at the
> time I submitted this change was about an async linkwatch event
> (caused by powering down the PHY in ndo_stop) trying to access the
> device when it was powered down already.