Message-ID: <20240104081656.67c6030c@kernel.org>
Date: Thu, 4 Jan 2024 08:16:56 -0800
From: Jakub Kicinski <kuba@...nel.org>
To: Heiner Kallweit <hkallweit1@...il.com>
Cc: Stanislaw Gruszka <stanislaw.gruszka@...ux.intel.com>, Johannes Berg
 <johannes@...solutions.net>, netdev@...r.kernel.org, Johannes Berg
 <johannes.berg@...el.com>, Marc MERLIN <marc@...lins.org>, Przemek Kitszel
 <przemyslaw.kitszel@...el.com>
Subject: Re: [PATCH net v3] net: ethtool: do runtime PM outside RTNL

On Thu, 4 Jan 2024 10:05:12 +0100 Heiner Kallweit wrote:
> > If device was not suspended, pm_runtime_get_sync() will increase
> > dev->power.usage_count counter and cancel pending rpm suspend
> > request if any. There is race condition though, more about that
> > below.
> > 
> > If device was suspended, we could not get to igc_open() since it
> > was marked as detached and fail netif_device_present() check in
> > __dev_open(). That was the behaviour before bd869245a3dc.

__dev_open() tries to resume as well, and is also under rtnl_lock.
So that resume call somehow must never happen or users would see
-ENODEV? Sorry for the basic questions, the flow is confusing :S

> > There is a small race window between igc_open() and a scheduled
> > runtime suspend, if dev_open() is called at the same time as
> > dev->power.suspend_timer expires:
> > 
> > open:					pm_suspend_timer_fh:
> > 
> > rtnl_lock()
> > 					rpm_suspend()
> > 					  igc_runtime_suspend()
> > 					   __igc_shutdown()
> > 					     rtnl_lock()
> > 
> > __igc_open()
> >   pm_runtime_get_sync():
> >     waits for rpm suspend callback done
> > 
> > This needs to be addressed, but it's not that this can happen
> > all the time. To trigger this someone has to remove the
> > cable and exactly after 5 seconds do ip link set up. 

Or someone tries to bring the interface up exactly 5 sec after probe?
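
To make the interleaving in the diagram above easier to reason about, here
is a minimal userspace sketch of it as a wait-for graph. It is a model only,
not kernel code: take_rtnl(), rpm_get_sync(), the actor names and both
race_*() helpers are hypothetical stand-ins, and the second ordering is just
an assumption about what "runtime PM outside RTNL" from the subject line
would look like.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only: none of these are kernel APIs. */
enum actor { NOBODY = -1, OPENER = 0, SUSPENDER = 1, NUM_ACTORS = 2 };

struct model {
	enum actor rtnl_owner;            /* holder of the modeled RTNL */
	bool suspend_running;             /* modeled suspend callback in flight */
	enum actor waits_for[NUM_ACTORS]; /* one wait-for edge per actor */
};

static void model_init(struct model *m)
{
	m->rtnl_owner = NOBODY;
	m->suspend_running = false;
	for (int i = 0; i < NUM_ACTORS; i++)
		m->waits_for[i] = NOBODY;
}

/* Try to take the modeled RTNL; record a wait-for edge on contention. */
static bool take_rtnl(struct model *m, enum actor who)
{
	assert(who != NOBODY);
	if (m->rtnl_owner == NOBODY) {
		m->rtnl_owner = who;
		m->waits_for[who] = NOBODY;
		return true;
	}
	m->waits_for[who] = m->rtnl_owner;
	return false;
}

/* Stand-in for pm_runtime_get_sync(): waits while a suspend callback runs. */
static bool rpm_get_sync(struct model *m, enum actor who)
{
	assert(who != NOBODY);
	if (m->suspend_running) {
		m->waits_for[who] = SUSPENDER;
		return false;
	}
	m->waits_for[who] = NOBODY;
	return true;
}

/* A cycle in the wait-for graph means the model is deadlocked. */
static bool deadlocked(const struct model *m)
{
	for (int start = 0; start < NUM_ACTORS; start++) {
		enum actor cur = m->waits_for[start];
		for (int hops = 0; hops < NUM_ACTORS && cur != NOBODY; hops++) {
			if (cur == start)
				return true;
			cur = m->waits_for[cur];
		}
	}
	return false;
}

/* Interleaving from the diagram: RTNL first, then the runtime PM get. */
static bool race_rtnl_then_rpm(void)
{
	struct model m;

	model_init(&m);
	take_rtnl(&m, OPENER);     /* open: rtnl_lock() */
	m.suspend_running = true;  /* timer: suspend callback starts */
	take_rtnl(&m, SUSPENDER);  /* callback: rtnl_lock() blocks */
	rpm_get_sync(&m, OPENER);  /* open: waits for the callback */
	return deadlocked(&m);
}

/* Assumed alternative ordering: runtime PM get before taking RTNL. */
static bool race_rpm_then_rtnl(void)
{
	struct model m;

	model_init(&m);
	m.suspend_running = true;  /* callback already in flight */
	rpm_get_sync(&m, OPENER);  /* open waits, holding no lock */
	take_rtnl(&m, SUSPENDER);  /* callback gets the RTNL ... */
	m.rtnl_owner = NOBODY;     /* ... and releases it */
	m.suspend_running = false; /* callback done */
	rpm_get_sync(&m, OPENER);  /* get now succeeds */
	take_rtnl(&m, OPENER);     /* open: rtnl_lock() */
	return deadlocked(&m);
}
```

In this model race_rtnl_then_rpm() ends with each side waiting on the other,
while race_rpm_then_rtnl() does not, because the opener holds nothing while
it waits for the callback. It says nothing about how often the window is hit
in practice.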

> For me the main question is the following. In igc_resume() you have
> 
> 	rtnl_lock();
> 	if (!err && netif_running(netdev))
> 		err = __igc_open(netdev, true);
> 
> 	if (!err)
> 		netif_device_attach(netdev);
> 	rtnl_unlock();
> 
> Why is the global rtnl_lock() needed here? The netdev is in detached
> state, which protects from e.g. userspace activity, see all the
> netif_device_present() checks in net core.

That'd assume there are no RPM calls outside networking in this driver.
Perhaps there aren't but that also sounds wobbly.
