lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <61041c56-f7d2-49f8-9fc3-57852a96105a@lunn.ch>
Date:   Wed, 12 Apr 2023 19:15:41 +0200
From:   Andrew Lunn <andrew@...n.ch>
To:     Edward Cree <ecree.xilinx@...il.com>
Cc:     edward.cree@....com, linux-net-drivers@....com,
        davem@...emloft.net, kuba@...nel.org, pabeni@...hat.com,
        edumazet@...gle.com, netdev@...r.kernel.org,
        habetsm.xilinx@...il.com, sudheer.mogilappagari@...el.com
Subject: Re: [RFC PATCH v2 net-next 6/7] net: ethtool: add a mutex protecting
 RSS contexts

On Wed, Apr 12, 2023 at 05:16:11PM +0100, Edward Cree wrote:
> On 11/04/2023 21:40, Andrew Lunn wrote:
> > On Tue, Apr 11, 2023 at 07:26:14PM +0100, edward.cree@....com wrote:
> >> While this is not needed to serialise the ethtool entry points (which
> >>  are all under RTNL), drivers may have cause to asynchronously access
> >>  dev->ethtool->rss_ctx; taking dev->ethtool->rss_lock allows them to
> >>  do this safely without needing to take the RTNL.
> > 
> > What is actually wrong with taking RTNL? KISS is often best,
> > especially for locks.
> 
> The examples I have of driver code that needs to access rss_ctx (in the
>  sfc driver) are deep inside call chains where RTNL may or may not
>  already be held.
> 1) filter insertion.  E.g. ethtool -U is already holding RTNL, but other
>  sources of filters (e.g. aRFS) aren't, and thus taking it if necessary
>  might mean passing a 'bool rtnl_locked' all the way down the chain.
> 2) device reset handling (we have to restore the RSS contexts in the
>  hardware after a reset).  Again resets don't always happen under RTNL,
>  and I don't fully understand the details (EFX_ASSERT_RESET_SERIALISED()).
> So it makes life much simpler if we just have a finer-grained lock we can
>  just take when we need to.
> Also, RTNL is a very big hammer that serialises all kinds of operations
>  across all the netdevs in the system, holding it for any length of time
>  can cause annoying user-visible latency (e.g. iirc sshd accepting a new
>  connection has to wait for it) so I prefer to avoid it if possible.  If
>  anything we want to be breaking up this BKL[1], not making it bigger.

Hi Ed

I have to wonder if your locking model is wrong. When i look at the
next patch, i see the driver is also using this lock. And i generally
find that is wrong.

As a rule of thumb, driver writes don't understand locking. Yes, there
are some that do, but most don't. As core code developers, i find it
good practice to have the locks in the core, and only in the
core. Drivers writers should not need to worry about locking. The API
into the driver will take the locks needed before entering the driver,
and release them on exit.

So i don't really agree with 'it makes life much simpler if we just
have a finer-grained lock'. It makes life more complex having to help
driver writers find the corruption and deadlock bugs in their code
because they got the locking wrong.

Please try to work on the abstraction so that drivers don't need this
lock, just the core.

	Andrew

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ