Date:   Wed, 29 Jun 2022 14:30:07 +0300
From:   Ido Schimmel <idosch@...dia.com>
To:     Jiri Pirko <jiri@...nulli.us>
Cc:     netdev@...r.kernel.org, davem@...emloft.net, kuba@...nel.org,
        petrm@...dia.com, pabeni@...hat.com, edumazet@...gle.com,
        mlxsw@...dia.com, saeedm@...dia.com
Subject: Re: [patch net-next RFC 0/2] net: devlink: remove devlink big lock

On Wed, Jun 29, 2022 at 12:36:24PM +0200, Jiri Pirko wrote:
> Wed, Jun 29, 2022 at 12:25:49PM CEST, jiri@...nulli.us wrote:
> >Tue, Jun 28, 2022 at 09:43:26AM CEST, idosch@...dia.com wrote:
> >>On Mon, Jun 27, 2022 at 05:55:06PM +0200, Jiri Pirko wrote:
> >>> Mon, Jun 27, 2022 at 05:41:31PM CEST, idosch@...dia.com wrote:
> >>> >On Mon, Jun 27, 2022 at 03:54:59PM +0200, Jiri Pirko wrote:
> >>> >> From: Jiri Pirko <jiri@...dia.com>
> >>> >> 
> >>> >> This is an attempt to remove use of devlink_mutex. This is a global lock
> >>> >> taken for every user command. That causes that long operations performed
> >>> >> on one devlink instance (like flash update) are blocking other
> >>> >> operations on different instances.
> >>> >
> >>> >This patchset is supposed to prevent one devlink instance from blocking
> >>> >another? Devlink does not enable "parallel_ops", which means that the
> >>> >generic netlink mutex is serializing all user space operations. AFAICT,
> >>> >this series does not enable "parallel_ops", so I'm not sure what
> >>> >difference the removal of the devlink mutex makes.
> >>> 
> >>> You are correct, that is missing. For me, as a side effect this patchset
> >>> resolved the deadlock for LC auxdev you pointed out. That was my
> >>> motivation for this patchset :)
> >>
> >>Given that devlink does not enable "parallel_ops" and that the generic
> >>netlink mutex is held throughout all callbacks, what prevents you from
> >>simply dropping the devlink mutex now? IOW, why can't this series be
> >>patch #1 and another patch that removes the devlink mutex?
> >
> >Yep, I think you are correct. We are currently working with Moshe on
> 
> Okay, I see the problem with what you suggested:
> devlink_pernet_pre_exit()
> There, devlink_mutex is taken to protect against simultaneous cmds
> from being executed. That will be fixed with reload conversion to take
> devlink->lock.

OK, so this lock does not actually protect against simultaneous user
space operations (this is handled by the generic netlink mutex).
Instead, it protects against user space operations during netns
dismantle.

IIUC, the current plan is:

1. Get the devlink->lock rework done. Devlink will hold the lock across
every operation invocation, and drivers will take it via devl_lock()
when calling into devlink.

This means 'DEVLINK_NL_FLAG_NO_LOCK' is removed and the lock will also
be taken in netns dismantle.

2. At this stage, the devlink mutex is only taken in devlink_register()
/ devlink_unregister() and some form of patch #1 will take care of that
so that this mutex can be removed.

3. Enable "parallel_ops"

?
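
To illustrate what the end state is meant to buy, here is a minimal
userspace sketch, not kernel code. It assumes pthread mutexes as
stand-ins for the kernel mutexes; "struct fake_devlink", try_cmd() and
the other helpers are hypothetical names made up for this example.
"big_lock" models devlink_mutex, and with "parallel_ops" disabled the
generic netlink mutex would serialize commands in the same way.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Models the global devlink_mutex that the series wants to remove. */
static pthread_mutex_t big_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical stand-in for struct devlink; only the lock matters here. */
struct fake_devlink {
	pthread_mutex_t lock;	/* models devlink->lock */
	int ops_done;
};

static void fake_devlink_init(struct fake_devlink *dl)
{
	int err = pthread_mutex_init(&dl->lock, NULL);

	assert(err == 0);
	(void)err;
	dl->ops_done = 0;
}

/* The lock a command on @dl must hold under each locking scheme. */
static pthread_mutex_t *cmd_lock(struct fake_devlink *dl, bool use_big_lock)
{
	return use_big_lock ? &big_lock : &dl->lock;
}

/* Run a short command on @dl; return false if it would have to wait. */
static bool try_cmd(struct fake_devlink *dl, bool use_big_lock)
{
	pthread_mutex_t *lock = cmd_lock(dl, use_big_lock);

	if (pthread_mutex_trylock(lock) != 0)
		return false;
	dl->ops_done++;
	pthread_mutex_unlock(lock);
	return true;
}

/*
 * Start a long command (think flash update) on instance A and, while it
 * still holds its lock, attempt a short command on instance B.
 * Returns whether the command on B could proceed.
 */
static bool b_runs_during_long_op_on_a(bool use_big_lock)
{
	struct fake_devlink a, b;
	bool ran;

	fake_devlink_init(&a);
	fake_devlink_init(&b);

	pthread_mutex_lock(cmd_lock(&a, use_big_lock));	/* long op begins */
	ran = try_cmd(&b, use_big_lock);
	pthread_mutex_unlock(cmd_lock(&a, use_big_lock));	/* long op ends */

	pthread_mutex_destroy(&a.lock);
	pthread_mutex_destroy(&b.lock);
	return ran;
}
```

With the global lock, b_runs_during_long_op_on_a(true) reports that the
command on B would have to wait; with per-instance locks,
b_runs_during_long_op_on_a(false) reports that it can run. The sketch
leaves out netns dismantle and register / unregister entirely.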
