Message-ID: <287160b0eba2b2c5e7fe8e1df95ed2ddf077311c.camel@kernel.org>
Date:   Tue, 06 Jul 2021 13:03:12 -0400
From:   Jeff Layton <jlayton@...nel.org>
To:     Luis Henriques <lhenriques@...e.de>,
        Ilya Dryomov <idryomov@...il.com>
Cc:     ceph-devel@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v3 0/2] ceph_check_delayed_caps() softlockup

On Tue, 2021-07-06 at 14:52 +0100, Luis Henriques wrote:
> * changes since v2:
>   - always round the delay with round_jiffies_relative() in function
>     schedule_delayed() (patch 0001)
> 
> This is an attempt to fix the softlockup on the delayed_work workqueue.  As
> stated in patch 0002:
> 
>   Function ceph_check_delayed_caps() is called from the mdsc->delayed_work
>   workqueue and it can be kept looping for quite some time if caps keep being
>   added back to the mdsc->cap_delay_list.  This may result in the watchdog
>   tainting the kernel with the softlockup flag.
> 
> v2 of this fix modifies the approach by time-bounding the loop in this
> function, so that any caps added to the list *after* the loop starts will
> be postponed to the next wq run.
> 
> An extra change in 0001 (suggested by Jeff) allows scheduling runs for
> periods smaller than the default (5 secs) period.  This way,
> delayed_work() can have the next run scheduled for the next list element
> ci->i_hold_caps_max instead of 5 secs.
> 
> This patchset should fix the issue reported here [1], although a quick
> search for "ceph_check_delayed_caps" in the tracker returns a few more
> bugs, possibly duplicates.
> 
> [1] https://tracker.ceph.com/issues/46284
> 
> Luis Henriques (2):
>   ceph: allow schedule_delayed() callers to set delay for workqueue
>   ceph: reduce contention in ceph_check_delayed_caps()
> 
>  fs/ceph/caps.c       | 17 ++++++++++++++++-
>  fs/ceph/mds_client.c | 25 ++++++++++++++++---------
>  fs/ceph/super.h      |  2 +-
>  3 files changed, 33 insertions(+), 11 deletions(-)
> 

Looks good. I'll do some testing with this today and will merge it into
the testing branch if all goes well.

Thanks!
-- 
Jeff Layton <jlayton@...nel.org>
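
For readers following along, the time-bounding idea in the quoted cover
letter can be illustrated with a small user-space sketch. This is not the
code from the patch: struct delay_queue, queue_push() and drain_bounded()
are hypothetical names, the queue is a plain ring buffer rather than
mdsc->cap_delay_list, and time is a simple tick counter rather than
jiffies. The sketch assumes the worst case, where every processed entry is
immediately added back to the list.

```c
#include <assert.h>
#include <stddef.h>

#define QUEUE_CAP 64	/* hypothetical fixed capacity for the sketch */

struct entry {
	unsigned long queued_at;	/* tick at which the entry was added */
};

struct delay_queue {
	struct entry items[QUEUE_CAP];	/* ring buffer storage */
	size_t head;			/* index of the oldest entry */
	size_t len;			/* number of queued entries */
};

/* Append an entry stamped with the current tick. */
static void queue_push(struct delay_queue *q, unsigned long now)
{
	assert(q->len < QUEUE_CAP);
	q->items[(q->head + q->len) % QUEUE_CAP].queued_at = now;
	q->len++;
}

/*
 * Process queued entries, but only those that were already on the list
 * when the loop started. Entries added afterwards stay queued for the
 * next run, so the loop ends even if work keeps being re-added.
 *
 * Plain comparisons are used here; kernel code comparing jiffies would
 * need the wraparound-safe time_before()/time_after() helpers instead.
 *
 * Returns the number of entries processed.
 */
static size_t drain_bounded(struct delay_queue *q, unsigned long *now)
{
	unsigned long loop_start = *now;
	size_t processed = 0;

	while (q->len > 0) {
		struct entry *e = &q->items[q->head];

		/* Added after the loop started: postpone to the next run. */
		if (e->queued_at > loop_start)
			break;

		/* Pop the entry and pretend that handling it takes one tick. */
		q->head = (q->head + 1) % QUEUE_CAP;
		q->len--;
		processed++;
		(*now)++;

		/* Worst case: the entry is added straight back to the list. */
		queue_push(q, *now);
	}
	return processed;
}
```

Without the queued_at check, the loop above would never terminate under
this workload, which is the user-space analogue of the softlockup. With
the check, one run handles exactly the entries that were present when it
started and leaves the re-added ones for the next run.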
