Message-ID: <20211123051612.GA4009@hsiangkao-HP-ZHAN-66-Pro-G1>
Date:   Tue, 23 Nov 2021 13:16:13 +0800
From:   Gao Xiang <xiang@...nel.org>
To:     Huang Jianan <huangjianan@...o.com>,
        Jianhua1 Hao 郝建华 <haojianhua1@...omi.com>
Cc:     "xiang@...nel.org" <xiang@...nel.org>,
        linux-erofs <linux-erofs@...ts.ozlabs.org>,
        linux-kernel <linux-kernel@...r.kernel.org>,
        chao <chao@...nel.org>, guoweichao@...o.com, guanyuwei@...o.com,
        "yh@...o.com" <yh@...o.com>, zhangshiming@...o.com
Subject: Re: [PATCH] erofs: Deadlock caused by kswap work in low memory
 scenarios

Hi Jianan and Jianhua,

On Tue, Nov 23, 2021 at 11:58:32AM +0800, Huang Jianan wrote:
> On 2021/11/23 10:59, Jianhua1 Hao 郝建华 via Linux-erofs wrote:
> > We also found that it is easy to cause a deadlock in the kswap scenario. We
> > observed the following deadlock in a stress test under a low-memory
> > scenario, the same as in "erofs: fix deadlock when shrink erofs slab".
> > 
> > Thread A:
> >   erofs_try_to_release_workgroup(grp = 0xFFFFFF87ADFEE610)
> >     erofs_workgroup_try_to_freeze(grp, 1)
> >       // set ref count to EROFS_LOCKED_MAGIC
> >       atomic_cmpxchg(&grp->refcount, val, EROFS_LOCKED_MAGIC)
> >     xa_erase(&sbi->managed_pslots, grp->index)
> >       // stuck there to wait for xa lock, already held by thread B
> >       xa_lock(xa);
> >
> > Thread B:
> >   erofs_insert_workgroup()
> >     // xa lock is held here
> >     xa_lock(&sbi->managed_pslots);
> >     pre = __xa_cmpxchg(&sbi->managed_pslots, grp->index, NULL, grp, GFP_NOFS);
> >     erofs_workgroup_get(pre) // pre = grp = 0xFFFFFF87ADFEE610
> >       erofs_wait_on_workgroup_freezed(grp);
> >         // wait ref count to be unlocked, which should be done by thread A
> >         atomic_cond_read_relaxed(&grp->refcount, VAL != EROFS_LOCKED_MAGIC);
> > 
> > Follow-up fix: do we need to hold the xa lock before freezing the workgroup,
> > because we will operate on the xarray?
> > 
> Hi Jianhua,
> 
> The fix is in this patch; please test it if you are able to.
> https://lore.kernel.org/linux-erofs/YZcJpDs3FKpSfzAE@B-P7TQMD6M-0146/T/#t

Thanks for the report, I had some other work to do just now.

I've pushed this patch out to the fixes branch and will send it to Linus
this week:
https://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs.git/commit/?id=deccd444d2844f1e89314dfc3956cccfdb813b65

As Jianan said, I believe this patch can fix your issue, so please give
it a try in advance. Also, the issue doesn't affect v4.19 and v5.4 LTS;
only v5.10 and v5.15 LTS are impacted.
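
For readers following the lock ordering discussed above, here is a minimal
userspace sketch of the idea "take the xa lock before freezing the
workgroup". It is only a model under stated assumptions, not the kernel
patch: a pthread mutex stands in for the xa lock, a single pointer stands in
for one managed_pslots entry, and the names slot_lock, slot, LOCKED_MAGIC,
try_release(), insert_or_get() and run_model() are all hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

#define LOCKED_MAGIC (-1)       /* hypothetical stand-in for EROFS_LOCKED_MAGIC */

struct workgroup { atomic_int refcount; };

/* Hypothetical stand-ins for the xa lock and one managed_pslots entry. */
static pthread_mutex_t slot_lock = PTHREAD_MUTEX_INITIALIZER;
static struct workgroup *slot;  /* protected by slot_lock */

/* Thread A side: the lock is taken BEFORE freezing, so the frozen state
 * is never visible to anyone who holds slot_lock. */
static int try_release(struct workgroup *grp)
{
    int expected = 1, released = 0;

    pthread_mutex_lock(&slot_lock);
    if (slot == grp &&
        atomic_compare_exchange_strong(&grp->refcount, &expected, LOCKED_MAGIC)) {
        slot = NULL;                      /* erase while already holding the lock */
        atomic_store(&grp->refcount, 1);  /* model only: recycle instead of freeing */
        released = 1;
    }
    pthread_mutex_unlock(&slot_lock);
    return released;
}

/* Thread B side: insert grp if the slot is empty, then take a reference. */
static struct workgroup *insert_or_get(struct workgroup *grp)
{
    struct workgroup *pre;

    pthread_mutex_lock(&slot_lock);
    if (!slot)
        slot = grp;
    pre = slot;
    /* Freezing only happens under slot_lock, which we hold, so there is
     * nothing to wait for here. */
    assert(atomic_load(&pre->refcount) != LOCKED_MAGIC);
    atomic_fetch_add(&pre->refcount, 1);
    pthread_mutex_unlock(&slot_lock);
    return pre;
}

struct model_args { struct workgroup *grp; int iterations; };

static void *releaser(void *p)
{
    struct model_args *a = p;
    for (int i = 0; i < a->iterations; i++)
        try_release(a->grp);
    return NULL;
}

static void *inserter(void *p)
{
    struct model_args *a = p;
    for (int i = 0; i < a->iterations; i++) {
        struct workgroup *g = insert_or_get(a->grp);
        atomic_fetch_sub(&g->refcount, 1);  /* drop the reference again */
    }
    return NULL;
}

/* Runs both sides concurrently; returns 0 if both threads finished and the
 * reference count is back at its initial value of 1. */
int run_model(int iterations)
{
    static struct workgroup grp;
    struct model_args args = { &grp, iterations };
    pthread_t a, b;

    atomic_init(&grp.refcount, 1);
    slot = NULL;
    if (pthread_create(&a, NULL, releaser, &args))
        return -1;
    if (pthread_create(&b, NULL, inserter, &args)) {
        pthread_join(a, NULL);
        return -1;
    }
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return atomic_load(&grp.refcount) == 1 ? 0 : -1;
}
```

In this model the circular wait from the report cannot form: the releaser
never holds a frozen refcount while waiting for the lock, and the inserter
never waits on a frozen refcount while holding it.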

Thanks for your report!
Gao Xiang
