Date:   Fri, 26 Jan 2018 12:12:00 +0900
From:   Tetsuo Handa <penguin-kernel@...ove.sakura.ne.jp>
To:     Eric Wheeler <linux-mm@...ts.ewheeler.net>
Cc:     Tetsuo Handa <penguin-kernel@...ove.SAKURA.ne.jp>,
        mhocko@...nel.org, hannes@...xchg.org, minchan@...nel.org,
        ying.huang@...el.com, mgorman@...hsingularity.net,
        vdavydov.dev@...il.com, akpm@...ux-foundation.org,
        shakeelb@...gle.com, gthelen@...gle.com, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH 1/2] mm,vmscan: Kill global shrinker lock.

Eric Wheeler wrote:
> Hi Tetsuo,
> 
> Thank you for looking into this!
> 
> I tried running this C program in 4.14.15 but did not get a deadlock, just 
> OOM kills. Is the patch required to induce the deadlock?

This reproducer is not expected to trigger an actual deadlock. Running it
with this patch applied causes a lockdep warning. I was only trying to suggest
the possibility that suddenly making shrink_slab() a no-op might cause
unexpected results. We still don't know what is happening in your case.

> 
> Also, what are you doing to XFS to make it trigger?

Nothing.



Would you answer Michal's questions

  Is this a permanent state or does the holder eventually release the lock?

  Do you remember the last good kernel?

and my guess

  Since commit 0bcac06f27d75285 was not backported to the 4.14-stable kernel,
  this is unlikely to be a bug introduced by 0bcac06f27d75285 unless Eric
  explicitly backported 0bcac06f27d75285.

?

Can you take SysRq-t (e.g. "echo t > /proc/sysrq-trigger") when the processes
get stuck? I think we need to know what other threads are doing while
__lock_page() is waiting, in order to distinguish "somebody forgot to unlock
the page" from "somebody is still doing something (e.g. waiting for memory
allocation) in order to unlock the page".
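
For reference, capturing the dump could look roughly like the sketch below.
It is only an illustration: the function name capture_task_dump and the
overridable SYSRQ_ENABLE / SYSRQ_TRIGGER variables are hypothetical (added so
the steps can be dry-run against ordinary files), and it assumes a root shell
and that the kernel log buffer is large enough to hold the whole task list.

```shell
#!/bin/sh
# Sketch: take a SysRq-t task dump and save the kernel log to a file.
# The two path variables are illustrative; the defaults are the real procfs files.
SYSRQ_ENABLE=${SYSRQ_ENABLE:-/proc/sys/kernel/sysrq}
SYSRQ_TRIGGER=${SYSRQ_TRIGGER:-/proc/sysrq-trigger}

capture_task_dump() {
    out=$1
    # Enable all SysRq functions (needs root).
    echo 1 > "$SYSRQ_ENABLE" || return 1
    # Ask the kernel to print the state of every task to the kernel log.
    echo t > "$SYSRQ_TRIGGER" || return 1
    # Save the kernel ring buffer, which now contains the task dump.
    dmesg > "$out"
}

# Example (as root, while the processes are stuck):
#   capture_task_dump /tmp/sysrq-t.txt
```

If the dump is truncated, the ring buffer was probably too small; booting
with a larger log_buf_len= or capturing via a serial console or netconsole
may help.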

If you can take SysRq-t, taking it with the patch at
http://lkml.kernel.org/r/1510833448-19918-1-git-send-email-penguin-kernel@I-love.SAKURA.ne.jp
applied and the kernel built with CONFIG_DEBUG_SHOW_MEMALLOC_LINE=y should give
us more clues (e.g. how long threads have been waiting for memory allocation).
