Message-ID: <alpine.LSU.2.11.1702161702490.24224@eggly.anvils>
Date:   Thu, 16 Feb 2017 17:46:44 -0800 (PST)
From:   Hugh Dickins <hughd@...gle.com>
To:     Tim Chen <tim.c.chen@...ux.intel.com>
cc:     Hugh Dickins <hughd@...gle.com>,
        "Huang, Ying" <ying.huang@...el.com>,
        Minchan Kim <minchan@...nel.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        linux-kernel@...r.kernel.org, linux-mm@...ck.org
Subject: Re: swap_cluster_info lockdep splat

On Thu, 16 Feb 2017, Tim Chen wrote:
> 
> > I do not understand your zest for putting wrappers around every little
> > thing, making it all harder to follow than it need be.  Here's the patch
> > I've been running with (but you have a leak somewhere, and I don't have
> > time to search out and fix it: please try sustained swapping and swapoff).
> > 
> 
> Hugh, trying to duplicate your test case.  So you were doing swapping,
> then swap off, swap on the swap device and restart swapping?

A repeated pair of make -j20 kernel builds in 700M RAM, 1.5G swap on SSD,
8 cpus; one of the builds in tmpfs, the other in ext4 on a loop device over
a tmpfs file; sizes tuned for plenty of swapping but no OOMing (it's an
ancient 2.6.24 kernel that I build, since a modern one needs a lot more
space with a lot less of it in use).
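
Very roughly, the setup amounts to something like the following (an
illustrative sketch only: the mount points, sizes and swap device name are
placeholders, and the 700M of RAM comes from booting with mem=700M or from
the VM config):

  # mount points, sizes and the swap device below are placeholders
  mkdir -p /mnt/tmp1 /mnt/tmp2 /mnt/ext4

  # the 1.5G swap partition on the SSD
  mkswap /dev/sdb1
  swapon /dev/sdb1

  # build tree 1: directly on tmpfs
  mount -t tmpfs -o size=1g tmpfs /mnt/tmp1

  # build tree 2: ext4 on a loop device backed by a file on tmpfs
  mount -t tmpfs -o size=1g tmpfs /mnt/tmp2
  dd if=/dev/zero of=/mnt/tmp2/ext4.img bs=1M count=900
  mkfs.ext4 -F /mnt/tmp2/ext4.img
  mount -o loop /mnt/tmp2/ext4.img /mnt/ext4

  # a 2.6.24 source tree unpacked into each
  tar xf linux-2.6.24.tar -C /mnt/tmp1
  tar xf linux-2.6.24.tar -C /mnt/ext4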

How much of that is relevant I don't know: hopefully none of it, but it's
hard to get the tunings right from scratch.  To answer your specific
question: yes, that's the sequence; I'm not doing concurrent swapoffs in
the test that shows the leak, just waiting for each pair of builds to
complete, tearing down the trees, doing swapoff followed by swapon, and
then starting a new pair of builds.
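
In shell terms, each iteration is roughly the loop below (again only a
sketch, with the same placeholder paths and swap device as above; the
.config setup and the exact way the trees get re-extracted don't much
matter):

  while :; do
      (cd /mnt/tmp1/linux-2.6.24 && make -j20 >../build1.log 2>&1) &
      (cd /mnt/ext4/linux-2.6.24 && make -j20 >../build2.log 2>&1) &
      wait                          # both builds complete before teardown

      rm -rf /mnt/tmp1/linux-2.6.24 /mnt/ext4/linux-2.6.24
      tar xf linux-2.6.24.tar -C /mnt/tmp1       # re-extract fresh trees
      tar xf linux-2.6.24.tar -C /mnt/ext4       # (.config step omitted)

      swapoff /dev/sdb1             # this is where ENOMEM sometimes shows up
      swapon /dev/sdb1
  done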

Sometimes it's the swapoff that fails with ENOMEM, more often it's a
fork during a build that fails with ENOMEM, typically after 6 or 7 hours
of load (though timings show it getting slower in the lead-up to that).
/proc/meminfo did not give me an immediate clue; Slab didn't look
surprising, but I may not have studied it closely enough.

I quilt-bisected it as far as the mm-swap series (good before, bad
after), but didn't manage to narrow it down further because I hit a
presumably different issue inside the series, where swapoff ENOMEMed
much sooner (after 25 minutes one time, during the first iteration the
next time).

Hugh
