linux-kernel - Re: [PATCH v7 00/10] per lruvec lru

Message-ID: <alpine.LSU.2.11.2001131157500.1084@eggly.anvils>
Date:   Mon, 13 Jan 2020 12:20:48 -0800 (PST)
From:   Hugh Dickins <hughd@...gle.com>
To:     Alex Shi <alex.shi@...ux.alibaba.com>
cc:     Hugh Dickins <hughd@...gle.com>, hannes@...xchg.org,
        Andrew Morton <akpm@...ux-foundation.org>,
        cgroups@...r.kernel.org, linux-kernel@...r.kernel.org,
        linux-mm@...ck.org, mgorman@...hsingularity.net, tj@...nel.org,
        khlebnikov@...dex-team.ru, daniel.m.jordan@...cle.com,
        yang.shi@...ux.alibaba.com, willy@...radead.org,
        shakeelb@...gle.com
Subject: Re: [PATCH v7 00/10] per lruvec lru_lock for memcg

On Mon, 13 Jan 2020, Alex Shi wrote:
> On 2020/1/13 at 4:48 PM, Hugh Dickins wrote:
> > 
> > I (Hugh) tried to test it on v5.5-rc5, but did not get very far at all -
> > perhaps because my particular interest tends towards tmpfs and swap,
> > and swap always made trouble for lruvec lock - one of the reasons why
> > our patches were more complicated than you thought necessary.
> > 
> > Booted a smallish kernel in mem=700M with 1.5G of swap, with intention
> > of running small kernel builds in tmpfs and in ext4-on-loop-on-tmpfs
> > (losetup was the last command started but I doubt it played much part):
> > 
> > mount -t tmpfs -o size=470M tmpfs /tst
> > cp /dev/zero /tst
> > losetup /dev/loop0 /tst/zero
> 
> Hi Hugh,
> 
> Many thanks for the testing!
> 
> I am trying to reproduce your test: I ran the 3 steps above, then built a
> kernel with 'make -j 8' in my qemu guest. But I cannot reproduce the problem
> with this v7 version or with the v8 version,
> https://github.com/alexshi/linux/tree/lru-next, which fixed the bug KK
> mentioned, like the following.
> My qemu VM looks like this:
> 
> [root@...ug010000002015 ~]# mount -t tmpfs -o size=470M tmpfs /tst
> [root@...ug010000002015 ~]# cp /dev/zero /tst
> cp: error writing ‘/tst/zero’: No space left on device
> cp: failed to extend ‘/tst/zero’: No space left on device
> [root@...ug010000002015 ~]# losetup /dev/loop0 /tst/zero
> [root@...ug010000002015 ~]# cat /proc/cmdline
> earlyprintk=ttyS0 root=/dev/sda1 console=ttyS0 debug crashkernel=128M printk.devkmsg=on
> 
> My kernel is configured with MEMCG/MEMCG_SWAP, with an xfs root image, and
> I compile the kernel under ext4. Could you share your kernel config and
> detailed reproduction steps with me? And would you try my new version from
> the github link above at your convenience?

I tried with the mods you had appended, from [PATCH v7 02/10]
discussion with Konstantin: no, still crashes in a similar way.

Does your github tree have other changes too?  I see it says "Latest
commit e05d0dd 22 days ago", which doesn't seem to fit.  Afraid I
don't have time to test many variations.

It looks like, in my case, systemd was usually jumping in and doing
something with shmem (perhaps via memfd) that read back from swap
and triggered the crash without any further intervention from me.

So please try booting with mem=700M and 1.5G swap,
mount -t tmpfs -o size=470M tmpfs /tst
cp /dev/zero /tst; cp /tst/zero /dev/null

That's enough to crash it for me, without getting into any losetup or
systemd complications. But you might have to adjust the numbers to be
sure of writing out and reading back from swap.
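
As a rough sketch of how one might check whether a run really did write out to swap, the steps above can be wrapped with a before/after reading of /proc/meminfo. The helper names swap_used_kb and repro are hypothetical (they are not from this thread), and creating /tst with mkdir is an assumption about the test machine. It is meant for a disposable test box booted with mem=700M and about 1.5G of swap, run as root.

```shell
#!/bin/sh
# Sketch only: swap_used_kb and repro are hypothetical helper names.

# Print swap currently in use, in kB, from a /proc/meminfo-style file.
# The optional argument exists so the parser can be tried on a sample file.
swap_used_kb() {
    awk '/^SwapTotal:/ { total = $2 }
         /^SwapFree:/  { free = $2 }
         END { print total - free }' "${1:-/proc/meminfo}"
}

# Run the steps from the message and report how much swap was consumed by
# filling the tmpfs. Nothing here runs until repro is called explicitly.
repro() {
    before=$(swap_used_kb)
    mkdir -p /tst                      # assumption: /tst may not exist yet
    mount -t tmpfs -o size=470M tmpfs /tst || return 1
    cp /dev/zero /tst                  # stops with "No space left on device"
    filled=$(swap_used_kb)
    cp /tst/zero /dev/null             # read back, pulling pages in from swap
    echo "swap in use: before=${before} kB, after fill=${filled} kB"
}
```

If the two figures printed by repro are nearly equal, the tmpfs contents probably never reached swap, and the mem= or size= numbers would need adjusting before the read-back step can exercise swap-in.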

It's swap to SSD in my case, don't think that matters. I happen to
run with swappiness 100 (precisely to help generate swap problems),
but swappiness 60 is good enough to get these crashes.
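
For reference, a minimal sketch of reading and setting swappiness through the standard vm.swappiness sysctl. The function names are hypothetical, and the optional path argument to get_swappiness is only there so the reader can be tried against a sample file.

```shell
#!/bin/sh
# Sketch only: get_swappiness and set_swappiness are hypothetical helper names.

# Print the current swappiness value (defaults to the live kernel setting).
get_swappiness() {
    cat "${1:-/proc/sys/vm/swappiness}"
}

# Set swappiness for the running kernel; needs root and does not persist
# across a reboot.
set_swappiness() {
    sysctl -w "vm.swappiness=$1"
}
```

For example, `set_swappiness 100` would match the setting described above, and `set_swappiness 60` the other value mentioned.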

Hugh