linux-kernel - Re: [PATCH 0/9] Avoid populating unbounded num of ptes with mmap

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite for Android: free password hash cracker in your pocket

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <CANN689G-+Dns7BEJVG1SNO_CYA1vCEhiyf7F90sKYPvrNsXN9w@mail.gmail.com>
Date:	Sat, 22 Dec 2012 01:37:25 -0800
From:	Michel Lespinasse <walken@...gle.com>
To:	Andy Lutomirski <luto@...capital.net>
Cc:	Ingo Molnar <mingo@...nel.org>, Al Viro <viro@...iv.linux.org.uk>,
	Hugh Dickins <hughd@...gle.com>, Jorn_Engel <joern@...fs.org>,
	Rik van Riel <riel@...hat.com>,
	Andrew Morton <akpm@...ux-foundation.org>, linux-mm@...ck.org,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH 0/9] Avoid populating unbounded num of ptes with mmap_sem held

On Fri, Dec 21, 2012 at 6:16 PM, Andy Lutomirski <luto@...capital.net> wrote:
> On Fri, Dec 21, 2012 at 5:59 PM, Michel Lespinasse <walken@...gle.com> wrote:
>> Could you share your test case so I can try reproducing the issue
>> you're seeing ?
>
> Not so easy.  My test case is a large chunk of a high-frequency
> trading system :)

Huh, its probably better if I don't see it then :)

> I just tried it again.  Not I have a task stuck in
> mlockall(MCL_CURRENT|MCL_FUTURE).  The stack is:
>
> [<0000000000000000>] flush_work+0x1c2/0x280
> [<0000000000000000>] schedule_on_each_cpu+0xe3/0x130
> [<0000000000000000>] lru_add_drain_all+0x15/0x20
> [<0000000000000000>] sys_mlockall+0x125/0x1a0
> [<0000000000000000>] tracesys+0xd0/0xd5
> [<0000000000000000>] 0xffffffffffffffff
>
> The sequence of mmap and munmap calls, according to strace, is:
>
[...]
> 6084  mmap(0x7f54fd02a000, 6776, PROT_READ|PROT_WRITE,
> MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7f54fd02a000

So I noticed you use mmap with a size that is not a multiple of
PAGE_SIZE. This is perfectly legal, but I hadn't tested that case, and
lo and behold, it's something I got wrong. Patch to be sent as a reply
to this. Without this patch, vm_populate() will show a debug message
if you have CONFIG_DEBUG_VM set, and likely spin in an infinite loop
if you don't.

> 6084  mmap(NULL, 26258, PROT_READ, MAP_SHARED, 4, 0) = 0x7f5509f9d000
> 6084  mmap(NULL, 4096, PROT_READ|PROT_WRITE,
> MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f5509f9c000
> 6084  munmap(0x7f5509f9c000, 4096)      = 0
> 6084  mlockall(MCL_CURRENT|MCL_FUTURE
>
> This task is unkillable.  Two other tasks are stuck spinning.

Now I'm confused, because:

1- your trace shows the hang occurs during mlockall(), and this code
really wasn't touched much in my series (besides renaming
do_mlock_pages into __mm_populate())

2- the backtrace above showed sys_mlockall() -> lru_add_drain_all(),
which is the very beginning of mlockall(), before anything of
importance happens (and in particular, before the MCL_FUTURE flag
takes action). So, I'm going to assume that it's one of the other
spinning threads that is breaking things. If one of the spinning
threads got stuck within vm_populate(), this could even be explained
by the bug I mentioned above.

Could you check if the fix I'm going to send as a reply to this works
for you, and if not, where the two spinning threads are being stuck ?

-- 
Michel "Walken" Lespinasse
A program is never fully debugged until the last user dies.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/