linux-kernel - Re: FYI: mmap

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Message-ID: <1278585009.1900.31.camel@laptop>
Date:	Thu, 08 Jul 2010 12:30:09 +0200
From:	Peter Zijlstra <peterz@...radead.org>
To:	Michel Lespinasse <walken@...gle.com>
Cc:	linux-mm <linux-mm@...ck.org>,
	KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
	LKML <linux-kernel@...r.kernel.org>,
	Divyesh Shah <dpshah@...gle.com>, Ingo Molnar <mingo@...e.hu>
Subject: Re: FYI: mmap_sem OOM patch

On Wed, 2010-07-07 at 16:11 -0700, Michel Lespinasse wrote:
> What happens is we end up with a single thread in the oom loop (T1)
> that ends up killing a sibling thread (T2).  That sibling thread will
> need to acquire the read side of the mmap_sem in the exit path.  It's
> possible however that yet a different thread (T3) is in the middle of
> a virtual address space operation (mmap, munmap) and is enqueue to
> grab the write side of the mmap_sem behind yet another thread (T4)
> that is stuck in the OOM loop (behind T1) with mmap_sem held for read
> (like allocating a page for pagecache as part of a fault.
> 
>       T1              T2              T3              T4
>       .               .               .               .
>    oom:               .               .               .
>    oomkill            .               .               .
>       ^    \          .               .               .
>      /|\    ---->  do_exit:           .               .
>       |            sleep in           .               .
>       |            read(mmap_sem)     .               .
>       |                     \         .               .
>       |                      ----> mmap               .
>       |                            sleep in           .
>       |                            write(mmap_sem)    .
>       |                                     \         .
>       |                                      ----> fault
>       |                                            holding read(mmap_sem)
>       |                                            oom
>       |                                               |
>       |                                               /
>       \----------------------------------------------/ 

So what you do is use recursive locking to side-step a deadlock.
Recursive locking is poor taste and leads to very ill defined locking
rules.

One way to fix this is to have T4 wake from the oom queue and return an
allocation failure instead of insisting on going oom itself when T1
decides to take down the task.


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/