linux-kernel - Re: [PATCH 0/9] Avoid populating unbounded num of ptes with mmap


Message-ID: <CALCETrVGvfm2VHUaVNDg40U4dbsRmriW7GfRnfpHGihG9v1=Uw@mail.gmail.com>
Date:	Fri, 4 Jan 2013 10:16:21 -0800
From:	Andy Lutomirski <luto@...capital.net>
To:	Michel Lespinasse <walken@...gle.com>
Cc:	Ingo Molnar <mingo@...nel.org>, Al Viro <viro@...iv.linux.org.uk>,
	Hugh Dickins <hughd@...gle.com>, Jorn_Engel <joern@...fs.org>,
	Rik van Riel <riel@...hat.com>,
	Andrew Morton <akpm@...ux-foundation.org>, linux-mm@...ck.org,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH 0/9] Avoid populating unbounded num of ptes with mmap_sem held

On Thu, Dec 20, 2012 at 4:49 PM, Michel Lespinasse <walken@...gle.com> wrote:
> We have many vma manipulation functions that are fast in the typical case,
> but can optionally be instructed to populate an unbounded number of ptes
> within the region they work on:
> - mmap with MAP_POPULATE or MAP_LOCKED flags;
> - remap_file_pages() with MAP_NONBLOCK not set or when working on a
>   VM_LOCKED vma;
> - mmap_region() and all its wrappers when mlock(MCL_FUTURE) is in effect;
> - brk() when mlock(MCL_FUTURE) is in effect.
>
> Current code handles these pte operations locally, while the surrounding
> code has to hold the mmap_sem write side since it's manipulating vmas.
> This means we're doing an unbounded amount of pte population work with
> mmap_sem held, and this causes problems as Andy Lutomirski reported
> (we've hit this at Google as well, though it's not entirely clear why
> people keep trying to use mlock(MCL_FUTURE) in the first place).
>
> I propose introducing a new mm_populate() function to do this pte
> population work after the mmap_sem has been released. mm_populate()
> does need to acquire the mmap_sem read side, but critically, it
> doesn't need to hold it continuously for the entire duration of the
> operation - it can drop it whenever things take too long (such as when
> hitting disk for a file read) and re-acquire it later on.
>
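
For readers following along, the drop-and-re-acquire idea in the quoted text can be sketched in userspace. This is only an analogy under stated assumptions, not the kernel's mm_populate(): a pthread rwlock stands in for mmap_sem, and `map_lock`, `CHUNK` and `populate_in_chunks()` are hypothetical names made up for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for mmap_sem; this is not the kernel's rwsem. */
static pthread_rwlock_t map_lock = PTHREAD_RWLOCK_INITIALIZER;

/* Hypothetical batch size: entries handled per read-lock hold. */
#define CHUNK 16

/*
 * Mark npages entries of pages[] as populated, taking the read lock
 * once per batch instead of once for the whole range.
 * Returns the number of times the lock was acquired.
 */
static size_t populate_in_chunks(unsigned char *pages, size_t npages)
{
    size_t acquisitions = 0;
    size_t done = 0;

    assert(pages != NULL || npages == 0);

    while (done < npages) {
        size_t end = done + CHUNK;
        if (end > npages)
            end = npages;

        pthread_rwlock_rdlock(&map_lock);
        acquisitions++;
        for (size_t i = done; i < end; i++)
            pages[i] = 1;   /* stand-in for faulting in one page */
        pthread_rwlock_unlock(&map_lock);

        /* The lock is free here, so a waiting writer can make progress. */
        done = end;
    }
    return acquisitions;
}
```

The point of the batching is that the longest single hold is bounded by the batch size rather than by the size of the region.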

I still have quite a few instances of 2-6 ms of latency due to
"call_rwsem_down_read_failed __do_page_fault do_page_fault
page_fault".  Any idea why?  I don't know any great way to figure out
who is holding mmap_sem at the time.  Given what my code is doing, I
suspect the contention is due to mmap or munmap on a file.  MCL_FUTURE
is set, and MAP_POPULATE is not set.
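
Roughly, the setup described above looks like the following userspace sketch. It is not my actual code: the helper names (`lock_future_mappings()`, `make_backing_file()`, `map_file_no_populate()`) are hypothetical and the temporary file is only there to have something to map.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Ask the kernel to lock all future mappings of this process.
 * Returns 0 on success, -1 on failure (for example when
 * RLIMIT_MEMLOCK does not allow it).
 */
static int lock_future_mappings(void)
{
    if (mlockall(MCL_FUTURE) != 0) {
        perror("mlockall(MCL_FUTURE)");
        return -1;
    }
    return 0;
}

/*
 * Hypothetical helper: create an unlinked temporary file of len bytes.
 * Returns a file descriptor, or -1 on failure.
 */
static int make_backing_file(size_t len)
{
    FILE *f = tmpfile();
    if (f == NULL)
        return -1;

    int fd = dup(fileno(f));    /* keep the file alive after fclose() */
    fclose(f);
    if (fd < 0)
        return -1;

    if (ftruncate(fd, (off_t)len) != 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/*
 * Map len bytes of fd read-only, without MAP_POPULATE.
 * If MCL_FUTURE is in effect, the mapping is still locked and
 * populated as part of the mmap() call.
 * Returns the mapping, or NULL on failure.
 */
static void *map_file_no_populate(int fd, size_t len)
{
    assert(fd >= 0);
    void *p = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
    return p == MAP_FAILED ? NULL : p;
}
```

A process would call lock_future_mappings() once at startup and then map and unmap files from one thread while other threads take ordinary page faults.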

It could be the other thread calling mmap and getting preempted (or
otherwise calling schedule()).  Grr.

--Andy
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/