linux-kernel - Re: [RFC v3 PATCH 4/5] mm: mmap: zap pages with read mmap

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Message-ID: <20180704081347.GG22503@dhcp22.suse.cz>
Date:   Wed, 4 Jul 2018 10:13:47 +0200
From:   Michal Hocko <mhocko@...nel.org>
To:     Yang Shi <yang.shi@...ux.alibaba.com>
Cc:     Andrew Morton <akpm@...ux-foundation.org>, willy@...radead.org,
        ldufour@...ux.vnet.ibm.com, peterz@...radead.org, mingo@...hat.com,
        acme@...nel.org, alexander.shishkin@...ux.intel.com,
        jolsa@...hat.com, namhyung@...nel.org, tglx@...utronix.de,
        hpa@...or.com, linux-mm@...ck.org, x86@...nel.org,
        linux-kernel@...r.kernel.org
Subject: Re: [RFC v3 PATCH 4/5] mm: mmap: zap pages with read mmap_sem for
 large mapping

On Tue 03-07-18 11:22:17, Yang Shi wrote:
> 
> 
> On 7/2/18 11:09 PM, Michal Hocko wrote:
> > On Mon 02-07-18 13:48:45, Andrew Morton wrote:
> > > On Mon, 2 Jul 2018 16:05:02 +0200 Michal Hocko <mhocko@...nel.org> wrote:
> > > 
> > > > On Fri 29-06-18 20:15:47, Andrew Morton wrote:
> > > > [...]
> > > > > Would one of your earlier designs have addressed all usecases?  I
> > > > > expect the dumb unmap-a-little-bit-at-a-time approach would have?
> > > > It has been already pointed out that this will not work.
> > > I said "one of".  There were others.
> > Well, I was aware only about two potential solutions. Either do the
> > heavy lifting under the shared lock and do the rest with the exlusive
> > one and this, drop the lock per parts. Maybe I have missed others?
> > 
> > > > You simply
> > > > cannot drop the mmap_sem during unmap because another thread could
> > > > change the address space under your feet. So you need some form of
> > > > VM_DEAD and handle concurrent and conflicting address space operations.
> > > Unclear that this is a problem.  If a thread does an unmap of a range
> > > of virtual address space, there's no guarantee that upon return some
> > > other thread has not already mapped new stuff into that address range.
> > > So what's changed?
> > Well, consider the following scenario:
> > Thread A = calling mmap(NULL, sizeA)
> > Thread B = calling munmap(addr, sizeB)
> > 
> > They do not use any external synchronization and rely on the atomic
> > munmap. Thread B only munmaps range that it knows belongs to it (e.g.
> > called mmap in the past). It should be clear that ThreadA should not
> > get an address from the addr, sizeB range, right? In the most simple case
> > it will not happen. But let's say that the addr, sizeB range has
> > unmapped holes for what ever reasons. Now anytime munmap drops the
> > exclusive lock after handling one VMA, Thread A might find its sizeA
> > range and use it. ThreadB then might remove this new range as soon as it
> > gets its exclusive lock again.
> 
> I'm a little bit confused here. If ThreadB already has unmapped that range,
> then ThreadA uses it. It sounds not like a problem since ThreadB should just
> go ahead to handle the next range when it gets its exclusive lock again,
> right? I don't think of why ThreadB would re-visit that range to remove it.

Not if the new range overlap with the follow up range that ThreadB does.
Example

B: munmap [XXXXX]    [XXXXXX] [XXXXXXXXXX]
B: breaks the lock after processing the first vma.
A: mmap [XXXXXXXXXXXX]
B: munmap retakes the lock and revalidate from the last vm_end because
   the old vma->vm_next might be gone
B:               [XXX][XXXXX] [XXXXXXXXXX]

so you munmap part of the range. Sure you can plan some tricks and skip
over vmas that do not start above your last vma->vm_end or something
like that but I expect there are other can of worms hidden there.
-- 
Michal Hocko
SUSE Labs