Message-Id: <20180629183501.9e30c26135f11853245c56c7@linux-foundation.org>
Date: Fri, 29 Jun 2018 18:35:01 -0700
From: Andrew Morton <akpm@...ux-foundation.org>
To: Yang Shi <yang.shi@...ux.alibaba.com>
Cc: mhocko@...nel.org, willy@...radead.org, ldufour@...ux.vnet.ibm.com,
peterz@...radead.org, mingo@...hat.com, acme@...nel.org,
alexander.shishkin@...ux.intel.com, jolsa@...hat.com,
namhyung@...nel.org, tglx@...utronix.de, hpa@...or.com,
linux-mm@...ck.org, x86@...nel.org, linux-kernel@...r.kernel.org
Subject: Re: [RFC v3 PATCH 4/5] mm: mmap: zap pages with read mmap_sem for
large mapping
On Sat, 30 Jun 2018 06:39:44 +0800 Yang Shi <yang.shi@...ux.alibaba.com> wrote:
And...
> diff --git a/mm/mmap.c b/mm/mmap.c
> index 87dcf83..d61e08b 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -2763,6 +2763,128 @@ static int munmap_lookup_vma(struct mm_struct *mm, struct vm_area_struct **vma,
> return 1;
> }
>
> +/* Consider PUD size or 1GB mapping as large mapping */
> +#ifdef HPAGE_PUD_SIZE
> +#define LARGE_MAP_THRESH HPAGE_PUD_SIZE
> +#else
> +#define LARGE_MAP_THRESH (1 * 1024 * 1024 * 1024)
> +#endif
So this assumes that 32-bit machines cannot have 1GB mappings (fair
enough), and that this is the sole means by which such machines avoid
falling into the "len >= LARGE_MAP_THRESH" codepath, which would behave
very badly there, not least because VM_DEAD is zero on 32-bit.
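(To spell out the VM_DEAD point: if the flag only gets a real vm_flags
bit on 64-bit configs and falls back to 0 elsewhere - and the helper
names below are made up, just for illustration - then marking and
testing a vma silently becomes a no-op on 32-bit:)

#ifndef CONFIG_64BIT
#define VM_DEAD		0	/* no spare vm_flags bit on 32-bit */
#endif

static inline void mark_vma_dead(struct vm_area_struct *vma)
{
	vma->vm_flags |= VM_DEAD;	/* sets nothing when VM_DEAD == 0 */
}

static inline bool vma_is_dead(struct vm_area_struct *vma)
{
	return vma->vm_flags & VM_DEAD;	/* always false when VM_DEAD == 0 */
}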
This is rather ugly and fragile. And, I guess, explains why we can't
give all mappings this treatment: 32-bit machines can't do it. And
we're adding a bunch of code to 32-bit kernels which will never be
executed.
I'm thinking it would be better to be much more explicit with "#ifdef
CONFIG_64BIT" in this code, rather than relying upon the above magic.
But I tend to think that the fact that we haven't solved anything on
locked vmas or on uprobed mappings is a showstopper for the whole
approach :(