Date:   Thu, 23 Sep 2021 12:40:14 +0100
From:   Matthew Wilcox <willy@...radead.org>
To:     Kent Overstreet <kent.overstreet@...il.com>
Cc:     linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org,
        linux-mm@...ck.org, Johannes Weiner <hannes@...xchg.org>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        "Darrick J. Wong" <djwong@...nel.org>,
        Christoph Hellwig <hch@...radead.org>,
        David Howells <dhowells@...hat.com>,
        "Kirill A. Shutemov" <kirill@...temov.name>,
        Mike Kravetz <mike.kravetz@...cle.com>
Subject: Mapcount of subpages

On Thu, Sep 23, 2021 at 01:15:16AM -0400, Kent Overstreet wrote:
> On Thu, Sep 23, 2021 at 04:23:12AM +0100, Matthew Wilcox wrote:
> > (compiling that list reminds me that we'll need to sort out mapcount
> > on subpages when it comes time to do this.  ask me if you don't know
> > what i'm talking about here.)
> 
> I am curious why we would ever need a mapcount for just part of a page, tell me
> more.

I would say Kirill is the expert here.  My understanding:

We have three different approaches to allocating 2MB pages today:
anon THP, shmem THP and hugetlbfs.  Hugetlbfs can only be mapped on a
2MB boundary, so it has no special handling of mapcount [1].  Anon THP
always starts out as being mapped exclusively on a 2MB boundary, but
then it can be split by, e.g., munmap().  If it is, then the mapcount in
the head page is distributed to the subpages.

Shmem THP is the tricky one.  You might have a 2MB page in the page cache,
but then have processes which only ever map part of it.  Or you might
have some processes mapping it with a 2MB entry and others mapping part
or all of it with 4kB entries.  And then someone truncates the file to
midway through this page; we split it, and now we need to figure out what
the mapcount should be on each of the subpages.  We handle this by using
->mapcount on each subpage to record how many non-2MB mappings there are
of that specific page and using ->compound_mapcount to record how many 2MB
mappings there are of the entire 2MB page.  Then, when we split, we just
need to distribute the compound_mapcount to each page to make it correct.
We also have the PageDoubleMap flag to tell us whether anybody has this
2MB page mapped with 4kB entries, so we can skip all the summing of 4kB
mapcounts if nobody has done that.
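
To make the accounting concrete, here is a minimal userspace sketch of the
scheme described above.  It is a toy model, not kernel code: struct thp_model
and the thp_*() helpers are hypothetical names, the counts are stored as plain
non-negative integers, and double_map is simply set on the first 4kB mapping,
which follows the description in this mail rather than the exact kernel rules.

```c
#include <assert.h>
#include <stdbool.h>

#define SUBPAGES 512	/* 2MB / 4kB */

/* Toy model of one 2MB page; all names here are hypothetical. */
struct thp_model {
	int compound_mapcount;	/* 2MB mappings of the entire page */
	int mapcount[SUBPAGES];	/* 4kB mappings of each subpage */
	bool double_map;	/* has anybody mapped it with 4kB entries? */
};

/* A process maps the whole page with a single 2MB entry. */
static void thp_map_pmd(struct thp_model *p)
{
	p->compound_mapcount++;
}

/* A process maps one subpage with a 4kB entry. */
static void thp_map_pte(struct thp_model *p, int idx)
{
	assert(idx >= 0 && idx < SUBPAGES);
	p->mapcount[idx]++;
	p->double_map = true;
}

/* Count every mapping of any part of the page. */
static int thp_total_mapcount(const struct thp_model *p)
{
	int total = p->compound_mapcount;
	int i;

	/* Nobody uses 4kB entries, so skip summing 512 counters. */
	if (!p->double_map)
		return total;

	for (i = 0; i < SUBPAGES; i++)
		total += p->mapcount[i];
	return total;
}

/*
 * Split: every 2MB mapping becomes a mapping of each subpage, so the
 * compound_mapcount is added to each per-subpage count.
 */
static void thp_split(const struct thp_model *p, int out[SUBPAGES])
{
	int i;

	for (i = 0; i < SUBPAGES; i++)
		out[i] = p->mapcount[i] + p->compound_mapcount;
}
```

With two 2MB mappings and one 4kB mapping of subpage 3, a split in this model
leaves subpage 3 with a mapcount of 3 and every other subpage with 2.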

[1] Mike is looking to change this, but I'm not sure where he is with it.
