Message-ID: <YiobsJ7h2nSPs+KW@casper.infradead.org>
Date:   Thu, 10 Mar 2022 15:39:28 +0000
From:   Matthew Wilcox <willy@...radead.org>
To:     Linus Torvalds <torvalds@...ux-foundation.org>
Cc:     Linux-MM <linux-mm@...ck.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [RFC] Free up a page flag

On Wed, Mar 09, 2022 at 02:07:50PM -0800, Linus Torvalds wrote:
> On Wed, Mar 9, 2022 at 12:50 PM Matthew Wilcox <willy@...radead.org> wrote:
> >
> > We're always running out of page flags.  Here's an attempt to free one
> > up for the next time somebody wants one.
> 
> Ugh. This is too ugly for words.
> 
> I wouldn't mind something along the conceptual lines of "these bits
> are only used for this type", but I think it would need to be much
> more organized and explicit, not this kind of randomness.

OK.  Serves me right for trying to do this quickly instead of doing it
right.

> For example, quite a few of the page bits really only make sense for
> the "page cache and anonymous pages" kind.
> 
> I think this includes some really fundamental bits like the lock bit
> (and the associated waiters bit), along with a lot of the "owner" aka
> "this can be used by the filesystem" bits.
> 
> I think it _also_ includes all the LRU and workingset bits etc.
> 
> So if we consider that kind of case the "normal" case, the not-normal
> case is likely (a) slab, (b) reserved pages and (c) zspages.

There are always more things that people allocate pages for than you think.
I have a (presumably incomplete) list here:
https://kernelnewbies.org/MemoryTypes

As I wrote there (and promptly forgot, so it's a good thing I wrote
it down), any page that gets mapped to userspace needs both the locked
and dirty bits.  And random device drivers allocate pages and map them
to userspace, so those bits need to be available, even for pages that
a device driver has decided to mark as Reserved (a hangover from
when we used to require that for ioremap?).

So I think the flags end up looking like:

0	Locked
1	Writeback
2	Dirty
3	Head
4	Type (new name for xyzzy)
	(if type == 0)			(if type == 1)
5	Referenced			Slab
6	Active				Buddy
7	Waiters				Waiters
8	LRU				VMalloc
9	Workingset			Offline
10	Error				Table
11	OwnerPriv1			Guard
12	Private				Reserved
13	Private2
14	Uptodate/Reported
15	Arch1
16	MappedToDisk
17	Reclaim/Readahead/Isolated
18	Swapbacked
19	Unevictable
20	MLocked*
21	Uncached*
22	HWPoison*
23	Young*
24	Idle*
25	Arch2*
(* Kconfig dependent)
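
To make the table concrete, here is a minimal userspace sketch of how the
Type bit could select the meaning of bits 5-12.  It is not kernel code:
struct demo_page, the PGF_/PGT_ names and the demo_* helpers are all
hypothetical, and only the bit numbers come from the table above.

```c
#include <assert.h>
#include <stdbool.h>

/* Bits 0-4: same meaning for every page (numbers from the table). */
enum {
	PGF_LOCKED    = 0,
	PGF_WRITEBACK = 1,
	PGF_DIRTY     = 2,
	PGF_HEAD      = 3,
	PGF_TYPE      = 4,	/* selects the meaning of bits 5-12 */
};

/* Bits 5-12 when Type == 0 (subset shown). */
enum {
	PGF_REFERENCED = 5,
	PGF_ACTIVE     = 6,
	PGF_LRU        = 8,
};

/* Bits 5-12 when Type == 1 (subset shown). */
enum {
	PGT_SLAB    = 5,
	PGT_BUDDY   = 6,
	PGT_VMALLOC = 8,
};

/* Bits 5-12 inclusive: the range whose meaning depends on Type. */
#define DEMO_TYPED_MASK	(0xffUL << 5)

/* Hypothetical stand-in for struct page; only the flags word matters here. */
struct demo_page {
	unsigned long flags;
};

static inline bool demo_test_bit(const struct demo_page *p, unsigned int bit)
{
	return (p->flags >> bit) & 1UL;
}

static inline bool demo_page_typed(const struct demo_page *p)
{
	return demo_test_bit(p, PGF_TYPE);
}

/* Bit 8 means LRU only when the Type bit is clear ... */
static inline bool demo_page_lru(const struct demo_page *p)
{
	return !demo_page_typed(p) && demo_test_bit(p, PGF_LRU);
}

/* ... and VMalloc only when the Type bit is set. */
static inline bool demo_page_vmalloc(const struct demo_page *p)
{
	return demo_page_typed(p) && demo_test_bit(p, PGT_VMALLOC);
}

static inline void demo_set_page_vmalloc(struct demo_page *p)
{
	/* An untyped page must not carry Type == 0 bits when it becomes typed. */
	assert(demo_page_typed(p) || (p->flags & DEMO_TYPED_MASK) == 0);
	p->flags |= (1UL << PGF_TYPE) | (1UL << PGT_VMALLOC);
}
```

The point of the sketch is that every test of a bit in the 5-12 range
also has to look at the Type bit, which is the cost of sharing the range.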

I might want to do another pass on this list to sort the flags that
apply to all LRU pages ahead of the ones that change meaning depending
on file-vs-anon.  And maybe Reported shouldn't share a bit with Uptodate;
perhaps it's just part of the OwnerPriv1/Private/Private2 mess.

It does end up getting rid of the PageType mechanism, and letting us
type pages allocated to VMalloc.

> We already have some page flag bits that are only used for those kinds
> of odd pages: the page_flags field is used only for zspages, but other
> pages can (misuse) that field for PG_buddy/offline/etc. That whole
> thing is particularly ugly in how it tries to make sure there is
> no mapcount use of it.

I think you meant page_type here?  I was proud of how much better I made
it than it was (people used to _both_ set bits in that field _and_
map pages with those bits set into userspace ... fortunately nobody
assigned a CVE to that problem).  But it's a little bit too clever,
and I'd love to be rid of it.  And it failed to accomplish my original
goal of being able to mark pages as being allocated by vmalloc.
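
For readers who haven't looked at it, here is a toy userspace model of
the general idea: type bits overlaid on the field that otherwise holds
the mapcount, with a check that the field is not in use as a mapcount.
The constants and the demo_* names are assumptions for illustration and
are not copied from the kernel source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative values: high bits that must stay set for a type to be valid. */
#define DEMO_TYPE_BASE		0xf0000000u
#define DEMO_PG_BUDDY		0x00000080u
#define DEMO_PG_OFFLINE		0x00000100u

/* Hypothetical stand-in for the part of struct page that is shared. */
struct demo_page {
	union {
		int32_t  mapcount;	/* -1 means "not mapped" */
		uint32_t page_type;	/* all ones means "no type set" */
	};
};

static inline void demo_page_init(struct demo_page *p)
{
	p->page_type = UINT32_MAX;	/* same bit pattern as mapcount == -1 */
}

/* A type is set when the base bits are intact and the flag bit is clear. */
static inline bool demo_page_has_type(const struct demo_page *p, uint32_t flag)
{
	return (p->page_type & (DEMO_TYPE_BASE | flag)) == DEMO_TYPE_BASE;
}

static inline void demo_set_type(struct demo_page *p, uint32_t flag)
{
	/* Refuse to type a page whose field holds a real (>= 0) mapcount. */
	assert((p->page_type & DEMO_TYPE_BASE) == DEMO_TYPE_BASE);
	p->page_type &= ~flag;
}

static inline void demo_clear_type(struct demo_page *p, uint32_t flag)
{
	assert(demo_page_has_type(p, flag));
	p->page_type |= flag;
}
```

In this model a small non-negative mapcount leaves the base bits clear,
so no type test can match on a page that is mapped.  That inverted
encoding is the "too clever" part.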

I think zspages using ->page_type is probably wrong.  zspages should be
using its own type like slab, but I haven't done the work to split it
out yet.
