Message-ID: <CAHk-=whRpLyY+U9mkKo8O=2_BXNk=7sjYeObzFr3fGi0KLjLJw@mail.gmail.com>
Date: Fri, 5 Jul 2024 10:39:48 -0700
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: "Jason A. Donenfeld" <Jason@...c4.com>
Cc: jolsa@...nel.org, mhiramat@...nel.org, cgzones@...glemail.com, 
	brauner@...nel.org, linux-kernel@...r.kernel.org, arnd@...db.de
Subject: Re: deconflicting new syscall numbers for 6.11

On Fri, 5 Jul 2024 at 09:18, Jason A. Donenfeld <Jason@...c4.com> wrote:
>
> VM_DROPPABLE *is* actually a very useful feature. Or it at least seems
> like it could be one.

Yes. It's been discussed exactly in that "this _could_ be very useful"
sense, although we've never actually pulled the trigger.

I tried to find previous discussions on lore, but failed miserably, so
I can't point to previous discussions from long ago, but one question
was also always about whether you wanted some explicit "populate this
page range" interface together with getting a SIGBUS when it's
unpopulated (so that you can basically do demand-paging in user
space).

With just a "this could be useful" but no hard users, it never really
got anywhere.

Anyway, I really don't mind VM_DROPPABLE with "it just gets
re-populated as a new anonymous page" model, particularly since we
could easily then later decide that we could expand on it as a
MAP_SHARED thing with SIGBUS semantics and explicit initialization if
we ever really want it.

End result: I don't think there are necessarily *lots* of users, but I
do think that this is something where some enterprising person goes "I
can use this", and makes some cool library that uses it for caching,
and then we'd be stuck with it.

> And then, indeed, it'd make sense to eventually expose this properly to
> mmap() and let people use it. (Or if you want to do that in reverse,
> adding it to mmap() first, so that people don't misuse
> vgetrandom_alloc(), that's fine.)

Yes. And it should be pretty trivial.

We just at least initially have to be very careful to limit it to
MAP_ANONYMOUS and MAP_PRIVATE. Because dropping dirty bits on shared
mappings sounds insane and like a possible source of confusion (and
thus bugs and maybe even security issues).
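
A minimal sketch of the kind of sanity check described above, written as
user-space C so it can be compiled and tried. The helper name
droppable_flags_ok() and the SKETCH_MAP_TYPE_MASK constant are hypothetical
and only illustrate the rule "private and anonymous only"; this is not the
kernel's actual code, and the 0x0f type mask is an assumption that holds on
most Linux architectures.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE          /* exposes MAP_ANONYMOUS on glibc */
#endif
#include <assert.h>
#include <stdbool.h>
#include <sys/mman.h>

/* Assumed mask for the mapping-type bits of the mmap() flags. */
#define SKETCH_MAP_TYPE_MASK 0x0f

/*
 * Hypothetical helper: accept a "droppable" request only when the
 * mapping is private and anonymous. Anything shared is rejected,
 * because dropping dirty pages of a shared mapping has no sane meaning.
 */
bool droppable_flags_ok(int flags)
{
    /* The type bits must be exactly MAP_PRIVATE (rules out MAP_SHARED). */
    if ((flags & SKETCH_MAP_TYPE_MASK) != MAP_PRIVATE)
        return false;

    /* File-backed mappings are rejected as well. */
    if (!(flags & MAP_ANONYMOUS))
        return false;

    return true;
}
```

With this helper, MAP_PRIVATE | MAP_ANONYMOUS passes, while MAP_SHARED |
MAP_ANONYMOUS and a bare MAP_PRIVATE (file-backed) are both refused.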

It's possible that we might even use a MAP_TYPE flag for this. Or make
it a PROT_xyz bit rather than a MAP_xyz.

So there's some trivial sanity checks and some UI issues to just pick,
but apart from "just pick something sane", exposing this for mmap() is
_not_ hard, and I do think it needs to be done first.

And once it's done, I think the argument for having a special system
call is basically gone too.

> - The "mechanism" needs to return allocated memory to userspace that can
>   be chunked up on a per-thread basis, with no state straddling pages,
>   which means it also needs to return the size of each state, and the
>   number of states that were allocated.
>
> - The size of each state might change kernel version to kernel version.

Just pick a size large enough.

And why would that size not be one page?

Considering that you really don't want to rely on page-crossing state
*ANYWAY* because of the whole "one page can go away while another one
sticks around" issue, I would expect that states over one page per
thread would be a *very* questionable idea to begin with.
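
The "one page per state" layout argued for above can be sketched in
user-space C as follows. All three function names are hypothetical, and the
mapping here is an ordinary private anonymous one, since how a droppable
mapping would be requested is exactly the open UI question; the sketch only
shows that page-sized slots never straddle a page boundary.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE          /* exposes MAP_ANONYMOUS on glibc */
#endif
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical: true if a state of state_size bytes fits in one page. */
int state_fits_in_page(size_t state_size)
{
    long page = sysconf(_SC_PAGESIZE);
    return page > 0 && state_size <= (size_t)page;
}

/*
 * Hypothetical: map nr_states pages, one page per state.
 * Returns NULL on failure or when nr_states is zero.
 */
void *alloc_state_pages(size_t nr_states)
{
    long page = sysconf(_SC_PAGESIZE);

    if (page <= 0 || nr_states == 0)
        return NULL;
    if (nr_states > SIZE_MAX / (size_t)page)   /* avoid size overflow */
        return NULL;

    void *base = mmap(NULL, nr_states * (size_t)page,
                      PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return base == MAP_FAILED ? NULL : base;
}

/* Hypothetical: address of the index-th state; always page-aligned. */
void *state_slot(void *base, size_t index)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    return (char *)base + index * page;
}
```

Each thread would take one slot, so a page being dropped can only ever
affect a single state. A state of roughly 200 bytes, the figure mentioned
in this thread, fits easily.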

I don't think we'll ever see systems with page sizes smaller than 4k.
They have existed in the past, but they're not making a comeback.
People want larger pages, not smaller ones.

And the state size right now is what - 200 bytes? So a single page
seems (a) sufficient and (b) kind of the sane maximum anyway due to
the dropping.

No?

              Linus
