[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20251130190331.40385ddc@pumpkin>
Date: Sun, 30 Nov 2025 19:03:31 +0000
From: david laight <david.laight@...box.com>
To: Al Viro <viro@...iv.linux.org.uk>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>,
akpm@...ux-foundation.org, linux-kernel@...r.kernel.org,
linux-mm@...ck.org, linux-alpha@...r.kernel.org
Subject: Re: [RFC][alpha] saner vmalloc handling (was Re: [Bug report]
hash_name() may cross page boundary and trigger sleep in RCU context)
On Sun, 30 Nov 2025 16:43:48 +0000
Al Viro <viro@...iv.linux.org.uk> wrote:
> On Sun, Nov 30, 2025 at 11:32:13AM +0000, david laight wrote:
>
> > How difficult would it be to allocate the pte for the next 8GB on demand
> > inside vmalloc(), and then propagate it to the per-task page tables.
> > That is a path than can sleep, so being slow if it needs to synchronise
> > with other cpu shouldn't matter - especially since it won't happen often.
> >
> > That should be moderately generic code and would let the vmalloc limit
> > be 'soft'; perhaps based on physical memory size, and even be raisable
> > from a sysctl.
>
> Considerable headache and pretty pointless, at that. Note that >8G vmalloc
> space on alpha had been racy all along (and known to be that); it was
> basically "could we squeeze more out of khttpd" kind of fun.
>
> Do we have realistic vmalloc-crazy loads with high fragmentation of vmalloc
> space and total footprint worth bothering with that?
>
I doubt it matters for alpha - I suspect you could just nuke ALPHA_LARGE_VMALLOC.
At a guess it was written way back when the biggest/fastest systems you could
get were alpha.
I was more thinking about about modern 64 bits systems where you might want
to run a distro kernel on systems with relatively small amounts of RAM and
others with 100s of cpu and multi TB of RAM.
I can well image workloads for the latter that might run out of vmalloc space.
In some situations even getting a command line parameter in can be hard,
so you might want it to be a systcl - even if changing that is what does the
update.
(Doing the updates in the page fault handler definitely sounds like a recipe
for disaster.)
Note that I've not looked at where amd64 gets the limit for mem_init().
Maybe it tries to 'guess' the correct value for the system.
But it is likely to be workload related - so allocating 8K for every 8G
of physical memory (one option) may be wasteful.
David
Powered by blists - more mailing lists