linux-kernel - Re: [RFC 00/14] Dynamic Kernel Stacks

Message-ID: <fsergr6i66i7tagtqj3m7yz4tkjv5z75uslsqpfzmtzdpv3yiv@znauhs7rd4f3>
Date: Sat, 16 Mar 2024 21:32:48 -0400
From: Kent Overstreet <kent.overstreet@...ux.dev>
To: Matthew Wilcox <willy@...radead.org>
Cc: Pasha Tatashin <pasha.tatashin@...een.com>, 
	"H. Peter Anvin" <hpa@...or.com>, linux-kernel@...r.kernel.org, linux-mm@...ck.org, 
	akpm@...ux-foundation.org, x86@...nel.org, bp@...en8.de, brauner@...nel.org, 
	bristot@...hat.com, bsegall@...gle.com, dave.hansen@...ux.intel.com, 
	dianders@...omium.org, dietmar.eggemann@....com, eric.devolder@...cle.com, 
	hca@...ux.ibm.com, hch@...radead.org, jacob.jun.pan@...ux.intel.com, jgg@...pe.ca, 
	jpoimboe@...nel.org, jroedel@...e.de, juri.lelli@...hat.com, kinseyho@...gle.com, 
	kirill.shutemov@...ux.intel.com, lstoakes@...il.com, luto@...nel.org, mgorman@...e.de, 
	mic@...ikod.net, michael.christie@...cle.com, mingo@...hat.com, mjguzik@...il.com, 
	mst@...hat.com, npiggin@...il.com, peterz@...radead.org, pmladek@...e.com, 
	rick.p.edgecombe@...el.com, rostedt@...dmis.org, surenb@...gle.com, tglx@...utronix.de, 
	urezki@...il.com, vincent.guittot@...aro.org, vschneid@...hat.com
Subject: Re: [RFC 00/14] Dynamic Kernel Stacks

On Sun, Mar 17, 2024 at 12:41:33AM +0000, Matthew Wilcox wrote:
> On Sat, Mar 16, 2024 at 03:17:57PM -0400, Pasha Tatashin wrote:
> > Expanding on Matthew's idea of an interface for dynamic kernel stack
> > sizes, here's what I'm thinking:
> > 
> > - Kernel Threads: Create all kernel threads with a fully populated
> > THREAD_SIZE stack.  (i.e. 16K)
> > - User Threads: Create all user threads with THREAD_SIZE kernel stack
> > but only the top page mapped. (i.e. 4K)
> > - In enter_from_user_mode(): Expand the thread stack to 16K by mapping
> > three additional pages from the per-CPU stack cache. This function is
> > called early in kernel entry points.
> > - exit_to_user_mode(): Unmap the extra three pages and return them to
> > the per-CPU cache. This function is called late in the kernel exit
> > path.
> > 
> > Both of the above hooks are called with IRQs disabled on all kernel
> > entries, whether through interrupts or syscalls, and they are called
> > early/late enough that 4K is enough to handle the rest of entry/exit.
> 
> At what point do we replenish the per-CPU stash of pages?  If we're
> 12kB deep in the stack and call mutex_lock(), we can be scheduled out,
> and then the new thread can make a syscall.  Do we just assume that
> get_free_page() can sleep at kernel entry (seems reasonable)?  I don't
> think this is an infeasible problem, I'd just like it to be described.

In schedule() or on return to userspace; I believe that was mentioned
earlier.
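
To make the replenish point concrete, here is a rough userspace model of
the scheme as I read it from the thread. This is not kernel code: the
struct and function names (cpu_cache, stack_expand, stack_shrink,
cache_replenish, simulate) are made up for illustration, and the counters
only stand in for the real page mapping that would happen in
enter_from_user_mode(), exit_to_user_mode() and schedule().

```c
#include <assert.h>
#include <stdbool.h>

#define STACK_PAGES    4                  /* THREAD_SIZE = 16K = 4 x 4K */
#define EXTRA_PAGES    (STACK_PAGES - 1)  /* pages mapped at kernel entry */
#define CACHE_CAPACITY EXTRA_PAGES        /* assumed per-CPU stash size */

struct cpu_cache { int free_pages; };     /* hypothetical per-CPU stash */
struct thread    { int mapped_pages; };   /* 1 while in userspace */

/* Models the enter_from_user_mode() hook: map the extra pages from the
 * per-CPU cache. Fails if the cache was not refilled in time. */
static bool stack_expand(struct cpu_cache *c, struct thread *t)
{
	int need = STACK_PAGES - t->mapped_pages;

	if (c->free_pages < need)
		return false;
	c->free_pages -= need;
	t->mapped_pages = STACK_PAGES;
	return true;
}

/* Models the exit_to_user_mode() hook: unmap everything but the top
 * page and hand the pages back to the per-CPU cache. */
static void stack_shrink(struct cpu_cache *c, struct thread *t)
{
	assert(t->mapped_pages >= 1);
	c->free_pages += t->mapped_pages - 1;
	t->mapped_pages = 1;
	if (c->free_pages > CACHE_CAPACITY)   /* surplus would be freed */
		c->free_pages = CACHE_CAPACITY;
}

/* Models the refill done in schedule() or on return to userspace.
 * In a real kernel this is where the page allocator would be called. */
static void cache_replenish(struct cpu_cache *c)
{
	c->free_pages = CACHE_CAPACITY;
}

/* Thread A enters the kernel and blocks (e.g. in mutex_lock()) while
 * holding its extra pages; thread B then makes a syscall on the same
 * CPU. Returns whether B's kernel entry found enough pages. */
bool simulate(bool replenish_in_schedule)
{
	struct cpu_cache cache = { .free_pages = CACHE_CAPACITY };
	struct thread a = { .mapped_pages = 1 };
	struct thread b = { .mapped_pages = 1 };
	bool ok;

	if (!stack_expand(&cache, &a))        /* A: kernel entry */
		return false;

	if (replenish_in_schedule)            /* A sleeps, schedule() runs */
		cache_replenish(&cache);

	ok = stack_expand(&cache, &b);        /* B: kernel entry */
	if (ok)
		stack_shrink(&cache, &b);     /* B: return to userspace */
	return ok;
}
```

In this model simulate(true) succeeds and simulate(false) fails, which is
the case Matthew is asking about: a thread that sleeps with its extra pages
still mapped leaves the stash empty unless something refills it before the
next kernel entry on that CPU.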