Message-ID: <f5ce3975-bda6-0e83-3a59-2fac25cc4f08@rasmusvillemoes.dk>
Date: Fri, 19 Mar 2021 14:31:19 +0100
From: Rasmus Villemoes <linux@...musvillemoes.dk>
To: Peter Zijlstra <peterz@...radead.org>, x86@...nel.org,
jpoimboe@...hat.com, jbaron@...mai.com, rostedt@...dmis.org,
ardb@...nel.org
Cc: linux-kernel@...r.kernel.org, sumit.garg@...aro.org,
oliver.sang@...el.com, jarkko@...nel.org
Subject: Re: [PATCH 2/3] static_call: Align static_call_is_init() patching
condition
On 18/03/2021 12.31, Peter Zijlstra wrote:
> The intent is to avoid writing init code after init (because the text
> might have been freed). The code is needlessly different between
> jump_label and static_call and not obviously correct.
>
> The existing code relies on the fact that the module loader clears the
> init layout, such that within_module_init() always fails, while
> jump_label relies on the module state which is more obvious and
> matches the kernel logic.
>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@...radead.org>
> ---
> kernel/static_call.c | 14 ++++----------
> 1 file changed, 4 insertions(+), 10 deletions(-)
>
> --- a/kernel/static_call.c
> +++ b/kernel/static_call.c
> @@ -149,6 +149,7 @@ void __static_call_update(struct static_
> };
>
> for (site_mod = &first; site_mod; site_mod = site_mod->next) {
> + bool init = system_state < SYSTEM_RUNNING;
I recently had occasion to look at whether that would be a suitable
condition for knowing whether __init stuff was gone, but concluded that
it's not. Maybe I'm wrong. init/main.c:
	free_initmem();
	mark_readonly();
	/*
	 * Kernel mappings are now finalized - update the userspace page-table
	 * to finalize PTI.
	 */
	pti_finalize();
	system_state = SYSTEM_RUNNING;
So ISTM there's a window where system_state < SYSTEM_RUNNING but accessing
init stuff is a bad idea. If you're PID1 it's all fine, but I suppose
other kernel threads could end up calling static_call_update. And just
moving the system_state setting up before the *free_initmem() calls
doesn't really solve anything because TOCTOU.
Dunno, probably overkill, but perhaps we could have an atomic_t (or
refcount, whatever) init_ref inited to 1, with init_ref_get() doing an
inc_unless_zero, and iff you get a ref, you're free to call (/patch)
__init functions and access __initdata, but must do init_ref_put(), with
PID1 dropping its initial ref and waiting for it to drop to 0 before
doing the *free_initmem() calls.
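
To make that a bit more concrete, here is a rough userspace model of the
idea using C11 atomics. It is only a sketch, not kernel code: the names
init_ref, init_ref_get(), init_ref_put() and patch_init_site() are made up
for illustration, and a real version would presumably use
atomic_inc_not_zero() or refcount_inc_not_zero() and have PID1 sleep until
the count hits 0 rather than just checking the return value.

```c
/*
 * Userspace model of the proposed init_ref scheme. All names here are
 * hypothetical; nothing like this exists in the kernel tree.
 */
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

static atomic_int init_ref = 1;	/* the initial reference, held by PID1 */

/* inc_unless_zero: only succeeds while init memory is still guaranteed live. */
static bool init_ref_get(void)
{
	int old = atomic_load(&init_ref);

	while (old != 0) {
		/* On failure, 'old' is refreshed with the current value. */
		if (atomic_compare_exchange_weak(&init_ref, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference; returns true if it was the last one. */
static bool init_ref_put(void)
{
	int old = atomic_fetch_sub(&init_ref, 1);

	assert(old > 0);	/* catch unbalanced puts */
	return old == 1;
}

/* Stand-in for a patcher that wants to touch a call site in __init text. */
static bool patch_init_site(void)
{
	if (!init_ref_get())
		return false;	/* init text may already be freed: skip it */
	/* ... safe to patch __init text / read __initdata here ... */
	init_ref_put();
	return true;
}
```

The point is that the check and the "pin" are one atomic operation, so
there is no gap between deciding that init text is still there and
actually writing to it. PID1 would call init_ref_put() once for its
initial reference and only proceed to the *free_initmem() calls after the
count has reached 0; from then on every init_ref_get() fails.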
Rasmus