[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20211108111114.2e37c9d6@gandalf.local.home>
Date: Mon, 8 Nov 2021 11:11:14 -0500
From: Steven Rostedt <rostedt@...dmis.org>
To: Jonathon Reinhart <jonathon.reinhart@...il.com>
Cc: Jakub Kicinski <kuba@...nel.org>,
Antoine Tenart <atenart@...nel.org>,
"David S. Miller" <davem@...emloft.net>,
Linux Netdev List <netdev@...r.kernel.org>,
tglx@...utronix.de, peterz@...radead.org
Subject: Re: [PATCH net-next] net: sysctl data could be in .bss
On Mon, 8 Nov 2021 00:24:33 -0500
Jonathon Reinhart <jonathon.reinhart@...il.com> wrote:
> On Fri, Oct 22, 2021 at 4:01 PM Jakub Kicinski <kuba@...nel.org> wrote:
> >
> > Widening the CC list a little.
Thanks!
[..]
> The core_kernel_data() function was introduced in a2d063ac216c1, and
> the commit message says:
>
> "It may or may not return true for RO data... This utility function is
> used to determine if data is safe from ever being freed. Thus it
> should return true for all RW global data that is not in a module or
> has been allocated, or false otherwise."
>
> The intent of the function seems to be more in line with the
> higher-level "is this global kernel data" semantics you suggested. The
> purpose seems to be to differentiate between "part of the loaded
> kernel image" vs. a dynamic allocation (which would include a loaded
> module image). And given that it *might* return true for RO data
> (depending on the arch linker script, presumably), I think it would be
> safe to include .bss -- clearly, with that caveat in place, it isn't
> promising strict section semantics.
>
> There are only two existing in-tree consumers:
>
> 1. __register_ftrace_function() [kernel/trace/ftrace.c] -- Sets
> FTRACE_OPS_FL_DYNAMIC if core_kernel_data(ops) returns false, which
> denotes "dynamically allocated ftrace_ops which need special care". It
> would be unlikely (if not impossible) for the "ops" object in question
> to be all-zero and end up in the .bss, but if it were, then the
> current behavior would be wrong. IOW, it would be more correct to
> include .bss.
>
> 2. ensure_safe_net_sysctl() [net/sysctl_net.c] (The subject of this
> thread) -- Trying to distinguish "global kernel data" (static/global
> variables) from kmalloc-allocated objects. More correct to include
> .bss.
>
> Both of these callers only seem to delineate between static and
> dynamic object allocations. Put another way, if core_kernel_bss(), all
> existing callers should be updated to check core_kernel_data() ||
> core_kernel_bss().
>
> Since Steven introduced it, and until I added
> ensure_safe_net_sysctl(), he / tracing was the only consumer.
I agree with your analysis.
The intent is that allocated ftrace_ops (things that function tracer uses
to know what callbacks are called from function entry), must go through a
very slow synchronization (rcu_synchronize_tasks). But this is not needed
if the ftrace_ops is part of the core kernel (.data or .bss) as that will
never be freed, and thus does not need to worry about it disappearing while
they are still in use.
>
> Thinking critically from the C language perspective, I can't come up
> with any case where one would actually expect core_kernel_data() to
> return true for 'int global = 1' and false for 'int global = 0'.
>
> In conclusion, I agree with your alternative proposal Jakub, and I
> think this patch is the right way forward:
>
> diff --git a/kernel/extable.c b/kernel/extable.c
> index b0ea5eb0c3b4..8b6f1d0bdaf6 100644
> --- a/kernel/extable.c
> +++ b/kernel/extable.c
> @@ -97,6 +97,9 @@ int core_kernel_data(unsigned long addr)
> if (addr >= (unsigned long)_sdata &&
> addr < (unsigned long)_edata)
> return 1;
> + if (addr >= (unsigned long)__bss_start &&
> + addr < (unsigned long)__bss_stop)
> + return 1;
> return 0;
> }
Acked-by: Steven Rostedt (VMware) <rostedt@...dmis.org>
-- Steve
Powered by blists - more mailing lists