[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAPFHKzduJiebgnAAjEvx4vBJCFn7-eyfJ+k6JQja2waxqKeCwQ@mail.gmail.com>
Date: Mon, 8 Nov 2021 00:24:33 -0500
From: Jonathon Reinhart <jonathon.reinhart@...il.com>
To: Jakub Kicinski <kuba@...nel.org>
Cc: Antoine Tenart <atenart@...nel.org>,
"David S. Miller" <davem@...emloft.net>,
Linux Netdev List <netdev@...r.kernel.org>,
tglx@...utronix.de, peterz@...radead.org,
Steven Rostedt <rostedt@...dmis.org>
Subject: Re: [PATCH net-next] net: sysctl data could be in .bss
On Fri, Oct 22, 2021 at 4:01 PM Jakub Kicinski <kuba@...nel.org> wrote:
>
> Widening the CC list a little.
>
> On Wed, 20 Oct 2021 10:38:54 +0200 Antoine Tenart wrote:
> > A check is made when registering non-init netns sysctl files to ensure
> > their data pointer does not point to a global data section. This works
> > well for modules as the check is made against the whole module address
> > space (is_module_address). But when built-in, the check is made against
> > the .data section. However global variables initialized to 0 can be in
> > .bss (-fzero-initialized-in-bss).
> >
> > Add an extra check to make sure the sysctl data does not point to the
> > .bss section either.
> >
> > Signed-off-by: Antoine Tenart <atenart@...nel.org>
> > Reviewed-by: Jonathon Reinhart <jonathon.reinhart@...il.com>
> > ---
> > Hello,
> >
> > This was previously sent as an RFC[1] waiting for a problematic sysctl
> > to be fixed. The fix was accepted and is now in the nf tree[2].
> >
> > This is not sent as a fix to avoid possible new warnings in stable
> > kernels. (The actual fixes of sysctl files should go).
> >
> > I think this can go through the net-next tree as kernel/extable.c
> > doesn't seem to be under any subsystem and a conflict is unlikely to
> > happen.
>
> > [1] https://lore.kernel.org/all/20211012155542.827631-1-atenart@kernel.org/T/
> > [2] https://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf.git/commit/?id=174c376278949c44aad89c514a6b5db6cee8db59
> >
> > include/linux/kernel.h | 1 +
> > kernel/extable.c | 8 ++++++++
> > net/sysctl_net.c | 2 +-
> > 3 files changed, 10 insertions(+), 1 deletion(-)
> >
> > diff --git a/include/linux/kernel.h b/include/linux/kernel.h
> > index 2776423a587e..beb61d0ab220 100644
> > --- a/include/linux/kernel.h
> > +++ b/include/linux/kernel.h
> > @@ -231,6 +231,7 @@ extern char *next_arg(char *args, char **param, char **val);
> > extern int core_kernel_text(unsigned long addr);
> > extern int init_kernel_text(unsigned long addr);
> > extern int core_kernel_data(unsigned long addr);
> > +extern int core_kernel_bss(unsigned long addr);
>
> Is the intention of these helpers to have strict section name semantics
> or higher level "is this global kernel data" semantics? If it's the
> latter we could make core_kernel_data() check bss instead, chances are
> all callers will either want that or not care. Steven?
The core_kernel_data() function was introduced in a2d063ac216c1, and
the commit message says:
"It may or may not return true for RO data... This utility function is
used to determine if data is safe from ever being freed. Thus it
should return true for all RW global data that is not in a module or
has been allocated, or false otherwise."
The intent of the function seems to be more in line with the
higher-level "is this global kernel data" semantics you suggested. The
purpose seems to be to differentiate between "part of the loaded
kernel image" vs. a dynamic allocation (which would include a loaded
module image). And given that it *might* return true for RO data
(depending on the arch linker script, presumably), I think it would be
safe to include .bss -- clearly, with that caveat in place, it isn't
promising strict section semantics.
There are only two existing in-tree consumers:
1. __register_ftrace_function() [kernel/trace/ftrace.c] -- Sets
FTRACE_OPS_FL_DYNAMIC if core_kernel_data(ops) returns false, which
denotes "dynamically allocated ftrace_ops which need special care". It
would be unlikely (if not impossible) for the "ops" object in question
to be all-zero and end up in the .bss, but if it were, then the
current behavior would be wrong. IOW, it would be more correct to
include .bss.
2. ensure_safe_net_sysctl() [net/sysctl_net.c] (The subject of this
thread) -- Trying to distinguish "global kernel data" (static/global
variables) from kmalloc-allocated objects. More correct to include
.bss.
Both of these callers only seem to delineate between static and
dynamic object allocations. Put another way, if core_kernel_bss(), all
existing callers should be updated to check core_kernel_data() ||
core_kernel_bss().
Since Steven introduced it, and until I added
ensure_safe_net_sysctl(), he / tracing was the only consumer.
Thinking critically from the C language perspective, I can't come up
with any case where one would actually expect core_kernel_data() to
return true for 'int global = 1' and false for 'int global = 0'.
In conclusion, I agree with your alternative proposal Jakub, and I
think this patch is the right way forward:
diff --git a/kernel/extable.c b/kernel/extable.c
index b0ea5eb0c3b4..8b6f1d0bdaf6 100644
--- a/kernel/extable.c
+++ b/kernel/extable.c
@@ -97,6 +97,9 @@ int core_kernel_data(unsigned long addr)
if (addr >= (unsigned long)_sdata &&
addr < (unsigned long)_edata)
return 1;
+ if (addr >= (unsigned long)__bss_start &&
+ addr < (unsigned long)__bss_stop)
+ return 1;
return 0;
}
Jonathon Reinhart
Powered by blists - more mailing lists