[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <i5h3sxl34d5pddluwxfhlumkt5fatin3rsqbwpfcm2rceg46ix@w3c2l6ntu2ye>
Date: Wed, 22 Jan 2025 13:57:10 +0100
From: Joel Granados <joel.granados@...nel.org>
To: Wen Yang <wen.yang@...ux.dev>
Cc: "Eric W . Biederman" <ebiederm@...ssion.com>,
Luis Chamberlain <mcgrof@...nel.org>, Kees Cook <keescook@...omium.org>,
Christian Brauner <brauner@...nel.org>, Dave Young <dyoung@...hat.com>, linux-kernel@...r.kernel.org
Subject: Re: Re: [PATCH v5] sysctl: simplify the min/max boundary check
On Sun, Jan 19, 2025 at 10:59:21PM +0800, Wen Yang wrote:
>
>
> On 2025/1/16 17:37, Joel Granados wrote:
> > On Sun, Jan 05, 2025 at 11:28:53PM +0800, Wen Yang wrote:
> > > do_proc_dointvec_conv() used the default range of int type, while
> > > do_proc_dointvec_minmax_conv() additionally used "int * {min, max}" of
> > > struct do_proc_dointvec_minmax_conv_param, which are actually passed
> > > in table->extra{1,2} pointers.
> > >
> > > By changing the struct do_proc_dointvec_minmax_conv_param's
> > > "int * {min, max}" to "int {min, max}" and setting the min/max value
> > > via dereference the table->extra{1, 2}, those do_proc_dointvec_conv() and
> > > do_proc_dointvec_minmax_conv() functions can be merged into one and reduce
> > > code by one-third.
> > >
> > > Similar changes were also made for do_proc_douintvec_minmax_conv_param,
> > > do_proc_douintvec_conv() and do_proc_douintvec_minmax_conv().
> > >
> > Before moving forward, I need the motivation.
> >
> > Please answer this:
> > *Why* are you pushing these two [1], [2] series?
> >
> > Best
> >
> > [1]
> > * https://lore.kernel.org/tencent_A7FD9E9E75AA74A87583A427F5D20F988B09@qq.com
> > * https://lore.kernel.org/tencent_C5E6023F97E7CC2A046AAEA09BC9ACF43907@qq.com
> > * https://lore.kernel.org/cover.1726365007.git.wen.yang@linux.dev
> > * https://lore.kernel.org/cover.1726910671.git.wen.yang@linux.dev
> >
> > [2]
> > * https://lore.kernel.org/20240714173858.52163-1-wen.yang@linux.dev
> > * https://lore.kernel.org/20240715150223.54707-1-wen.yang@linux.dev
> > * https://lore.kernel.org/20240905134818.4104-1-wen.yang@linux.dev
> > * https://lore.kernel.org/20241201140058.5653-1-wen.yang@linux.dev
> >
>
> Thank you for your comments.
>
> Here is a brief report about it.
>
> The boundary check of sysctl uses .extra{1, 2} pointers, which need to point
> to constant variables (such as two_five_five, n_65535, ue_int_max, etc. that
> appear in multiple modules).
>
> We first try to stuff these variables into the shared const arrays (
> (sysctl_vals, sysctl_long_vals) to save memory, as shown in series 1.
>
> Thank Eric for pointing out:
> - You can still save a lot more by turning .extra1 and .extra2
> - into longs instead of keeping them as pointers and needing
> - constants to be pointed at somewhere.
>
> We have further found more detailed information about "kill the shared const
> array":
>
> https://lkml.kernel.org/87zgpte9o4.fsf@email.froward.int.ebiederm.org
>
> 67ff32289 ("proc_sysctl: update docs for __register_sysctl_table()")
Hey Wen Yang
Thx for that additional explanation.
So I understand that you want to get rid of the sysctl_vals shared const
array by replacing the extra{1,2} from void* to long; all this to *save
memory*. Right (please correct me if I have misunderstood you)?
I don't think that by doing this replacement you will save much memory
(if any). And this is why:
1. The long and the void* are most likely (depending on arch?) the same
size.
2. In [1] it is mentioned that, we would still need an extra (void*) to
address the sysctl tables that are *NOT* using extra{1,2} as min max.
This means that we need a bigger ctl_table (long extra1, long extra2
and void* extra). We will need *more* memory?
I would like to be proven wrong. So this is my proposal: Instead of
trying to do an incremental change, I suggest you remove the sysctl_vals
shared const array and measure how much memory you actually save. You
can use the ./scripts/bloat-o-meter in the linux kernel source and
follow something similar to what we did in [2] to measure how much
memory we are actually talking about.
Once you get a hard number, then we can move forward on the memory
saving front.
Best
[1] https://lkml.kernel.org/87zgpte9o4.fsf@email.froward.int.ebiederm.org
[2] https://lore.kernel.org/all/20240501-jag-sysctl_remset_net-v6-0-370b702b6b4a@samsung.com/
--
Joel Granados
Powered by blists - more mailing lists