[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <87o6zxmlha.fsf@email.froward.int.ebiederm.org>
Date: Thu, 23 Jan 2025 12:30:25 -0600
From: "Eric W. Biederman" <ebiederm@...ssion.com>
To: Joel Granados <joel.granados@...nel.org>
Cc: Wen Yang <wen.yang@...ux.dev>, Luis Chamberlain <mcgrof@...nel.org>,
Kees Cook <keescook@...omium.org>, Christian Brauner
<brauner@...nel.org>, Dave Young <dyoung@...hat.com>,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH v5] sysctl: simplify the min/max boundary check
"Eric W. Biederman" <ebiederm@...ssion.com> writes:
> Joel Granados <joel.granados@...nel.org> writes:
>
>> On Sun, Jan 19, 2025 at 10:59:21PM +0800, Wen Yang wrote:
>>>
>>>
>>> On 2025/1/16 17:37, Joel Granados wrote:
>>> > On Sun, Jan 05, 2025 at 11:28:53PM +0800, Wen Yang wrote:
>>> > > do_proc_dointvec_conv() used the default range of int type, while
>>> > > do_proc_dointvec_minmax_conv() additionally used "int * {min, max}" of
>>> > > struct do_proc_dointvec_minmax_conv_param, which are actually passed
>>> > > in table->extra{1,2} pointers.
>>> > >
>>> > > By changing the struct do_proc_dointvec_minmax_conv_param's
>>> > > "int * {min, max}" to "int {min, max}" and setting the min/max value
>>> > > via dereference the table->extra{1, 2}, those do_proc_dointvec_conv() and
>>> > > do_proc_dointvec_minmax_conv() functions can be merged into one and reduce
>>> > > code by one-third.
>>> > >
>>> > > Similar changes were also made for do_proc_douintvec_minmax_conv_param,
>>> > > do_proc_douintvec_conv() and do_proc_douintvec_minmax_conv().
>>> > >
>>> > Before moving forward, I need the motivation.
>>> >
>>> > Please answer this:
>>> > *Why* are you pushing these two [1], [2] series?
>>> >
>>> > Best
>>> >
>>> > [1]
>>> > * https://lore.kernel.org/tencent_A7FD9E9E75AA74A87583A427F5D20F988B09@qq.com
>>> > * https://lore.kernel.org/tencent_C5E6023F97E7CC2A046AAEA09BC9ACF43907@qq.com
>>> > * https://lore.kernel.org/cover.1726365007.git.wen.yang@linux.dev
>>> > * https://lore.kernel.org/cover.1726910671.git.wen.yang@linux.dev
>>> >
>>> > [2]
>>> > * https://lore.kernel.org/20240714173858.52163-1-wen.yang@linux.dev
>>> > * https://lore.kernel.org/20240715150223.54707-1-wen.yang@linux.dev
>>> > * https://lore.kernel.org/20240905134818.4104-1-wen.yang@linux.dev
>>> > * https://lore.kernel.org/20241201140058.5653-1-wen.yang@linux.dev
>>> >
>>>
>>> Thank you for your comments.
>>>
>>> Here is a brief report about it.
>>>
>>> The boundary check of sysctl uses .extra{1, 2} pointers, which need to point
>>> to constant variables (such as two_five_five, n_65535, ue_int_max, etc. that
>>> appear in multiple modules).
>>>
>>> We first try to stuff these variables into the shared const arrays (
>>> (sysctl_vals, sysctl_long_vals) to save memory, as shown in series 1.
>>>
>>> Thank Eric for pointing out:
>>> - You can still save a lot more by turning .extra1 and .extra2
>>> - into longs instead of keeping them as pointers and needing
>>> - constants to be pointed at somewhere.
>>>
>>> We have further found more detailed information about "kill the shared const
>>> array":
>>>
>>> https://lkml.kernel.org/87zgpte9o4.fsf@email.froward.int.ebiederm.org
>>>
>>> 67ff32289 ("proc_sysctl: update docs for __register_sysctl_table()")
>>
>> Hey Wen Yang
>>
>> Thx for that additional explanation.
>>
>> So I understand that you want to get rid of the sysctl_vals shared const
>> array by replacing the extra{1,2} from void* to long; all this to *save
>> memory*. Right (please correct me if I have misunderstood you)?
>>
>> I don't think that by doing this replacement you will save much memory
>> (if any). And this is why:
>> 1. The long and the void* are most likely (depending on arch?) the same
>> size.
>> 2. In [1] it is mentioned that, we would still need an extra (void*) to
>> address the sysctl tables that are *NOT* using extra{1,2} as min max.
>> This means that we need a bigger ctl_table (long extra1, long extra2
>> and void* extra). We will need *more* memory?
>>
>> I would like to be proven wrong. So this is my proposal: Instead of
>> trying to do an incremental change, I suggest you remove the sysctl_vals
>> shared const array and measure how much memory you actually save. You
>> can use the ./scripts/bloat-o-meter in the linux kernel source and
>> follow something similar to what we did in [2] to measure how much
>> memory we are actually talking about.
>>
>> Once you get a hard number, then we can move forward on the memory
>> saving front.
>
> When I originally suggested this my motivation had nothing to do with
> memory. The sysctl_vals memory array is type unsafe and has actively
> lead to real world bugs. AKA longs and int confusion. One example is
> that SYSCTL_ZERO does not properly work as a minimum to
> proc_do_ulongvec_minmax.
>
> Frankly those SYSCTL_XXX macros that use sysctl_vals are just plain
> scary to work with.
>
> So I suggested please making everything simpler by putting unsigned long
> min and max in to struct ctl_table and then getting rid of extra1 and
> extra2. As extra1 and extra2 are almost exclusively used to implement
> min and max.
>
> The size would not change but the chance of making a hard to find and
> understand mistake would go down dramatically.
Oh. I forgot to add it was also about the usability of the SYSCTL_XXX
macros.
As it is now whenever there is a new interesting value you have to add
another SYSCTL_XXX macro, and place your value in sysctl_vals.
Compared to just having a min or a max value in struct ctl_table it
is much more work.
Where size comes into my original suggestion is that because extra1 and
extra2 can and probably should go away, adding min and max members to
struct ctl_table can be done at no cost in size.
Eric
Powered by blists - more mailing lists