[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <e861fc51-f244-4645-af72-56416a422060@linux.dev>
Date: Thu, 30 Jan 2025 22:32:14 +0800
From: Wen Yang <wen.yang@...ux.dev>
To: "Eric W. Biederman" <ebiederm@...ssion.com>,
Joel Granados <joel.granados@...nel.org>
Cc: Luis Chamberlain <mcgrof@...nel.org>, Kees Cook <keescook@...omium.org>,
Christian Brauner <brauner@...nel.org>, Dave Young <dyoung@...hat.com>,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH v5] sysctl: simplify the min/max boundary check
On 2025/1/28 01:51, Eric W. Biederman wrote:
> Joel Granados <joel.granados@...nel.org> writes:
>
>> On Thu, Jan 23, 2025 at 12:30:25PM -0600, Eric W. Biederman wrote:
>>> "Eric W. Biederman" <ebiederm@...ssion.com> writes:
>>>
>>>> Joel Granados <joel.granados@...nel.org> writes:
>>>>
>>>>> On Sun, Jan 19, 2025 at 10:59:21PM +0800, Wen Yang wrote:
>>>>>>
>>>>>>
>>>>>> On 2025/1/16 17:37, Joel Granados wrote:
>>>>>>> On Sun, Jan 05, 2025 at 11:28:53PM +0800, Wen Yang wrote:
>>>>>>>> do_proc_dointvec_conv() used the default range of int type, while
>>>>>>>> do_proc_dointvec_minmax_conv() additionally used "int * {min, max}" of
>>>>>>>> struct do_proc_dointvec_minmax_conv_param, which are actually passed
>>>>>>>> in table->extra{1,2} pointers.
>> ...
>>>>> (if any). And this is why:
>>>>> 1. The long and the void* are most likely (depending on arch?) the same
>>>>> size.
>>>>> 2. In [1] it is mentioned that, we would still need an extra (void*) to
>>>>> address the sysctl tables that are *NOT* using extra{1,2} as min max.
>>>>> This means that we need a bigger ctl_table (long extra1, long extra2
>>>>> and void* extra). We will need *more* memory?
>>>>>
>>>>> I would like to be proven wrong. So this is my proposal: Instead of
>>>>> trying to do an incremental change, I suggest you remove the sysctl_vals
>>>>> shared const array and measure how much memory you actually save. You
>>>>> can use the ./scripts/bloat-o-meter in the linux kernel source and
>>>>> follow something similar to what we did in [2] to measure how much
>>>>> memory we are actually talking about.
>>>>>
>>>>> Once you get a hard number, then we can move forward on the memory
>>>>> saving front.
>>
>> Hey Eric.
>>
>> Thx for the clarification. Much appreciated.
>>>>
>>>> When I originally suggested this my motivation had nothing to do with memory
>> That makes a *lot* of sense :).
>>
>>>> The sysctl_vals memory array is type unsafe and has actively
>> Here I understand that they are unsafe because of Integer promotion
>> issues exacerbated by the void* variables extra{1,2}. Please correct me
>> If I missed the point.
>
> Not precisely. It is because the (void *) pointers are silently cast to
> either (int *) or (long *) pointers. So for example passing SYSCTL_ZERO
> to proc_do_ulongvec_minmax results in reading sysctl_vals[0] and
> sysctl_vals[1] and to get the long value. Since sysctl_vals[1] is 1
> a 0 is not accepted because 0 is below the minimum.
>
> The minimum value that is accepted depends on which architecture you are
> on. On x86_64 and other little endian architectures the minimum value
> accepted is 0x0000000100000000. On big endian architectures like mips64
> the minimum value accepted winds up being 0x0000000000000001. Or do I
> have that backwards?
>
> It doesn't matter because neither case is what the programmer expected.
> Further it means that keeping the current proc_do_ulongvec_minmax and
> proc_do_int_minmax methods that it is impossible to define any of the
> SYSCTL_XXX macros except SYSCTL_ZERO that will work with both methods.
>
>> There is also the fact that you can just do a `extra1 = &sysctl_vals[15]`
>> and the compiler will not bark at you. At least It let me do that on my
>> side.
>
> All of which in the simplest for has me think the SYSCTL_XXX cleanups
> were a step in the wrong direction.
>
>>>> lead to real world bugs. AKA longs and int confusion. One example is
>>>> that SYSCTL_ZERO does not properly work as a minimum to
>>>> proc_do_ulongvec_minmax.
>> That is a great example.
>>
>>>>
>>>> Frankly those SYSCTL_XXX macros that use sysctl_vals are just plain
>>>> scary to work with.
>> I share your feeling :)
>>
>>>>
>>>> So I suggested please making everything simpler by putting unsigned long
>>>> min and max in to struct ctl_table and then getting rid of extra1 and
>>>> extra2. As extra1 and extra2 are almost exclusively used to implement
>>>> min and max.
>> Explicitly specifying the type will help reduce the "unsefeness" but
>> with all the ways that there are of using these pointers, I think we
>> need to think bigger and maybe try to find a more typesafe way to
>> represent all the interactions.
>>
>> It has always struck me as strange the arbitrariness of having 2 extra
>> pointers. Why not just one?
>
> Which would be the void *data pointer.
>
>> At the end it is a pointer and can point to
>> a struct that holds min, max... I do not have the answer yet, but I
>> think what you propose here is part of a bigger refactoring needed in
>> ctl_table structure. Would like to hear your thought on it if you have
>> any.
>
> One of the things that happens and that is worth acknowledging is there
> is code that wraps proc_doulongvec_minmax and proc_dointvec_minmax.
> Having the minmax information separate from the data pointer makes that
> wrapping easier.
>
> Further the min/max information is typically separate from other kinds
> of data. So even when not wrapped it is nice just to take a quick
> glance and see what the minimums and maximums are.
>
> My original suggest was that we change struct ctl_table from:
>
>> /* A sysctl table is an array of struct ctl_table: */
>> struct ctl_table {
>> const char *procname; /* Text ID for /proc/sys */
>> void *data;
>> int maxlen;
>> umode_t mode;
>> proc_handler *proc_handler; /* Callback for text formatting */
>> struct ctl_table_poll *poll;
>> void *extra1;
>> void *extra2;
>> } __randomize_layout;
>
> to:
>
>> /* A sysctl table is an array of struct ctl_table: */
>> struct ctl_table {
>> const char *procname; /* Text ID for /proc/sys */
>> void *data;
>> int maxlen;
>> umode_t mode;
>> proc_handler *proc_handler; /* Callback for text formatting */
>> struct ctl_table_poll *poll;
>> unsigned long min;
>> unsigned long max;
>> } __randomize_layout;
>
> That is just replace extra1 and extra2 with min and max members. The
> members don't have any reason to be pointers. Without being pointers
> the min/max functions can just use long values to cap either ints or
> longs, and there is no room for error. The integer promotion rules
> will ensure that even negative values can be stored in unsigned long
> min and max values successfully. Plus it is all bog standard C
> so there is nothing special to learn.
>
> There are a bunch of fiddly little details to transition from where we
> are today. The most straightforward way I can see of making the
> transition is to add the min and max members. Come up with replacements
> for proc_doulongvec_minmax and proc_dointvec_minmax that read the new
> min and max members. Update all of the users. Update the few users
> that use extra1 or extra2 for something besides min and max. Then
> remove extra1 and extra2. At the end it is simpler and requires the
> same or a little less space.
>
> That was and remains my suggestion.
>
Thanks for your valuable suggestions. We will continue to move forward
along it and need your more guidance.
But there are also a few codes that do take the extra{1, 2} as pointers,
for example:
int neigh_sysctl_register(struct net_device *dev, struct neigh_parms *p,
proc_handler *handler)
{
...
for (i = 0; i < NEIGH_VAR_GC_INTERVAL; i++) {
t->neigh_vars[i].data += (long) p;
t->neigh_vars[i].extra1 = dev;
t->neigh_vars[i].extra2 = p;
}
...
}
static void neigh_proc_update(const struct ctl_table *ctl, int write)
{
struct net_device *dev = ctl->extra1;
struct neigh_parms *p = ctl->extra2;
struct net *net = neigh_parms_net(p);
int index = (int *) ctl->data - p->data;
...
}
So could we modify it in this way to make it compatible with these two
situations:
@@ -137,8 +137,16 @@ struct ctl_table {
umode_t mode;
proc_handler *proc_handler; /* Callback for text formatting */
struct ctl_table_poll *poll;
- void *extra1;
- void *extra2;
+ union {
+ struct {
+ void *extra1;
+ void *extra2;
+ };
+ struct {
+ unsigned long min;
+ unsigned long max;
+ };
+ };
} __randomize_layout;
--
Best wishes,
Wen
Powered by blists - more mailing lists