Message-ID: <1263c092-d4bc-4d8f-8ef8-2706d337f4c2@gmail.com>
Date: Sat, 18 Oct 2025 17:52:01 +0100
From: Mehdi Ben Hadj Khelifa <mehdi.benhadjkhelifa@...il.com>
To: Shuah Khan <skhan@...uxfoundation.org>, akpm@...ux-foundation.org
Cc: linux-kernel@...r.kernel.org, david.hunter.linux@...il.com,
linux-kernel-mentees@...ts.linuxfoundation.org, khalid@...nel.org
Subject: Re: [PATCH] lib: cpu_rmap.c Refactor allocation size calculation in
kzalloc()
On 10/10/25 6:00 PM, Shuah Khan wrote:
> On 10/9/25 09:16, Mehdi Ben Hadj Khelifa wrote:
>> On 10/7/25 11:23 PM, Shuah Khan wrote:
>>
>>>
>>> How did you find this problem and how did you test this change?
>
> Bummer - you trimmed the code entirely from the thread. Next time
> leave it in for context for the discussion.
>
Ah, I saw in other LKML threads that some people do delete the code, so I
thought it was okay. Will do next time.

>> For the first part of your question, after simply referring to the
>> deprecated documentation[1] which states the following:
>
> Looks like you forgot to add the link to the deprecated documentation[1].
> It sounds like this is a potential problem without a reproducer.
> These types of problems made to a critical piece of code require
> substantial testing.
>
Ack. This is the doc that I was referencing:
https://docs.kernel.org/process/deprecated.html
I'm not sure what exactly is required for substantial testing. My guess
was to do normal testing as I mentioned, add some fault injection to
test the change in case of failure, and also compare dmesg outputs. I have
also run the selftests for the net subsystem since my last mail, with no
sign of regression. Any suggestions on what testing for this case should
look like, instead of or on top of what I did?

>> 'For other calculations, please compose the use of the size_mul(),
>> size_add(), and size_sub() helpers'
>> Which is about dynamic calculations made inside of kzalloc() and
>> kmalloc(). Specifically, the quoted part is talking about calculations
>> which can't be simply divided into two parameters referring to the
>> number of elements and size per element and in cases where we can't
>> use struct_size() too. After that it was a matter of finding code where
>> that could be a problem, which is the case for the changed code.
>>
>> For the second part, as with any patch, I make a copy of all dmesg
>> warnings, errors, and critical messages, then I compile, install, and
>> boot the new kernel, then check whether there is any change or
>> regression in dmesg.
>
> This is a basic boot test which isn't sufficient in this case.
>
>> For this particular change, since it doesn't have any selftests
>> because it's in a utility library (in my case cpu_rmap is used in
>> the networking subsystem), I did some fault injection with a custom
>> module to test whether, in case of overflow, it fails safely, reporting
>> the issue in dmesg, which is caught by the __alloc_frozen_pages_noprof()
>> function in mm/page_alloc.c, and also returns NULL for rmap instead of
>> wrapping to a smaller size.
>
> Custom module testing doesn't test this change in a wider scope
> which is necessary when you are making changes such as these
> without a reproducer and a way to reproduce. How do you know
> this change doesn't introduce regressions?
>
My custom module testing specifically tested the change in the failure
case, which is what the change is for in the first place. The change,
which the documentation treats as simple since we are just wrapping the
calculations in helpers instead of using operators, is only meant to
safeguard calculations that are made inside of kzalloc() so that no
unwanted behavior is produced, i.e. in case of overflow. As I mentioned
above, I tested for regressions by running the selftests for the net
subsystem, which showed no regressions, on top of the fault injection
mentioned.
I would like to have more guidance on what I could do to have more
robust testing in this case.

> thanks,
> -- Shuah
Regards,
Mehdi Ben Hadj Khelifa