[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <957f038b-4d68-468c-9e14-9c38f1ed6a2f@suse.com>
Date: Thu, 27 Feb 2025 14:08:31 +0100
From: Petr Pavlu <petr.pavlu@...e.com>
To: Shyam Saini <shyamsaini@...ux.microsoft.com>
Cc: Rasmus Villemoes <linux@...musvillemoes.dk>,
linux-kernel@...r.kernel.org, linux-modules@...r.kernel.org,
code@...icks.com, christophe.leroy@...roup.eu, hch@...radead.org,
mcgrof@...nel.org, frkaya@...ux.microsoft.com, vijayb@...ux.microsoft.com,
linux@...ssschuh.net, samitolvanen@...gle.com, da.gomez@...sung.com,
gregkh@...uxfoundation.org, rafael@...nel.org, dakr@...nel.org
Subject: Re: [PATCH v3 2/4] kernel: refactor lookup_or_create_module_kobject()
On 2/25/25 18:24, Shyam Saini wrote:
> On Tue, Feb 25, 2025 at 09:33:10AM +0100, Petr Pavlu wrote:
>> On 2/21/25 11:42, Rasmus Villemoes wrote:
>>> On Thu, Feb 13 2025, Petr Pavlu <petr.pavlu@...e.com> wrote:
>>>
>>>> On 2/11/25 22:48, Shyam Saini wrote:
>>>>> In the unlikely event of the allocation failing, it is better to let
>>>>> the machine boot with a not fully populated sysfs than to kill it with
>>>>> this BUG_ON(). All callers are already prepared for
>>>>> lookup_or_create_module_kobject() returning NULL.
>>>>>
>>>>> This is also preparation for calling this function from non __init
>>>>> code, where using BUG_ON for allocation failure handling is not
>>>>> acceptable.
>>>>
>>>> I think some error reporting should be cleaned up here.
>>>>
>>>> The current situation is that locate_module_kobject() can fail in
>>>> several cases and all these situations are loudly reported by the
>>>> function, either by BUG_ON() or pr_crit(). Consistently with that, both
>>>> its current callers version_sysfs_builtin() and kernel_add_sysfs_param()
>>>> don't do any reporting if locate_module_kobject() fails; they simply
>>>> return.
>>>>
>>>> The series seems to introduce two somewhat suboptimal cases.
>>>>
>>>> With this patch, when either version_sysfs_builtin() or
>>>> kernel_add_sysfs_param() calls lookup_or_create_module_kobject() and it
>>>> fails because of a potential kzalloc() error, the problem is silently
>>>> ignored.
>>>>
>>>
>>> No, because (IIUC) when a basic allocation via kzalloc fails, the kernel
>>> complains loudly already; there's no reason for every caller of
>>> kmalloc() and friends to add their own pr_err("kmalloc failed"), that
>>> just bloats the kernel .text.
>>
>> I wasn't suggesting to log a kmalloc() error specifically. The idea was
>> that if lookup_or_create_module_kobject() fails for whatever reason,
>> kernel_add_sysfs_param() and version_sysfs_builtin() should log the
>> failure to create/get the kobject.
>>
>>>
>>>> Similarly, in the patch #4, when module_add_driver() calls
>>>> lookup_or_create_module_kobject() and the function fails, the problem
>>>> may or may not be reported, depending on the error.
>>>>
>>>> I'd suggest something as follows:
>>>> * Drop the pr_crit() reporting in lookup_or_create_module_kobject().
>>>
>>> No, because that reports on something other than an allocation failing
>>> (of course, it could be that the reason kobject_init_and_add or
>>> sysfs_create_file failed was an allocation failure, but it could
>>> also be something else), so reporting there is the right thing to do.
>>
>> The error message says:
>> pr_crit("Adding module '%s' to sysfs failed (%d), the system may be unstable.\n", name, err);
>>
>> I think it was appropriate for locate_module_kobject() to do this error
>> reporting when the function was only called by code in kernel/params.c
>> and in the boot context. Now that the patch makes the function available
>> to external callers, I'm not sure if it should remain reporting this
>> error.
>>
>> The function itself shouldn't directly make the system unstable when it
>> fails. Each caller can appropriately determine how to handle the error.
>> Functions kernel_add_sysfs_param() and version_sysfs_builtin() don't
>> have much of an option and can only log it, but module_add_driver()
>> could roll back its operation.
>>
>
> before this patch series [1] kset_find_obj() was called in module_add_driver()
> and the function immediately returned when no valid module_kobject was found,
>
> module_add_driver returns immediately if some error is encountered or module_kobject
> is not found in lookup_or_create_module_kobject()
> Since module_kobject() is anyway no-op if it [2] returns early so it shouldn't require
> any rollback, right?
>
> Assuming rollback is not required for module_add_driver() in lookup_or_create_module_kobject()
> failure scenario, what should be the appropriate messaging from pr_crit() if it
> fails in module_add_driver()?
>
> [1] https://elixir.bootlin.com/linux/v6.14-rc4/source/drivers/base/module.c#L48
> [2] https://elixir.bootlin.com/linux/v6.14-rc4/source/drivers/base/module.c#L58
Sorry, I partly misunderstood the different context in which
module_add_driver() is called. I still think my suggestion makes some
sense, but looking again, the current version seems ok to me too.
--
Thanks,
Petr
Powered by blists - more mailing lists