[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20250225172445.GA28595@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net>
Date: Tue, 25 Feb 2025 09:24:45 -0800
From: Shyam Saini <shyamsaini@...ux.microsoft.com>
To: Petr Pavlu <petr.pavlu@...e.com>
Cc: Rasmus Villemoes <linux@...musvillemoes.dk>,
linux-kernel@...r.kernel.org, linux-modules@...r.kernel.org,
code@...icks.com, christophe.leroy@...roup.eu, hch@...radead.org,
mcgrof@...nel.org, frkaya@...ux.microsoft.com,
vijayb@...ux.microsoft.com, linux@...ssschuh.net,
samitolvanen@...gle.com, da.gomez@...sung.com,
gregkh@...uxfoundation.org, rafael@...nel.org, dakr@...nel.org
Subject: Re: [PATCH v3 2/4] kernel: refactor lookup_or_create_module_kobject()
Hi Petr,
Thank you for the reviews
On Tue, Feb 25, 2025 at 09:33:10AM +0100, Petr Pavlu wrote:
> On 2/21/25 11:42, Rasmus Villemoes wrote:
> > On Thu, Feb 13 2025, Petr Pavlu <petr.pavlu@...e.com> wrote:
> >
> >> On 2/11/25 22:48, Shyam Saini wrote:
> >>> In the unlikely event of the allocation failing, it is better to let
> >>> the machine boot with a not fully populated sysfs than to kill it with
> >>> this BUG_ON(). All callers are already prepared for
> >>> lookup_or_create_module_kobject() returning NULL.
> >>>
> >>> This is also preparation for calling this function from non __init
> >>> code, where using BUG_ON for allocation failure handling is not
> >>> acceptable.
> >>
> >> I think some error reporting should be cleaned up here.
> >>
> >> The current situation is that locate_module_kobject() can fail in
> >> several cases and all these situations are loudly reported by the
> >> function, either by BUG_ON() or pr_crit(). Consistently with that, both
> >> its current callers version_sysfs_builtin() and kernel_add_sysfs_param()
> >> don't do any reporting if locate_module_kobject() fails; they simply
> >> return.
> >>
> >> The series seems to introduce two somewhat suboptimal cases.
> >>
> >> With this patch, when either version_sysfs_builtin() or
> >> kernel_add_sysfs_param() calls lookup_or_create_module_kobject() and it
> >> fails because of a potential kzalloc() error, the problem is silently
> >> ignored.
> >>
> >
> > No, because (IIUC) when a basic allocation via kzalloc fails, the kernel
> > complains loudly already; there's no reason for every caller of
> > kmalloc() and friends to add their own pr_err("kmalloc failed"), that
> > just bloats the kernel .text.
>
> I wasn't suggesting to log a kmalloc() error specifically. The idea was
> that if lookup_or_create_module_kobject() fails for whatever reason,
> kernel_add_sysfs_param() and version_sysfs_builtin() should log the
> failure to create/get the kobject.
>
> >
> >> Similarly, in the patch #4, when module_add_driver() calls
> >> lookup_or_create_module_kobject() and the function fails, the problem
> >> may or may not be reported, depending on the error.
> >>
> >> I'd suggest something as follows:
> >> * Drop the pr_crit() reporting in lookup_or_create_module_kobject().
> >
> > No, because that reports on something other than an allocation failing
> > (of course, it could be that the reason kobject_init_and_add or
> > sysfs_create_file failed was an allocation failure, but it could
> > also be something else), so reporting there is the right thing to do.
>
> The error message says:
> pr_crit("Adding module '%s' to sysfs failed (%d), the system may be unstable.\n", name, err);
>
> I think it was appropriate for locate_module_kobject() to do this error
> reporting when the function was only called by code in kernel/params.c
> and in the boot context. Now that the patch makes the function available
> to external callers, I'm not sure if it should remain reporting this
> error.
>
> The function itself shouldn't directly make the system unstable when it
> fails. Each caller can appropriately determine how to handle the error.
> Functions kernel_add_sysfs_param() and version_sysfs_builtin() don't
> have much of an option and can only log it, but module_add_driver()
> could roll back its operation.
>
before this patch series [1] kset_find_obj() was called in module_add_driver()
and the function immediately returned when no valid module_kobject was found,
module_add_driver returns immediately if some error is encountered or module_kobject
is not found in lookup_or_create_module_kobject()
Since module_kobject() is anyway no-op if it [2] returns early so it shouldn't require
any rollback, right?
Assuming rollback is not required for module_add_driver() in lookup_or_create_module_kobject()
failure scenario, what should be the appropriate messaging from pr_crit() if it
fails in module_add_driver()?
[1] https://elixir.bootlin.com/linux/v6.14-rc4/source/drivers/base/module.c#L48
[2] https://elixir.bootlin.com/linux/v6.14-rc4/source/drivers/base/module.c#L58
Thanks,
Shyam
Powered by blists - more mailing lists