linux-kernel - Re: [PATCH v2 2/2] module: Merge same-name module load requests

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <94cd5565-1058-2c97-57bb-0ddf12416cd6@redhat.com>
Date:   Mon, 28 Nov 2022 11:29:48 -0500
From:   Prarit Bhargava <prarit@...hat.com>
To:     David Hildenbrand <david@...hat.com>,
        Luis Chamberlain <mcgrof@...nel.org>
Cc:     pmladek@...e.com, Petr Pavlu <petr.pavlu@...e.com>,
        linux-modules@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 2/2] module: Merge same-name module load requests

On 11/14/22 10:45, David Hildenbrand wrote:
> On 14.11.22 16:38, Luis Chamberlain wrote:
>> On Mon, Nov 14, 2022 at 09:57:56AM +0100, David Hildenbrand wrote:
>>> On 12.11.22 02:47, Luis Chamberlain wrote:
>>>> On Wed, Oct 19, 2022 at 02:00:55PM +0200, Petr Pavlu wrote:
>>>>> On 10/18/22 20:33, Luis Chamberlain wrote:
>>>>>> On Sat, Oct 15, 2022 at 11:27:10AM +0200, Petr Pavlu wrote:
>>>>>>> The patch does address a regression observed after commit 
>>>>>>> 6e6de3dee51a
>>>>>>> ("kernel/module.c: Only return -EEXIST for modules that have 
>>>>>>> finished
>>>>>>> loading"). I guess it can have a Fixes tag added to the patch.
>>>>>>>
>>>>>>> I think it is hard to split this patch into parts because the 
>>>>>>> implemented
>>>>>>> "optimization" is the fix.
>>>>>>
>>>>>> git describe --contains 6e6de3dee51a
>>>>>> v5.3-rc1~38^2~6
>>>>>>
>>>>>> I'm a bit torn about this situation. Reverting 6e6de3dee51a would 
>>>>>> be the
>>>>>> right thing to do, but without it, it still leaves the issue reported
>>>>>> by Prarit Bhargava. We need a way to resolve the issue on stable and
>>>>>> then your optimizations can be applied on top.
>>>>>
>>>>> Simpler could be to do the following:
>>>>>
>>>>> diff --git a/kernel/module/main.c b/kernel/module/main.c
>>>>> index d02d39c7174e..0302ac387e93 100644
>>>>> --- a/kernel/module/main.c
>>>>> +++ b/kernel/module/main.c
>>>>> @@ -2386,7 +2386,8 @@ static bool finished_loading(const char *name)
>>>>>        sched_annotate_sleep();
>>>>>        mutex_lock(&module_mutex);
>>>>>        mod = find_module_all(name, strlen(name), true);
>>>>> -    ret = !mod || mod->state == MODULE_STATE_LIVE;
>>>>> +    ret = !mod || mod->state == MODULE_STATE_LIVE
>>>>> +        || mod->state == MODULE_STATE_GOING;
>>>>>        mutex_unlock(&module_mutex);
>>>>>        return ret;
>>>>> @@ -2566,7 +2567,8 @@ static int add_unformed_module(struct module 
>>>>> *mod)
>>>>>        mutex_lock(&module_mutex);
>>>>>        old = find_module_all(mod->name, strlen(mod->name), true);
>>>>>        if (old != NULL) {
>>>>> -        if (old->state != MODULE_STATE_LIVE) {
>>>>> +        if (old->state == MODULE_STATE_COMING
>>>>> +            || old->state == MODULE_STATE_UNFORMED) {
>>>>>                /* Wait in case it fails to load. */
>>>>>                mutex_unlock(&module_mutex);
>>>>>                err = wait_event_interruptible(module_wq,
>>>>> @@ -2575,7 +2577,7 @@ static int add_unformed_module(struct module 
>>>>> *mod)
>>>>>                    goto out_unlocked;
>>>>>                goto again;
>>>>>            }
>>>>> -        err = -EEXIST;
>>>>> +        err = old->state != MODULE_STATE_LIVE ? -EBUSY : -EEXIST;
>>>>>            goto out;
>>>>>        }
>>>>>        mod_update_bounds(mod);
>>>>
>>>>
>>>> Prarit, can you verify this still does not break the issue you 
>>>> reported?
>>>> David, does this also fix your issue?
>>>
>>> I didn't try, but from a quick glimpse I assume no. Allocating module 
>>> space
>>> happens before handling eventual duplicates right now, before a 
>>> module even
>>> is "alive" and in the MODULE_STATE_UNFORMED state.
>>
>> The first two hunks are a revert of commit 6e6de3dee51a and I'm under
>> the impression that cauased your issues as *more* modules states are
>> allowed through.
>>
>> The last hunk tries to fix what 6e6de3dee51a wanted to do.
>>
> 
> Note that I don't think the issue I raised is due to 6e6de3dee51a.
> 
>>> But maybe I am missing something important.
>>
>> Please do test if you can.
> 
> I don't have the machine at hand right now. But, again, I doubt this 
> will fix it.
> 
> 
> The flow is in load_module():
> 
>      mod = layout_and_allocate(info, flags);
>      if (IS_ERR(mod)) {
>          ...
>      }
> 
>      audit_log_kern_module(mod->name);
> 
>      /* Reserve our place in the list. */
>      err = add_unformed_module(mod);
>      if (err)
>          goto free_module;
> 
> 
> You can have 400 threads in layout_and_allocate() loading the same 
> module at the same time and running out of module space. Any changes to 
> add_unformed_module() and finished_loading() won't change that, because 
> they are not involved before the module space allocations happened.
> 

I'd like to see a refreshed patch but I tested the latest version and 
see that the boot time is LONGER with the change

Before:

[11:17 AM root@...el-eaglestream-spr-15 kernel-ark]# systemd-analyze
Startup finished in 55.418s (firmware) + 22.766s (loader) + 35.856s 
(kernel) + 5.830s (initrd) + 15.671s (userspace) = 2min 15.542s
multi-user.target reached after 15.606s in userspace.

After:

Startup finished in 55.314s (firmware) + 23.033s (loader) + 35.331s 
(kernel) + 5.176s (initrd) + 23.465s (userspace) = 2min 22.320s
multi-user.target reached after 23.093s in userspace.

Subsequent reboots also indicate that userspace boot time is longer 
after the change.

P.