Message-ID: <CAHk-=wgWCDw58fZDLGYVqVC2ee-Zec25unewdHFp8syCZFumvg@mail.gmail.com>
Date: Thu, 25 May 2023 11:50:21 -0700
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: Luis Chamberlain <mcgrof@...nel.org>
Cc: Linux FS Devel <linux-fsdevel@...r.kernel.org>, hch@....de,
brauner@...nel.org, david@...hat.com, tglx@...utronix.de,
patches@...ts.linux.dev, linux-modules@...r.kernel.org,
linux-mm@...ck.org, linux-kernel@...r.kernel.org, pmladek@...e.com,
petr.pavlu@...e.com, prarit@...hat.com, lennart@...ttering.net,
gregkh@...uxfoundation.org, rafael@...nel.org, song@...nel.org,
lucas.de.marchi@...il.com, lucas.demarchi@...el.com,
christophe.leroy@...roup.eu, peterz@...radead.org, rppt@...nel.org,
dave@...olabs.net, willy@...radead.org, vbabka@...e.cz,
mhocko@...e.com, dave.hansen@...ux.intel.com,
colin.i.king@...il.com, jim.cromie@...il.com,
catalin.marinas@....com, jbaron@...mai.com,
rick.p.edgecombe@...el.com, yujie.liu@...el.com
Subject: Re: [PATCH 1/2] fs/kernel_read_file: add support for duplicate detection
On Thu, May 25, 2023 at 11:08 AM Luis Chamberlain <mcgrof@...nel.org> wrote:
>
> Certainly on the track where I wish we could go. Now this goes tested.
> On 255 cores:
>
> Before:
>
> vagrant@...d ~ $ sudo systemd-analyze
> Startup finished in 41.653s (kernel) + 44.305s (userspace) = 1min 25.958s
> graphical.target reached after 44.178s in userspace.
>
> root@...d ~ # grep "Virtual mem wasted bytes" /sys/kernel/debug/modules/stats
> Virtual mem wasted bytes 1949006968
>
>
> ; 1949006968/1024/1024/1024
> ~1.81515418738126754761
>
> So ~1.8 GiB... of vmalloc space wasted during boot.
>
> After:
>
> systemd-analyze
> Startup finished in 24.438s (kernel) + 41.278s (userspace) = 1min 5.717s
> graphical.target reached after 41.154s in userspace.
>
> root@...d ~ # grep "Virtual mem wasted bytes" /sys/kernel/debug/modules/stats
> Virtual mem wasted bytes 354413398
>
> So still 337.99 MiB of vmalloc space wasted during boot due to
> duplicates.

Ok. I think this will count as 'good enough for mitigation purposes'.

> The reason is the exclusive_deny_write_access() must be
> kept during the life of the module otherwise as soon as it is done
> others can still race to load

Yes. The exclusion only applies while the file is actively being read.

> So with two other hunks added (2nd and 4th), this now matches parity with
> my patch, not suggesting this is right,

Yeah, we can't do that, because user space may quite validly want to
write the file afterwards.

Or, in fact, unload the module and re-load it.

So the "exclusion" really needs to be purely temporary.
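
To illustrate what "purely temporary" means here, the following is a minimal userspace sketch, not the kernel patch itself. It models a write count where a reader can take an exclusive deny-write only while nobody has the file open for writing, and where writers are admitted again as soon as the reader releases it. The struct `fake_inode`, all four function names, and the choice of -ETXTBSY as the failure code are assumptions made for this illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>

/*
 * Hypothetical stand-in for an inode's write count (illustration only):
 *   > 0   that many writers have the file open for writing
 *   == 0  no writers and no exclusion
 *   == -1 one reader holds the temporary exclusive deny-write
 */
struct fake_inode {
	atomic_int writecount;
};

/* Take the exclusion only if there are no writers and no other holder. */
static int try_exclusive_deny_write(struct fake_inode *inode)
{
	int expected = 0;

	if (atomic_compare_exchange_strong(&inode->writecount, &expected, -1))
		return 0;
	return -ETXTBSY;	/* assumed error code for this sketch */
}

/* Drop the exclusion; after this, writers and other readers may proceed. */
static void release_exclusive_deny_write(struct fake_inode *inode)
{
	int expected = -1;
	_Bool ok = atomic_compare_exchange_strong(&inode->writecount,
						  &expected, 0);

	assert(ok);	/* releasing without holding is a caller bug */
	(void)ok;
}

/* A writer is admitted only while no exclusion is held. */
static int try_get_write(struct fake_inode *inode)
{
	int old = atomic_load(&inode->writecount);

	do {
		if (old < 0)
			return -ETXTBSY;
	} while (!atomic_compare_exchange_weak(&inode->writecount,
					       &old, old + 1));
	return 0;
}

/* Writer is done with the file. */
static void put_write(struct fake_inode *inode)
{
	atomic_fetch_sub(&inode->writecount, 1);
}
```

In this model a second `try_exclusive_deny_write()` fails while the first holder is active, which is the duplicate-detection effect, and `try_get_write()` succeeds again right after `release_exclusive_deny_write()`, which is why the file can still be rewritten or the module reloaded afterwards.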

That said, I considered moving the exclusion to module/main.c itself,
rather than the reading part. That would get rid of the hacky "id ==
READING_MODULE", and put the exclusion in the place that actually
wants it.

And that would allow us to at least extend that temporary exclusion a
bit - we could keep it until the module has actually been loaded and
inited.

So it would probably improve on those numbers a bit more, but you'd
still have the fundamental race where *serial* duplicates end up
always wasting CPU effort and temporary vmalloc space.
Linus
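
The difference between concurrent and serial duplicates can be made concrete with a small toy model (an illustration added here, not code from the patch). Each load attempt either finds the exclusion held and is rejected cheaply, or finds it free and does the full read. The struct `load_attempt`, the function `count_full_reads()`, and the tick values are all hypothetical; attempts are assumed to be sorted by non-negative start time.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of one attempt to load the same module file. */
struct load_attempt {
	int start;	/* tick at which the attempt begins (>= 0) */
	int hold;	/* ticks the exclusion is held once acquired */
};

/*
 * Count how many attempts end up doing the full read.
 * Assumes the array is sorted by start time.
 */
static size_t count_full_reads(const struct load_attempt *a, size_t n)
{
	size_t full = 0;
	int busy_until = 0;	/* exclusion is held for ticks < busy_until */

	assert(a != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (a[i].start < busy_until)
			continue;	/* duplicate detected, rejected cheaply */
		full++;			/* exclusion was free: full read happens */
		busy_until = a[i].start + a[i].hold;
	}
	return full;
}
```

With four attempts starting at ticks 0, 1, 2, 3 and a hold of 10, only one does the full read. With starts at 0, 10, 20, 30 and the same hold, all four do, because each one arrives just after the previous holder released. Raising the hold to 25 (modelling an exclusion kept until the module is inited) brings that down to two, which narrows the window without closing it.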