Message-ID: <2a33d478-b7a8-5b3c-7bc5-f33eb27b44fa@rasmusvillemoes.dk>
Date: Thu, 11 Mar 2021 01:17:56 +0100
From: Rasmus Villemoes <linux@...musvillemoes.dk>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Luis Chamberlain <mcgrof@...nel.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Jessica Yu <jeyu@...nel.org>, Borislav Petkov <bp@...en8.de>,
Jonathan Corbet <corbet@....net>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Nick Desaulniers <ndesaulniers@...gle.com>
Subject: Re: [PATCH v2 1/2] init/initramfs.c: allow asynchronous unpacking
On 09/03/2021 23.16, Linus Torvalds wrote:
> On Tue, Mar 9, 2021 at 1:17 PM Rasmus Villemoes
> <linux@...musvillemoes.dk> wrote:
>>
>> So add an initramfs_async= kernel parameter, allowing the main init
>> process to proceed to handling device_initcall()s without waiting for
>> populate_rootfs() to finish.
>
> Oh, and a completely unrelated second comment about this: some of the
> initramfs population code seems to be actively written to be slow.
>
> For example, I'm not sure why that write_buffer() function uses an
> array of indirect function pointer actions. Even ignoring the whole
> "speculation protections make that really slow" issue that came later,
> it seems to always have been actively (a) slower and (b) more complex.
>
> The normal way to do that is with a simple switch() statement, which
> makes the compiler able to do a much better job. Not just for the
> state selector - maybe it picks that function pointer approach, but
> probably these days just direct comparisons - but simply to do things
> like inline all those "it's used in one place" cases entirely. In
> fact, at that point, a lot of the state machine changes may end up
> turning into pure goto's - compilers are sometimes good at that
> (because state machines have often been very timing-critical).
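To make the comparison concrete, here's a minimal stand-alone sketch of
the two dispatch styles (the states and helpers are made up for
illustration, not the actual initramfs ones):

#include <stdio.h>

/* Toy state machine; names are hypothetical, not from initramfs.c. */
enum state { S_READ, S_WRITE, S_DONE };

static enum state st = S_READ;

static int do_read(void)  { st = S_WRITE; return 0; }
static int do_write(void) { st = S_DONE;  return 1; }

/*
 * Table dispatch: every step goes through an indirect call, which the
 * compiler cannot inline (and which retpolines make even slower).
 */
static int (*const actions[])(void) = {
        [S_READ]  = do_read,
        [S_WRITE] = do_write,
};

static void run_table(void)
{
        while (!actions[st]())
                ;
}

/*
 * Switch dispatch: all callees are visible at the call site, so the
 * compiler is free to inline them and turn the state transitions into
 * plain branches.
 */
static void run_switch(void)
{
        int ret;

        do {
                switch (st) {
                case S_READ:  ret = do_read();  break;
                case S_WRITE: ret = do_write(); break;
                default:      ret = 1;          break;
                }
        } while (!ret);
}

int main(void)
{
        run_table();
        st = S_READ;
        run_switch();
        printf("final state: %d\n", st);
        return 0;
}

Building both with -O2 and comparing the disassembly shows the effect:
run_switch() typically collapses to a handful of direct branches with
do_read()/do_write() inlined, while run_table() keeps one indirect call
per iteration.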
FWIW, I tried doing
--- a/init/initramfs.c
+++ b/init/initramfs.c
@@ -401,24 +401,26 @@ static int __init do_symlink(void)
        return 0;
 }
 
-static __initdata int (*actions[])(void) = {
-       [Start]         = do_start,
-       [Collect]       = do_collect,
-       [GotHeader]     = do_header,
-       [SkipIt]        = do_skip,
-       [GotName]       = do_name,
-       [CopyFile]      = do_copy,
-       [GotSymlink]    = do_symlink,
-       [Reset]         = do_reset,
-};
-
 static long __init write_buffer(char *buf, unsigned long len)
 {
+       int ret;
+
        byte_count = len;
        victim = buf;
 
-       while (!actions[state]())
-               ;
+       do {
+               switch (state) {
+               case Start:      ret = do_start(); break;
+               case Collect:    ret = do_collect(); break;
+               case GotHeader:  ret = do_header(); break;
+               case SkipIt:     ret = do_skip(); break;
+               case GotName:    ret = do_name(); break;
+               case CopyFile:   ret = do_copy(); break;
+               case GotSymlink: ret = do_symlink(); break;
+               case Reset:      ret = do_reset(); break;
+               }
+       } while (!ret);
+
        return len - byte_count;
 }
and yes, all the do_* functions get inlined into write_buffer with some
net space saving:
add/remove: 0/9 grow/shrink: 1/0 up/down: 1696/-2112 (-416)
Function                                     old     new   delta
write_buffer                                 100    1796   +1696
actions                                       32       -     -32
do_start                                      52       -     -52
do_reset                                     112       -    -112
do_skip                                      128       -    -128
do_collect                                   144       -    -144
do_symlink                                   172       -    -172
do_copy                                      304       -    -304
do_header                                    572       -    -572
do_name                                      596       -    -596
Total: Before=5360, After=4944, chg -7.76%
(ppc32 build). But the unpacking still takes the same time. It might be
different on x86.
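For reference (quoting from memory of the current init/initramfs.c, so
check the actual source), state is an enum whose eight enumerators are
exactly the cases handled above, which is why the switch needs no
default case:

static __initdata enum state {
        Start,
        Collect,
        GotHeader,
        SkipIt,
        GotName,
        CopyFile,
        GotSymlink,
        Reset
} state, next_state;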
Rasmus