Message-ID: <1b73bc5a-1948-4e67-9ec5-b238723b3a48@gmx.de>
Date: Fri, 22 Dec 2023 13:13:26 +0100
From: Helge Deller <deller@....de>
To: Luis Chamberlain <mcgrof@...nel.org>, deller@...nel.org
Cc: linux-kernel@...r.kernel.org, Masahiro Yamada <masahiroy@...nel.org>,
Arnd Bergmann <arnd@...db.de>, linux-modules@...r.kernel.org,
linux-arch@...r.kernel.org
Subject: Re: [PATCH 2/4] modules: Ensure 64-bit alignment on __ksymtab_* sections
Hi Luis,
On 12/22/23 06:59, Luis Chamberlain wrote:
> On Wed, Nov 22, 2023 at 11:18:12PM +0100, deller@...nel.org wrote:
>> From: Helge Deller <deller@....de>
>>
>> On 64-bit architectures without CONFIG_HAVE_ARCH_PREL32_RELOCATIONS
>> (e.g. ppc64, ppc64le, parisc, s390x,...) the __KSYM_REF() macro stores
>> 64-bit pointers into the __ksymtab* sections.
>> Make sure that those sections will be correctly aligned at module link time,
>> otherwise unaligned memory accesses may happen at runtime.
>
> The ramifications are not explained there.
What do you mean exactly by this?
> You keep sending me patches with this and we keep doing a nose dive
> on this. It means I have to do more work.
Sorry about that. Since you are listed as the maintainer for modules, I
thought you were the right contact.
Furthermore, this patch is pretty small and trivial.
And I had the impression that people understand that having unaligned
structures is generally a bad idea, as it usually impacts performance
(although the performance penalty in this case isn't critical at all,
since we are not on hot paths).
> So as I had suggested with your patch which I merged in
> commit 87c482bdfa79 ("modules: Ensure natural alignment for
> .altinstructions and __bug_table sections") please clarify the
> impact of not merging this patch. Also last time you noticed the
> misalignment due to a faulty exception handler, please mention how
> you found this out now.
Sure.
> And since this is not your first patch on the exact related subject
> I'd appreciate if you can show me perf stat results differences between
> having and not having this patch merged. Why? Because we talk about
> a performance penalty, but we are not saying how much, and since this
> is an ongoing thing, we might as well have a tool belt with ways to
> measure such performance impact to bring clarity and value to this
> and future related patches.
>
>> The __kcrctab* sections store 32-bit entities, so use ALIGN(4) for those.
>
> I've given some thought about how to test this. Sadly, perf kallsyms
> just opens the /proc/kallsyms file, but that's fine, we need our own
> test.
>
> I think a selftest with three new simple modules would do it, running perf
> stat on it. One module, let us call it module A, which constructs its own
> name space prefix for its exported symbols and has tons of silly symbols
> for arbitrary data, whatever. We then have module B which refers to a
> few arbitrary symbols from module A, hopefully spread out linearly, so
> if module A had 10,000 symbols, we'd have module A refer to a symbol
> every 1,000 symbols. Finally we want a module C which has, say, 50,000
> symbols all of which will not be used at all by the first two modules,
> but the selftest will load module C first, prior to calling modprobe B.
>
> We'll stress test this way two calls which use find_symbol():
>
> 1) Upon load of B it will trigger simplify_symbols() to look for the
> symbol it uses from the module A with tons of symbols. That's an
> indirect way for us to call resolve_symbol_wait() from module A without
> having to use symbol_get(), which we want to remove as an export, as it is
> just a hack which should go away. Our goal is for us to test
> resolve_symbol() which will call find_symbol() and that will eventually
> look for the symbol on module A with:
>
> find_exported_symbol_in_section()
>
> That uses bsearch() so a binary search for the symbol and we'd end up
> hitting the misalignments here. Binary search will at worst be O(log(n))
> and so the only way to aggravate the search will be to add tons of
> symbols to A, and have B use a few of them.
>
> 2) When you load B, userspace will at first load A as depmod will inform
> userspace A goes before B. Upon B's load towards the end right before
> we call module B's init routine we get complete_formation() called on
> the module. That will first check for duplicate symbols with the call
> to verify_exported_symbols(). That is when we'll force iteration on
> module C's insane symbol list.
>
> The selftests just runs
>
> perf stat -e pick-your-poison-for-misalignments tools/testing/selftests/kmod/ksymtab.sh
>
> Where ksymtab.sh is your new script which calls:
>
> modprobe C
> modprobe B
>
> I say pick-your-poison-for-misalignments because I am not sure what is
> best here.
>
> Thoughts?
I really appreciate your thoughts here!
But given the triviality of this patch, I personally think you are
proposing a huge, mostly academic investment of time and resources with no
real benefit.
On which architecture would you suggest to test this?
What would be the effective (really new) outcome?
If the performance penalty is zero or close to zero, will that imply
that such a patch isn't useful?
Please also keep in mind that not everyone gets paid to work on the kernel,
so I have neither the time nor the various architectures to test on.
Then, as Masahiro already mentioned, the real solution is probably
to add ".balign" directives to the various inline assembly parts.
With this the alignment gets recorded in the sh_addralign field
in the object files, and the linker will correctly lay out the executable
even without this patch here.
And this is exactly what you would get if C initializers had been
used instead of inline assembly.
But independent of the correct way to fix it, I think the linker
script should ideally mention all sections it expects, with the right
alignments, just for completeness.
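For illustration, the shape of the change in the module linker script is
roughly the following (a sketch of the idea, not the literal patch hunk):

```
SECTIONS {
	__ksymtab     0 : ALIGN(8) { *(SORT(___ksymtab+*)) }
	__ksymtab_gpl 0 : ALIGN(8) { *(SORT(___ksymtab_gpl+*)) }
	__kcrctab     0 : ALIGN(4) { *(SORT(___kcrctab+*)) }
	__kcrctab_gpl 0 : ALIGN(4) { *(SORT(___kcrctab_gpl+*)) }
}
```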
I'll leave it up to you if you will apply this patch.
Currently it's not absolutely needed any longer, but let's think about
the pros/cons of this patch:
Pro:
- It can prevent unnecessary unaligned memory accesses on all arches.
- It provides the linker with correct alignment for the various sections.
- It mimics what you would get if the structs were coded with C-initializers.
- A correct section alignment can cover upcoming inline assembly patches
  which may have omitted the .balign directive.
Cons:
- For this specific patch, the worst-case scenario is that we add a total of
  up to 7 bytes to the kernel image (__ksymtab gets aligned to 8 bytes, and the
  following sections are then already aligned).
So, honestly, I don't see a real reason why it shouldn't be applied...
>> Signed-off-by: Helge Deller <deller@....de>
>> Cc: <stable@...r.kernel.org> # v6.0+
>
> That's a stretch without any data, don't you think?
Yes. No need to push such a patch to stable series.
Helge