[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CANDhNCraQ6UCDNH3s4+YKCWfk4dGjxP_LkZ7WBUnJ_WiKM5u6Q@mail.gmail.com>
Date: Thu, 18 Jul 2024 10:09:47 -0700
From: John Stultz <jstultz@...gle.com>
To: Masahiro Yamada <masahiroy@...nel.org>
Cc: patches@...linux.org.uk, linux-kernel@...r.kernel.org,
Russell King <linux@...linux.org.uk>, Ard Biesheuvel <ardb@...nel.org>,
linux-arm-kernel@...ts.infradead.org, stable@...r.kernel.org,
Neill Kapron <nkapron@...gle.com>
Subject: Re: [PATCH] ARM: fix get_user() broken with veneer
On Tue, Sep 26, 2023 at 9:09 AM Masahiro Yamada <masahiroy@...nel.org> wrote:
>
> The 32-bit ARM kernel stops working if the kernel grows to the point
> where veneers for __get_user_* are created.
>
> AAPCS32 [1] states, "Register r12 (IP) may be used by a linker as a
> scratch register between a routine and any subroutine it calls. It
> can also be used within a routine to hold intermediate values between
> subroutine calls."
>
> However, bl instructions buried within the inline asm are unpredictable
> for compilers; hence, "ip" must be added to the clobber list.
>
> This becomes critical when veneers for __get_user_* are created because
> veneers use the ip register since commit 02e541db0540 ("ARM: 8323/1:
> force linker to use PIC veneers").
>
> [1]: https://github.com/ARM-software/abi-aa/blob/2023Q1/aapcs32/aapcs32.rst
>
> Signed-off-by: Masahiro Yamada <masahiroy@...nel.org>
> Reviewed-by: Ard Biesheuvel <ardb@...nel.org>
+ stable@...r.kernel.org
It seems like this (commit 24d3ba0a7b44c1617c27f5045eecc4f34752ab03
upstream) would be a good candidate for -stable?
The issue it fixes can manifest in lots of very strange ways, so it
would be good to avoid others getting tripped up by it on -stable
branches.
(Apologies for being a bit verbose in the following, I've included a
lot of details and breadcrumbs so others might find this if they run
into the same issues.)
I was recently looking into an arm32 issue, and found getting a custom
built kernel consistently working in qemu-system-arm to bisect issues
in the range of 5.15-6.6 was a bit difficult, as I would hit a couple
different odd errors.
For 5.15 I was seeing systemd fail to start in a fairly opaque way:
starting systemd-udevd.service - Rule-based Manager for Device
Events and Files.
systemd-udevd.service: Main process exited, code=exited, status=1/FAILURE
systemd-udevd.service: Failed with result 'exit-code'.
Failed to start systemd-udevd.service - Rule-based Manager for
Device Events and Files.
But further looking through the logs I found:
systemd[1]: Failed to open netlink: Operation not permitted
Despite lots of digging to try to understand what was going wrong, the
one thing that worked was switching to CONFIG_CC_OPTIMIZE_FOR_SIZE
(which I only tried as I came across this old thread:
https://lists.yoctoproject.org/g/linux-yocto/message/8035 ), this
seemed very suspicious, but I didn't have a lot of time to dig
further.
That resolved things until ~6.1, where I started seeing crashes at init:
[ 16.982562] Run /init as init process
[ 16.989311] Failed to execute /init (error -22)
[ 16.990017] Run /sbin/init as init process
[ 16.994737] Starting init: /sbin/init exists but couldn't execute
it (error -22)
That I bisected that failure down to being supposedly caused by commit
5750121ae738 ("kbuild: list sub-directories in ./Kbuild")
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=5750121ae7382ebac8d47ce6d68012d6cd1d7926
And searching around that commit luckily led me to this change, which
finally seems to resolve the different issues I saw for 6.6, 6.1 and
5.15!
Now, In my rush to get something booting with qemu, I started with the
debian config but disabled modules, and didn't put much time into
getting rid of config options or drivers I wouldn't need. So the
kernel is pretty large. So maybe not super common, but I definitely
wouldn't want others to have to go down this debugging rabbit hole.
thanks
-john
Powered by blists - more mailing lists