Message-ID: <87d078tjl0.fsf_-_@kamboji.qca.qualcomm.com>
Date: Wed, 13 May 2020 09:50:03 +0300
From: Kalle Valo <kvalo@...eaurora.org>
To: Arnd Bergmann <arnd@...db.de>
Cc: linux-wireless@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: gcc-10: kernel stack is corrupted and fails to boot
(trimming CC, changing title)
Kalle Valo <kvalo@...eaurora.org> writes:
> Kalle Valo <kvalo@...eaurora.org> writes:
>
>> Arnd Bergmann <arnd@...db.de> writes:
>>
>>> gcc-10 correctly points out a bug with a zero-length array in
>>> struct ath10k_pci:
>>>
>>> drivers/net/wireless/ath/ath10k/ahb.c: In function 'ath10k_ahb_remove':
>>> drivers/net/wireless/ath/ath10k/ahb.c:30:9: error: array subscript 0
>>> is outside the bounds of an interior zero-length array 'struct
>>> ath10k_ahb[0]' [-Werror=zero-length-bounds]
>>> 30 | return &((struct ath10k_pci *)ar->drv_priv)->ahb[0];
>>> | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>> In file included from drivers/net/wireless/ath/ath10k/ahb.c:13:
>>> drivers/net/wireless/ath/ath10k/pci.h:185:20: note: while referencing 'ahb'
>>> 185 | struct ath10k_ahb ahb[0];
>>> | ^~~
>>>
>>> The last addition to the struct ignored the comments and added
>>> new members behind the array that must remain last.
>>>
>>> Change it to a flexible-array member and move it last again to
>>> make it work correctly, prevent the same thing from happening
>>> again (all compilers warn about flexible-array members in the
>>> middle of a struct) and get it to build without warnings.
>>
>> Very good find, thanks! This bug would cause all sorts of strange memory
>> corruption issues.
>
> This motivated me to switch to using GCC 10.x and I noticed that you had
> already upgraded crosstool so it was a trivial thing to do, awesome :)
>
> https://mirrors.edge.kernel.org/pub/tools/crosstool/
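As background on the fix quoted above, here is a minimal sketch of why a
member added after a zero-length array is dangerous, and what the
flexible-array layout looks like. All struct and function names below are
hypothetical stand-ins, not the real ath10k definitions, and the overlap
shown assumes a typical ABI where int is 4 bytes with 4-byte alignment.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the per-bus private data. */
struct sketch_ahb {
	int irq;
};

/* Buggy pattern: zero-length array (GNU extension) that is no longer last. */
struct sketch_pci_bad {
	int id;
	struct sketch_ahb ahb[0];
	int added_later;	/* ends up at the same offset as ahb[0] */
};

/* Fixed pattern: flexible-array member, kept as the last member. */
struct sketch_pci_good {
	int id;
	int added_later;
	struct sketch_ahb ahb[];
};

/* Returns 1 if a write to ahb[0] in the buggy layout would hit added_later. */
int sketch_bad_layout_overlaps(void)
{
	return offsetof(struct sketch_pci_bad, ahb) ==
	       offsetof(struct sketch_pci_bad, added_later);
}

/* Allocates the fixed layout with room for n trailing elements; caller frees. */
struct sketch_pci_good *sketch_alloc_good(size_t n)
{
	return calloc(1, sizeof(struct sketch_pci_good) +
			 n * sizeof(struct sketch_ahb));
}
```

With the fixed layout, sketch_alloc_good(1) gives storage where ahb[0] and
added_later no longer alias, and placing another member after ahb[] is
rejected at compile time.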
And now I have a problem :) I first noticed that my x86 testbox does not
boot when I compile the kernel with GCC 10.1.0 from crosstool. I didn't
get any error messages, so I just downgraded the compiler and the kernel
booted fine again. Next I decided to try GCC 10.1.0 on my x86 laptop and
it also failed to boot, but this time I got kernel logs and saw this:
Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: start_secondary+0x178/0x180
Call Trace:
dump_stack
panic
? _raw_spin_unlock_irqrestore
? start_secondary
__stack_chk_fail
start_secondary
secondary_startup
(I wrote the above messages manually from a picture so expect typos)
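For anyone unfamiliar with the check that fired here, a minimal user-space
sketch of the idea behind a stack canary follows. It is only an
illustration: the guard value, the frame layout, and the function name are
hypothetical, whereas the real stack protector is emitted by the compiler,
uses a randomized canary, and calls __stack_chk_fail on a mismatch.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical fixed guard value; a real canary is randomized. */
#define SKETCH_CANARY 0x5aa55aa5c33cc33cULL

/* Hypothetical frame: a local buffer followed by the guard word. */
struct sketch_frame {
	char buf[16];
	uint64_t canary;
};

/*
 * Copies len bytes from src into the start of the frame and reports
 * whether the guard survived: 1 if intact, 0 if overwritten.
 */
int sketch_copy_and_check(const char *src, size_t len)
{
	struct sketch_frame f;

	f.canary = SKETCH_CANARY;	/* "prologue": place the guard */

	if (len > sizeof(f))		/* keep the sketch itself memory-safe */
		len = sizeof(f);
	memcpy(&f, src, len);		/* len > 16 spills into the guard */

	return f.canary == SKETCH_CANARY;	/* "epilogue": verify the guard */
}
```

Copying 16 bytes leaves the guard intact, while copying 24 bytes of 'A'
overwrites it and the check fails. In the panic above, the equivalent
check failed on return from start_secondary.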
Then, also on my x86 laptop, I downgraded the compiler to GCC 8.1.0 (from
crosstool), rebuilt exactly the same kernel version, and the kernel
booted without issues.
I'm using 5.7.0-rc4-wt-ath+, which is basically v5.7-rc4 plus the latest
wireless patches, and I doubt the wireless patches make any difference
this early in the boot. All compilers I use are prebuilt binaries from
the kernel.org crosstool repo[1], with the addition of ccache v3.4.1 to
speed up my builds.
Any ideas? How should I debug this further?
[1] https://mirrors.edge.kernel.org/pub/tools/crosstool/
--
https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches