Message-ID: <87d078tjl0.fsf_-_@kamboji.qca.qualcomm.com>
Date:   Wed, 13 May 2020 09:50:03 +0300
From:   Kalle Valo <kvalo@...eaurora.org>
To:     Arnd Bergmann <arnd@...db.de>
Cc:     linux-wireless@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: gcc-10: kernel stack is corrupted and fails to boot

(trimming CC, changing title)

Kalle Valo <kvalo@...eaurora.org> writes:

> Kalle Valo <kvalo@...eaurora.org> writes:
>
>> Arnd Bergmann <arnd@...db.de> writes:
>>
>>> gcc-10 correctly points out a bug with a zero-length array in
>>> struct ath10k_pci:
>>>
>>> drivers/net/wireless/ath/ath10k/ahb.c: In function 'ath10k_ahb_remove':
>>> drivers/net/wireless/ath/ath10k/ahb.c:30:9: error: array subscript 0
>>> is outside the bounds of an interior zero-length array 'struct
>>> ath10k_ahb[0]' [-Werror=zero-length-bounds]
>>>    30 |  return &((struct ath10k_pci *)ar->drv_priv)->ahb[0];
>>>       |         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>> In file included from drivers/net/wireless/ath/ath10k/ahb.c:13:
>>> drivers/net/wireless/ath/ath10k/pci.h:185:20: note: while referencing 'ahb'
>>>   185 |  struct ath10k_ahb ahb[0];
>>>       |                    ^~~
>>>
>>> The last addition to the struct ignored the comments and added
>>> new members behind the array that must remain last.
>>>
>>> Change it to a flexible-array member and move it last again to
>>> make it work correctly, prevent the same thing from happening
>>> again (all compilers warn about flexible-array members in the
>>> middle of a struct) and get it to build without warnings.
>>
>> Very good find, thanks! This bug would cause all sorts of strange memory
>> corruption issues.
>
> This motivated me to switch to using GCC 10.x and I noticed that you had
> already upgraded crosstool so it was a trivial thing to do, awesome :)
>
> https://mirrors.edge.kernel.org/pub/tools/crosstool/
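For reference, the layout problem described in the quoted patch can be
sketched in plain C. This is only an illustration, not the ath10k code:
struct bus_priv, struct priv_buggy, struct priv_fixed and the helper
functions are hypothetical stand-ins, and the zero-length array relies on
the GNU C extension that gcc accepts.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a bus-specific private struct. */
struct bus_priv {
	int irq;
};

/* Buggy layout: a zero-length array (GNU extension) that was meant to
 * stay last, with a member added behind it afterwards. */
struct priv_buggy {
	int id;
	struct bus_priv ahb[0];
	int added_later;
};

/* Fixed layout: a C99 flexible-array member, which must be last. */
struct priv_fixed {
	int id;
	int added_later;
	struct bus_priv ahb[];
};

/* Returns 1 if ahb[0] would start at the same offset as added_later,
 * i.e. writing to ahb[0] overwrites the member behind the array. */
static int buggy_layout_overlaps(void)
{
	return offsetof(struct priv_buggy, ahb) ==
	       offsetof(struct priv_buggy, added_later);
}

/* Returns 1 if the array storage starts after every other member. */
static int fixed_layout_is_disjoint(void)
{
	return offsetof(struct priv_fixed, ahb) >=
	       offsetof(struct priv_fixed, added_later) + sizeof(int);
}

/* Allocate the fixed struct with room for one trailing array element. */
static struct priv_fixed *priv_alloc(void)
{
	return calloc(1, sizeof(struct priv_fixed) + sizeof(struct bus_priv));
}
```

With the flexible-array member, putting anything behind ahb[] is a compile
error, which is why the patch says the mistake cannot be repeated silently.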

And now I have a problem :) I first noticed that my x86 testbox is not
booting when I compile the kernel with GCC 10.1.0 from crosstool. I
didn't get any error messages so I just downgraded the compiler and the
kernel was booting fine again. Next I decided to try GCC 10.1 with my
x86 laptop and it also failed to boot, but this time I got kernel logs
and saw this:

Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: start_secondary+0x178/0x180

Call Trace:
dump_stack
panic
? _raw_spin_unlock_irqrestore
? start_secondary
__stack_chk_fail
start_secondary
secondary_startup

(I wrote the above messages manually from a picture so expect typos)
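
As background on the check that produces this panic: with stack protector
enabled, the compiler stores a canary value in the frame on function entry
and compares it against the reference value before returning, calling
__stack_chk_fail on a mismatch. The userspace sketch below only models that
comparison. reference_canary and frame_check_passes are hypothetical names,
and the two ways of producing a mismatch are generic possibilities, not a
diagnosis of this boot failure.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the reference canary the compiler reads. */
static uintptr_t reference_canary = 0x5ca1ab1eu;

/* Models one protected function call. Returns 1 if the epilogue check
 * would pass and 0 if it would end up in __stack_chk_fail. */
static int frame_check_passes(int clobber_slot, int change_reference)
{
	/* "Prologue": copy the reference canary into the frame. */
	volatile uintptr_t slot = reference_canary;

	/* Case 1: something overwrites the slot in the frame. */
	if (clobber_slot)
		slot ^= 1;

	/* Case 2: the reference value changes while the frame is live. */
	if (change_reference)
		reference_canary += 1;

	/* "Epilogue": compare the saved copy with the reference. */
	return slot == reference_canary;
}
```

Either case makes the comparison fail, so the panic message alone does not
tell which one happened.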

Then, also on my x86 laptop, I downgraded the compiler to GCC 8.1.0 (from
crosstool), rebuilt exactly the same kernel version and the kernel
booted without issues.

I'm using 5.7.0-rc4-wt-ath+ which is basically v5.7-rc4 plus latest
wireless patches, and I doubt the wireless patches are making any
difference this early in the boot. All compilers I use are prebuilt
binaries from the kernel.org crosstool repo[1], with the addition of ccache
v3.4.1 to speed up my builds.

Any ideas? How should I debug this further?

[1] https://mirrors.edge.kernel.org/pub/tools/crosstool/

-- 
https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches
