[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <81830a33-5539-378e-af03-f681dbc4656c@leemhuis.info>
Date: Sat, 16 Apr 2022 06:41:40 +0200
From: Thorsten Leemhuis <regressions@...mhuis.info>
To: "regressions@...ts.linux.dev" <regressions@...ts.linux.dev>
Cc: Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: Bug 215720 - brk() regression on AArch64 on static-pie binary --
issue with ASLR and a guard page? #forregzbot
TWIMC: this mail is primarily send for documentation purposes and for
regzbot, my Linux kernel regression tracking bot. These mails usually
contain '#forregzbot' in the subject, to make them easy to spot and filter.
#regzbot fixed-by: aeb7923733d100
On 09.04.22 13:49, Thorsten Leemhuis wrote:
> Hi, this is your Linux kernel regression tracker. Top-posting for once,
> to make this easily accessible to everyone.
>
> Hey, what's up here? Or was this regressions fixed already?
>
> H.J. Lu: reminder, this is caused by a patch of yours.
>
> Mike, if you have a minute: '925346c129da' ("fs/binfmt_elf: fix PT_LOAD
> p_align values for loaders") in 'next' contains a 'Fixes:' tag for the
> culprit of this regression, but I assume it fixes a different issue?
>
> Ciao, Thorsten
>
> #regzbot poke
>
> On 28.03.22 15:21, Thorsten Leemhuis wrote:
>> Hi, this is your Linux kernel regression tracker.
>>
>> I noticed a regression report in bugzilla.kernel.org that afaics nobody
>> acted upon since it was reported about a week ago, that's why I decided
>> to forward it to the lists and the author of the culprit. To quote from
>> https://bugzilla.kernel.org/show_bug.cgi?id=215720:
>>
>>> Victor Stinner 2022-03-22 02:24:57 UTC
>>>
>>> Created attachment 300597 [details]
>>> empty.c reproducer
>>>
>>> I found a brk() syscall regression of Linux kernel 5.17 on AArch64.
>>>
>>> A git bisect found the change "fs/binfmt_elf: use PT_LOAD p_align values for static PIE": commit 9630f0d60fec5fbcaa4435a66f75df1dc9704b66, changed related to the bz#215275.
>>>
>>> Program to reproduce the bug, empty.c (attached to the issue):
>>> ---
>>> _Thread_local int var1 = 0;
>>> int main() {
>>> volatile int x = 1;
>>> var1 = x;
>>> return 0;
>>> }
>>> ---
>>>
>>> Build the program as a static PIE program:
>>>
>>> gcc -std=c11 -static-pie -g empty.c -o empty -O2
>>>
>>> The program fails randomly, it takes 100 to 6000 runs to reproduce the crash.
>>>
>>> Short shell loop to reproduce the crash:
>>> ---
>>> $ i=0; while true; do ./empty; rc=$?; i=$(($i + 1)); echo "$i:
>>> $(date): $rc"; if [ $rc -ne 0 ]; then break; fi; done
>>> (...)
>>> 159: Tue Mar 22 01:54:22 CET 2022: 0
>>> 160: Tue Mar 22 01:54:22 CET 2022: 0
>>> Segmentation fault (core dumped)
>>> 161: Tue Mar 22 01:54:22 CET 2022: 139
>>> ---
>>>
>>> Disabling ASLR (write 0 to /proc/sys/kernel/randomize_va_space) works
>>> around the bug.
>>>
>>> Rather than using "empty.c" program, the "ldconfig -V > /dev/null" command can be used: standard static-pie program.
>>>
>>> strace when the program works:
>>> ---
>>> brk(NULL) = 0xaaaac3961000
>>> brk(0xaaaac3961b78) = 0xaaaac3961b78
>>> ---
>>>
>>> strace when the bug occurs:
>>> ---
>>> brk(NULL) = 0xaaaabf3c3000
>>> brk(0xaaaabf3c3b78) = 0xaaaabf3c3000
>>> ---
>>>
>>> The following test of the brk() syscall fails when the bug occurs:
>>> ---
>>> /* Check against existing mmap mappings. */
>>> next = find_vma(mm, oldbrk);
>>> if (next && newbrk + PAGE_SIZE > vm_start_gap(next))
>>> goto out;
>>> ---
>>>
>>> Note: When the bug occurs, the program crash with SIGSEGV: the glibc __libc_setup_tls() function calls sbrk(2936) to allocate TLS variables, but it doesn't handle the memory allocation failure.
>>>
>>> Note: At the beginning, I discovered this kernel regression while checking for Python
>>> buildbot failures on our Fedora Rawhide AArch64 machine.
>>>
>>> * Fedora downstream issue: https://bugzilla.redhat.com/show_bug.cgi?id=2066147
>>> * Python issue: https://bugs.python.org/issue47078
>>>
>>> [reply] [−] Comment 1 Victor Stinner 2022-03-22 02:41:00 UTC
>>>
>>> See also the binutils issue: "p_align in ELF program headers should not exceed section alignment"
>>> https://sourceware.org/bugzilla/show_bug.cgi?id=28689
>>>
>>> See also this old (kernel 4.18) fixed x86-64 kernel bug: "kernel: brk can grow the heap into the area reserved for the stack"
>>> https://bugzilla.redhat.com/show_bug.cgi?id=1749633
>>
>>
>> Could somebody take a look into this? Or was this discussed somewhere
>> else already? Or even fixed?
>>
>> Anyway, to get this tracked:
>>
>> #regzbot introduced: 9630f0d60fec5fbcaa4435a66f75df1dc9704b66
>> #regzbot from: Victor Stinner <vstinner@...hat.com>
>> #regzbot title: brk() regression on AArch64 on static-pie binary
>> #regzbot link: https://bugzilla.kernel.org/show_bug.cgi?id=215720
>>
>> Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
>>
>> P.S.: As the Linux kernel's regression tracker I'm getting a lot of
>> reports on my table. I can only look briefly into most of them and lack
>> knowledge about most of the areas they concern. I thus unfortunately
>> will sometimes get things wrong or miss something important. I hope
>> that's not the case here; if you think it is, don't hesitate to tell me
>> in a public reply, it's in everyone's interest to set the public record
>> straight.
>>
Powered by blists - more mailing lists