[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <1ceb1769-f01b-c417-e912-4ee1149dd0cb@huawei.com>
Date: Wed, 12 Apr 2023 15:01:53 +0800
From: Chen Zhongjin <chenzhongjin@...wei.com>
To: Dave Hansen <dave.hansen@...el.com>, <x86@...nel.org>,
<linux-kernel@...r.kernel.org>
CC: <tglx@...utronix.de>, <mingo@...hat.com>, <bp@...en8.de>,
<dave.hansen@...ux.intel.com>, <hpa@...or.com>,
<akpm@...ux-foudation.org>, <ben-linux@...ff.org>,
<wuchi.zero@...il.com>
Subject: Re: [PATCH 1/2] x86: profiling: remove lock functions hack for
!FRAME_POINTER
On 2023/4/11 3:34, Dave Hansen wrote:
> On 4/9/23 19:22, Chen Zhongjin wrote:
>> Syzbot has been reporting the problem of stack-out-of-bounds in
>> profile_pc for a long time:
>> https://syzkaller.appspot.com/bug?extid=84fe685c02cd112a2ac3
>>
>> profile_pc tries to get pc if current regs is inside lock function. For
>> !CONFIG_FRAME_POINTER it used a hack way to get the pc from stack, which
>> is not work with ORC. It makes profile_pc read illeagal address, return
>> wrong result, and frequently triggers KASAN.
>>
>> Since lock profiling can be handled with much better other tools, It's
>> reasonable to remove lock functions hack for !FRAME_POINTER kernel.
> OK, so let me make sure I understand what's going on:
>
> 1. This whole issue is limited to kernel/profile.c which is what drives
> readprofile(8) and /proc/profile
> 2. This is removing code that got added in 2006:
> 0cb91a229364 ("[PATCH] i386: Account spinlocks to the caller during
> profiling for !FP kernels")
> 3. This was an OK hack back in the day, but it outright breaks today
> in some situations. KASAN also didn't exist in 2006.
Yes, and whether KASAN is enabled it can make problem.
Some lock_function will save registers on stack (this may not happen in
2006).
These registers can be recorded by profile and be read outside of kernel,
which is risky theoretically.
> 4. !CONFIG_FRAME_POINTER is probably even more rare today than it was in
> 2006
No. !CONFIG_FRAME_POINTER is more common today because of UNWINDER_ORC.
And that is why the bug is triggered more frequently.
> 5. Lock function caller information is available at _least_ from perf,
> maybe other places too?? (What "much better other tools" are there?)
Yes, it's basically about perf function graph.
> Given all that, this patch suggests that we can remove the stack peeking
> hack. The downside is that /proc/profile users will see their profiles
> pointing to the spinlock functions like they did in 2005. The upside is
> that we won't get any more KASAN reports.
>
> If anyone complains, I assume we're just going to tell them to run 'perf
> --call-graph' and to go away (which also probably didn't exist in 2006).
>
> If I got all that right, the end result seems sane to me. It would be
> _nice_ if you could make a more coherent changelog out of that and
> resend. Also, considering that your two "profile" issues are quite
> independent, you can probably just resend the two patches separately.
Thanks for review and I'll send another version to provide better details.
Powered by blists - more mailing lists