Message-ID: <4b52e6cd-1315-4b0b-8b6e-95a3d4ed96cc@linux.alibaba.com>
Date: Tue, 18 Feb 2025 21:53:17 +0800
From: Shuai Xue <xueshuai@...ux.alibaba.com>
To: Borislav Petkov <bp@...en8.de>
Cc: tony.luck@...el.com, nao.horiguchi@...il.com, tglx@...utronix.de,
mingo@...hat.com, dave.hansen@...ux.intel.com, x86@...nel.org,
hpa@...or.com, linmiaohe@...wei.com, akpm@...ux-foundation.org,
peterz@...radead.org, jpoimboe@...nel.org, linux-edac@...r.kernel.org,
linux-kernel@...r.kernel.org, linux-mm@...ck.org,
baolin.wang@...ux.alibaba.com, tianruidong@...ux.alibaba.com
Subject: Re: [PATCH v2 0/5] mm/hwpoison: Fix regressions in memory failure handling
On 2025/2/18 21:17, Borislav Petkov wrote:
> On Tue, Feb 18, 2025 at 09:08:25PM +0800, Shuai Xue wrote:
>> Yes, the poison is found on user pages.
>>
>> From the commit log, the mechanism was added by Tony and suggested by you.
>> https://lkml.kernel.org/r/20210818002942.1607544-3-tony.luck@intel.com
>
> I'm not talking about how it is detected - I'm asking about *what* you're
> doing exactly. I want to figure out what and why you're doing what you're
> doing.
>
>> It's the same as with real issue. There's no magic to it.
>
> Magic or not, doesn't matter. The only question is whether this can happen in
> real life and it is not just you using some tools and "fixing" things that
> ain't broke.
The regression was reported by an end user, and we have also observed it in production.
[5056863.064239] task: ffff8837d2a2a0c0 task.stack: ffffc90065814000
[5056863.137299] RIP: 0010:[<ffffffff813ad231>] [<ffffffff813ad231>] __get_user_8+0x21/0x2b
...
[5056864.512018] Call Trace:
[5056864.543440] [<ffffffff8111c203>] ? exit_robust_list+0x33/0x110
[5056864.616456] [<ffffffff81088399>] mm_release+0x109/0x140
[5056864.682178] [<ffffffff8108faf9>] do_exit+0x159/0xb60
[5056864.744785] [<ffffffff81090583>] do_group_exit+0x43/0xb0
[5056864.811551] [<ffffffff8109bdc9>] get_signal+0x289/0x630
[5056864.877277] [<ffffffff8102d227>] do_signal+0x37/0x690
[5056864.940925] [<ffffffff8111c4e5>] ? do_futex+0x205/0x520
[5056865.006651] [<ffffffff8111c885>] ? SyS_futex+0x85/0x170
[5056865.072378] [<ffffffff81003726>] exit_to_usermode_loop+0x76/0xc0
[5056865.147464] [<ffffffff81003d01>] do_syscall_64+0x171/0x180
[5056865.216311] [<ffffffff81741c8e>] entry_SYSCALL_64_after_swapgs+0x58/0xc6
>
>>> What do futexes have to do with copying user memory?
>>
>> Return -EFAULT to userspace.
>
> This doesn't even begin to answer my question so I'll ask again:
>
> "What do futexes have to do with copying user memory?"
>
Sorry, I did not get your point.
Thanks.
Shuai