linux-kernel - Re: [PATCH 1/3] mm/memory-failure: try to send SIGBUS even if unmap failed


Message-ID: <2f965887-19b5-47bf-98ca-d40b3ec05e75@oracle.com>
Date: Tue, 7 May 2024 10:54:10 -0700
From: Jane Chu <jane.chu@...cle.com>
To: Oscar Salvador <osalvador@...e.de>
Cc: linmiaohe@...wei.com, nao.horiguchi@...il.com, akpm@...ux-foundation.org,
        linux-mm@...ck.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 1/3] mm/memory-failure: try to send SIGBUS even if unmap
 failed

On 5/7/2024 2:02 AM, Oscar Salvador wrote:

> On Wed, May 01, 2024 at 05:24:56PM -0600, Jane Chu wrote:
>> For years when it comes down to kill a process due to hwpoison,
>> a SIGBUS is delivered only if unmap has been successful.
>> Otherwise, a SIGKILL is delivered. And the reason for that is
>> to prevent the involved process from accessing the hwpoisoned
>> page again.
>>
>> Since then a lot has changed, a hwpoisoned page is marked and
>> upon being re-accessed, the process will be killed immediately.
>> So let's take out the '!unmap_success' factor and try to deliver
>> SIGBUS if possible.
> I am missing some details here.
> An unmapped hwpoison page will trigger a fault and will return
> VM_FAULT_HWPOISON all the way down and then deliver SIGBUS,
> but if the page was not unmapped, how will this be catch upon
> re-accessing? Will the system deliver a MCE event?
>
I actually managed to hit the re-access case with an older version of
Linux: an MCE occurred, but unmap failed and no SIGBUS was delivered, so
the test process re-accessed the same address over and over (hence MCE
after MCE), as the CPU was unable to make forward progress. In reality,
this issue is fixed with kill_accessing_processes(). The comment for
this patch refers to a comment made about '!unmap_success' a long time
ago.
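
For illustration only (this is not part of the patch): a minimal
userspace sketch of how a process might tell a hwpoison SIGBUS apart
from other SIGBUS causes, assuming a Linux/glibc environment.
sigbus_reason(), on_sigbus() and install_sigbus_handler() are
hypothetical names made up for this example. BUS_MCEERR_AR and
BUS_MCEERR_AO are the existing si_code values for memory errors.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: describe why a SIGBUS was delivered. */
static const char *sigbus_reason(int si_code)
{
    switch (si_code) {
    case BUS_MCEERR_AR:
        return "action required";   /* poison consumed on access */
    case BUS_MCEERR_AO:
        return "action optional";   /* poison found, not yet consumed */
    default:
        return "other";
    }
}

/* Hypothetical handler: never return into an access to a poisoned page. */
static void on_sigbus(int sig, siginfo_t *info, void *ctx)
{
    (void)sig;
    (void)ctx;

    if (info->si_code == BUS_MCEERR_AR) {
        /* Returning would retry the faulting access, so exit instead. */
        static const char msg[] = "SIGBUS: hwpoison hit on access\n";
        ssize_t n = write(STDERR_FILENO, msg, sizeof(msg) - 1);
        (void)n;
        _exit(1);
    }
    if (info->si_code == BUS_MCEERR_AO)
        return; /* advisory; info->si_addr points into the bad page */

    /* Not a memory error: restore the default action and return. */
    signal(SIGBUS, SIG_DFL);
}

/* Install the handler with SA_SIGINFO so si_code and si_addr are set. */
static int install_sigbus_handler(void)
{
    struct sigaction sa = {0};

    sa.sa_sigaction = on_sigbus;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGBUS, &sa, NULL);
}
```

A test program would call install_sigbus_handler() once at startup. A
real application might try to recover in the BUS_MCEERR_AR case (for
example by discarding the affected data) rather than exiting.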

thanks,

-jane