linux-kernel - [PATCH v2 0/2] ACPI: APEI: handle synchronous exceptions with proper si

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-Id: <20230227050315.5670-1-xueshuai@linux.alibaba.com>
Date:   Mon, 27 Feb 2023 13:03:13 +0800
From:   Shuai Xue <xueshuai@...ux.alibaba.com>
To:     rafael@...nel.org, lenb@...nel.org, james.morse@....com,
        tony.luck@...el.com, bp@...en8.de, dave.hansen@...ux.intel.com,
        jarkko@...nel.org, naoya.horiguchi@....com, linmiaohe@...wei.com,
        akpm@...ux-foundation.org
Cc:     xiexiuqi@...wei.com, lvying6@...wei.com,
        linux-acpi@...r.kernel.org, linux-kernel@...r.kernel.org,
        cuibixuan@...ux.alibaba.com, baolin.wang@...ux.alibaba.com,
        zhuo.song@...ux.alibaba.com, xueshuai@...ux.alibaba.com
Subject: [PATCH v2 0/2] ACPI: APEI: handle synchronous exceptions with proper si_code

changes since v1:
- synchronous events by notify type
- Link: https://lore.kernel.org/lkml/20221206153354.92394-3-xueshuai@linux.alibaba.com/

Currently, both synchronous and asynchronous error are queued and handled
by a dedicated kthread in workqueue. And Memory failure for synchronous
error is synced by a cancel_work_sync trick which ensures that the
corrupted page is unmapped and poisoned. And after returning to user-space,
the task starts at current instruction which triggering a page fault in
which kernel will send SIGBUS to current process due to VM_FAULT_HWPOISON.

However, the memory failure recovery for hwpoison-aware mechanisms does not
work as expected. For example, hwpoison-aware user-space processes like
QEMU register their customized SIGBUS handler and enable early kill mode by
seting PF_MCE_EARLY at initialization. Then the kernel will directy notify
the process by sending a SIGBUS signal in memory failure with wrong
si_code: BUS_MCEERR_AO si_code to the actual user-space process instead of
BUS_MCEERR_AR.

To address this problem:

- PATCH 1 sets mf_flags as MF_ACTION_REQUIRED on synchronous events which
  indicates error happened in current execution context
- PATCH 2 separates synchronous error handling into task work so that the
  current context in memory failure is exactly belongs to the task
  consuming poison data.

Then, kernel will send SIGBUS with proper si_code in kill_proc().

Lv Ying and XiuQi also proposed to address similar problem and we discussed
about new solution to add a new flag(acpi_hest_generic_data::flags bit 8) to
distinguish synchronous event. [2][3] The UEFI community still has no response.
After a deep dive into the SDEI TRM, the SDEI notification should be used for
asynchronous error. As SDEI TRM[1] describes "the dispatcher can simulate an
exception-like entry into the client, **with the client providing an additional
asynchronous entry point similar to an interrupt entry point**". The client
(kernel) lacks complete synchronous context, e.g. systeam register (ELR, ESR,
etc). So notify type is enough to distinguish synchronous event.

[1] https://developer.arm.com/documentation/den0054/latest/
[2] https://lore.kernel.org/linux-arm-kernel/20221205160043.57465-4-xiexiuqi@huawei.com/T/
[3] https://lore.kernel.org/lkml/20221209095407.383211-1-lvying6@huawei.com/

Shuai Xue (2):
  ACPI: APEI: set memory failure flags as MF_ACTION_REQUIRED on
    synchronous events
  ACPI: APEI: handle synchronous exceptions in task work

 drivers/acpi/apei/ghes.c | 134 ++++++++++++++++++++++++---------------
 include/acpi/ghes.h      |   3 -
 mm/memory-failure.c      |  13 ----
 3 files changed, 82 insertions(+), 68 deletions(-)

-- 
2.20.1.12.g72788fdb