Message-ID: <134e43f7-583c-48c1-8ccc-dddc18700d3b@linux.alibaba.com>
Date: Fri, 24 Oct 2025 18:03:22 +0800
From: Shuai Xue <xueshuai@...ux.alibaba.com>
To: Ira Weiny <ira.weiny@...el.com>, "Luck, Tony" <tony.luck@...el.com>,
 "ankita@...dia.com" <ankita@...dia.com>,
 "aniketa@...dia.com" <aniketa@...dia.com>, "Sethi, Vikram"
 <vsethi@...dia.com>, "jgg@...dia.com" <jgg@...dia.com>,
 "mochs@...dia.com" <mochs@...dia.com>,
 "skolothumtho@...dia.com" <skolothumtho@...dia.com>,
 "linmiaohe@...wei.com" <linmiaohe@...wei.com>,
 "nao.horiguchi@...il.com" <nao.horiguchi@...il.com>,
 "akpm@...ux-foundation.org" <akpm@...ux-foundation.org>,
 "david@...hat.com" <david@...hat.com>,
 "lorenzo.stoakes@...cle.com" <lorenzo.stoakes@...cle.com>,
 "Liam.Howlett@...cle.com" <Liam.Howlett@...cle.com>,
 "vbabka@...e.cz" <vbabka@...e.cz>, "rppt@...nel.org" <rppt@...nel.org>,
 "surenb@...gle.com" <surenb@...gle.com>, "mhocko@...e.com"
 <mhocko@...e.com>, "bp@...en8.de" <bp@...en8.de>,
 "rafael@...nel.org" <rafael@...nel.org>,
 "guohanjun@...wei.com" <guohanjun@...wei.com>,
 "mchehab@...nel.org" <mchehab@...nel.org>, "lenb@...nel.org"
 <lenb@...nel.org>, "Tian, Kevin" <kevin.tian@...el.com>,
 "alex@...zbot.org" <alex@...zbot.org>
Cc: "cjia@...dia.com" <cjia@...dia.com>,
 "kwankhede@...dia.com" <kwankhede@...dia.com>,
 "targupta@...dia.com" <targupta@...dia.com>,
 "zhiw@...dia.com" <zhiw@...dia.com>, "dnigam@...dia.com"
 <dnigam@...dia.com>, "kjaju@...dia.com" <kjaju@...dia.com>,
 "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
 "linux-mm@...ck.org" <linux-mm@...ck.org>,
 "linux-edac@...r.kernel.org" <linux-edac@...r.kernel.org>,
 "Jonathan.Cameron@...wei.com" <Jonathan.Cameron@...wei.com>,
 "Smita.KoralahalliChannabasappa@....com"
 <Smita.KoralahalliChannabasappa@....com>,
 "u.kleine-koenig@...libre.com" <u.kleine-koenig@...libre.com>,
 "peterz@...radead.org" <peterz@...radead.org>,
 "linux-acpi@...r.kernel.org" <linux-acpi@...r.kernel.org>,
 "kvm@...r.kernel.org" <kvm@...r.kernel.org>
Subject: Re: [PATCH v3 2/3] mm: Change ghes code to allow poison of non-struct
 pfn



On 2025/10/22 23:03, Ira Weiny wrote:
> Shuai Xue wrote:
>>
>>
>> On 2025/10/22 01:19, Luck, Tony wrote:
>>>>>       pfn = PHYS_PFN(physical_addr);
>>>>> -   if (!pfn_valid(pfn) && !arch_is_platform_page(physical_addr)) {
>>>>
>>>> Tony,
>>>>
>>>> I'm not an SGX expert but does this break SGX by removing
>>>> arch_is_platform_page()?
>>>>
>>>> See:
>>>>
>>>> 40e0e7843e23 ("x86/sgx: Add infrastructure to identify SGX EPC pages")
>>>> Cc: Tony Luck <tony.luck@...el.com>
>>>>
>>> Ira,
>>>
>>> I think this deletion makes the GHES code always call memory_failure()
>>> instead of bailing out here on "bad" page frame numbers.
>>>
>>> That centralizes the checks for different types of memory into
>>> memory_failure().
>>>
>>> -Tony
>>
>> Hi, Tony, Ankit and Ira,
>>
>> Finally, we're seeing other use cases that need to handle errors for
>> non-struct page PFNs :)
>>
>> IMHO, non-struct page PFNs are common in production environments.
>> Besides NVIDIA Grace GPU device memory, we also use reserved DRAM memory
>> managed by a separate VMEM allocator.
> 
> Can you elaborate on this more?

We reserve a significant portion of DRAM at boot time using kernel
command-line parameters. This reserved memory is then managed by
our internal VMEM allocator, which handles memory allocation and
deallocation for virtual machines.

To minimize memory overhead, we intentionally avoid creating struct
pages for this reserved memory region. Instead, we've implemented the
following approach:

- Our VMEM allocator directly manages the physical memory without the
   overhead of struct page metadata.
- Error handling: We register custom RAS operations (ras_ops) with the
   memory failure infrastructure. When poisoned memory is accessed within
   this region, our registered handler:
     - tags the affected memory area as poisoned,
     - isolates the memory to prevent further access, and
     - terminates any tasks that were using the poisoned memory.

This approach allows us to handle memory errors effectively while
maintaining minimal memory overhead for large reserved regions. It's
similar in concept to how device memory (like NVIDIA Grace GPU memory
mentioned earlier) needs error handling without struct page backing.
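
As a rough illustration of the three handler steps above (tag, isolate,
terminate), here is a minimal user-space sketch. It is not our VMEM code
and does not use the kernel's ras_ops interface; every name in it
(vmem_region, vmem_alloc_page, vmem_handle_poison, VMEM_MAX_PAGES) is
hypothetical, and it assumes a simple per-PFN table in place of struct
pages. Instead of killing a task, the handler reports the owning task id
so the caller can decide how to terminate it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define VMEM_MAX_PAGES 1024        /* hypothetical cap for this sketch */
#define VMEM_NO_OWNER  (-1)        /* page is not assigned to any task */
#define VMEM_NO_PFN    UINT64_MAX  /* allocation failure marker */

/* Per-region bookkeeping kept by the allocator instead of struct pages. */
struct vmem_region {
    uint64_t base_pfn;                 /* first PFN of the reserved range */
    size_t   nr_pages;                 /* number of pages in the range */
    bool     poisoned[VMEM_MAX_PAGES]; /* true once an error was reported */
    int      owner[VMEM_MAX_PAGES];    /* task id using the page, or none */
};

void vmem_region_init(struct vmem_region *r, uint64_t base_pfn,
                      size_t nr_pages)
{
    assert(nr_pages <= VMEM_MAX_PAGES);
    r->base_pfn = base_pfn;
    r->nr_pages = nr_pages;
    for (size_t i = 0; i < nr_pages; i++) {
        r->poisoned[i] = false;
        r->owner[i] = VMEM_NO_OWNER;
    }
}

/* Hand out the first free page that has not been poisoned. */
uint64_t vmem_alloc_page(struct vmem_region *r, int task_id)
{
    for (size_t i = 0; i < r->nr_pages; i++) {
        if (!r->poisoned[i] && r->owner[i] == VMEM_NO_OWNER) {
            r->owner[i] = task_id;
            return r->base_pfn + i;
        }
    }
    return VMEM_NO_PFN;
}

/*
 * Handle a reported memory error on pfn.
 * Returns -1 if the PFN is outside this region, 0 otherwise.
 * *victim is set to the task that must be terminated, or VMEM_NO_OWNER.
 */
int vmem_handle_poison(struct vmem_region *r, uint64_t pfn, int *victim)
{
    *victim = VMEM_NO_OWNER;
    if (pfn < r->base_pfn || pfn - r->base_pfn >= r->nr_pages)
        return -1;                     /* not ours; let others handle it */

    size_t idx = (size_t)(pfn - r->base_pfn);
    if (r->poisoned[idx])
        return 0;                      /* already handled earlier */

    r->poisoned[idx] = true;           /* 1. tag the page as poisoned */
    *victim = r->owner[idx];           /* 3. report the task to terminate */
    r->owner[idx] = VMEM_NO_OWNER;     /* 2. isolate: the allocator skips
                                        *    poisoned pages from now on */
    return 0;
}
```

In this sketch the isolation step relies on vmem_alloc_page() skipping
any page whose poisoned flag is set, so a poisoned PFN is never handed
out again.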

Thanks.
Shuai
