Message-ID: <AD348832-5A6A-48F1-9735-924F144330F7@nvidia.com>
Date: Wed, 19 Feb 2025 11:10:04 -0500
From: Zi Yan <ziy@...dia.com>
To: Baolin Wang <baolin.wang@...ux.alibaba.com>
Cc: Matthew Wilcox <willy@...radead.org>, linux-mm@...ck.org,
 linux-fsdevel@...r.kernel.org, Andrew Morton <akpm@...ux-foundation.org>,
 Hugh Dickins <hughd@...gle.com>, Kairui Song <kasong@...cent.com>,
 Miaohe Lin <linmiaohe@...wei.com>, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 2/2] mm/shmem: use xas_try_split() in
 shmem_split_large_entry()

On 19 Feb 2025, at 5:04, Baolin Wang wrote:

> Hi Zi,
>
> Sorry for the late reply due to being busy with other things:)

Thank you for taking a look at the patches. :)

>
> On 2025/2/19 07:54, Zi Yan wrote:
>> During shmem_split_large_entry(), a large swap entry covers n slots
>> and an order-0 folio needs to be inserted.
>>
>> Instead of splitting all n slots, only the 1 slot covered by the folio
>> needs to be split; the remaining n-1 shadow entries can be retained with
>> orders ranging from 0 to n-1.  This method only requires
>> (n/XA_CHUNK_SHIFT) new xa_nodes instead of (n % XA_CHUNK_SHIFT) *
>> (n/XA_CHUNK_SHIFT) new xa_nodes, compared to the original
>> xas_split_alloc() + xas_split() one.
>>
>> For example, to split an order-9 large swap entry (assuming XA_CHUNK_SHIFT
>> is 6), 1 xa_node is needed instead of 8.
>>
>> xas_try_split_min_order() is used to reduce the number of calls to
>> xas_try_split() during split.
>
> For shmem swapin, if we cannot swap in the whole large folio by skipping the swap cache, we will split the large swap entry stored in the shmem mapping into order-0 swap entries, rather than into swap entries of other orders. This is because the next time we swap in a shmem folio through shmem_swapin_cluster(), it will still be an order-0 folio.

Right. But swapin happens one folio at a time, right? shmem_split_large_entry()
should split the large swap entry and give you a slot to store the order-0 folio.
For example, with an order-9 large swap entry, to swap in the first order-0 folio,
the large swap entry will become order-0, order-0, order-1, order-2, …, order-8
after the split. Then the first order-0 swap entry can be used.
When the second order-0 folio is swapped in, the second order-0 entry can be used.
When the last order-0 folio is swapped in, the order-8 entry would be split into
order-7, order-6, …, order-1, order-0, order-0, and the last order-0 entry will be used.
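To make that concrete, here is a small standalone C sketch (userspace only, not
kernel code; it only models the resulting entry orders, not the xarray
internals, and show_split() is a made-up helper for illustration):

#include <stdio.h>

/*
 * Userspace illustration only: print the buddy-style decomposition left
 * behind when one order-0 slot is carved out of an order-"order" entry.
 */
static void show_split(unsigned int order, unsigned long target)
{
	unsigned long start = 0;

	printf("order-%u entry, swapping in slot %lu:\n", order, target);
	while (order > 0) {
		unsigned long half = 1UL << (order - 1);

		if (target < start + half) {
			/* upper half stays whole; keep splitting the lower half */
			printf("  [%lu, %lu) stays order-%u\n",
			       start + half, start + 2 * half, order - 1);
		} else {
			/* lower half stays whole; keep splitting the upper half */
			printf("  [%lu, %lu) stays order-%u\n",
			       start, start + half, order - 1);
			start += half;
		}
		order--;
	}
	printf("  slot %lu becomes the order-0 entry that is used\n", target);
}

int main(void)
{
	show_split(9, 0);	/* first swapin: same set as order-0, order-0, order-1, ..., order-8 */
	show_split(8, 255);	/* later split of a remaining order-8 entry */
	return 0;
}

Built with a plain cc, the first call prints the order-8 down to order-0 ranges
that stay intact plus the slot being used, i.e. exactly the set of orders listed
above.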

Maybe the swapin path assumes that after shmem_split_large_entry() all swap
entries are order-0, which can lead to issues. There should be some check like:
if the swap entry order is greater than the folio order, shmem_split_large_entry()
should be called.
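Roughly something like the untested fragment below (written against my reading
of the current shmem_swapin_folio() flow; "order" here is assumed to hold the
swap entry order read via xa_get_order(), and error handling is elided):

	/*
	 * Untested sketch only: re-split if the entry left in the mapping
	 * is still larger than the folio about to be added.
	 */
	if (order > folio_order(folio)) {
		error = shmem_split_large_entry(inode, index, swap, gfp);
		if (error)
			goto failed;
	}
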
>
> Moreover, I did a quick test swapping in order-6 shmem folios; however, my test hung, and the console was continuously filled with the following messages. It seems there are some issues with the shmem swapin handling. Anyway, I need more time to debug and test.

To swap in order-6 folios, shmem_split_large_entry() does not allocate
any new xa_node, since XA_CHUNK_SHIFT is 6, so it is odd to see the OOM
errors below. Let me know if there is anything I can help with.
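For reference, my back-of-the-envelope model of the node cost is below (a
standalone userspace sketch, not xarray code; it just counts how many
XA_CHUNK_SHIFT boundaries the split has to cross, and new_nodes() is a made-up
helper for illustration):

#include <stdio.h>

#define XA_CHUNK_SHIFT	6	/* assuming the usual value, as in the example above */

/*
 * Userspace model only: splitting a single entry from old_order down to
 * new_order needs one new xa_node each time the split drops below a
 * multiple of XA_CHUNK_SHIFT (i.e. crosses a node boundary).
 */
static unsigned int new_nodes(unsigned int old_order, unsigned int new_order)
{
	unsigned int nodes = 0;
	unsigned int o;

	for (o = new_order + 1; o <= old_order; o++)
		if (o % XA_CHUNK_SHIFT == 0)
			nodes++;
	return nodes;
}

int main(void)
{
	/* order-9 entry split for an order-0 folio: 1 new node */
	printf("order-9 -> order-0: %u new xa_node(s)\n", new_nodes(9, 0));
	/* order-9 entry split for an order-6 folio: no new node */
	printf("order-9 -> order-6: %u new xa_node(s)\n", new_nodes(9, 6));
	return 0;
}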

>
> [ 1037.364644] Huh VM_FAULT_OOM leaked out to the #PF handler. Retrying PF
> [ 1037.364650] Huh VM_FAULT_OOM leaked out to the #PF handler. Retrying PF
> [ 1037.364652] Huh VM_FAULT_OOM leaked out to the #PF handler. Retrying PF
> [ 1037.364654] Huh VM_FAULT_OOM leaked out to the #PF handler. Retrying PF
> [ 1037.364656] Huh VM_FAULT_OOM leaked out to the #PF handler. Retrying PF
> [ 1037.364658] Huh VM_FAULT_OOM leaked out to the #PF handler. Retrying PF
> [ 1037.364659] Huh VM_FAULT_OOM leaked out to the #PF handler. Retrying PF
> [ 1037.364661] Huh VM_FAULT_OOM leaked out to the #PF handler. Retrying PF
> [ 1037.364663] Huh VM_FAULT_OOM leaked out to the #PF handler. Retrying PF
> [ 1037.364665] Huh VM_FAULT_OOM leaked out to the #PF handler. Retrying PF
> [ 1042.368539] pagefault_out_of_memory: 9268696 callbacks suppressed
> .......


Best Regards,
Yan, Zi
