Message-ID: <e453273b-42ff-92df-c659-15ebe4c6ef33@huaweicloud.com>
Date: Thu, 19 Jun 2025 09:30:22 +0800
From: Kemeng Shi <shikemeng@...weicloud.com>
To: Kairui Song <ryncsn@...il.com>
Cc: linux-mm@...ck.org, Andrew Morton <akpm@...ux-foundation.org>,
 Hugh Dickins <hughd@...gle.com>, Baolin Wang
 <baolin.wang@...ux.alibaba.com>, Matthew Wilcox <willy@...radead.org>,
 Chris Li <chrisl@...nel.org>, Nhat Pham <nphamcs@...il.com>,
 Baoquan He <bhe@...hat.com>, Barry Song <baohua@...nel.org>,
 linux-kernel@...r.kernel.org
Subject: Re: [PATCH 2/4] mm/shmem, swap: avoid redundant Xarray lookup during
 swapin



on 6/18/2025 11:07 AM, Kairui Song wrote:
> On Wed, Jun 18, 2025 at 10:49 AM Kemeng Shi <shikemeng@...weicloud.com> wrote:
>> on 6/18/2025 2:35 AM, Kairui Song wrote:
>>> From: Kairui Song <kasong@...cent.com>
>>>
>>> Currently shmem calls xa_get_order to get the swap radix entry order,
>>> requiring a full tree walk. This can be easily combined with the swap
>>> entry value checking (shmem_confirm_swap) to avoid the duplicated
>>> lookup, which should improve the performance.
>>>
>>> Signed-off-by: Kairui Song <kasong@...cent.com>
>>> ---
>>>  mm/shmem.c | 33 ++++++++++++++++++++++++---------
>>>  1 file changed, 24 insertions(+), 9 deletions(-)
>>>
>>> diff --git a/mm/shmem.c b/mm/shmem.c
>>> index 4e7ef343a29b..0ad49e57f736 100644
>>> --- a/mm/shmem.c
>>> +++ b/mm/shmem.c
>>> @@ -505,15 +505,27 @@ static int shmem_replace_entry(struct address_space *mapping,
>>>
>>>  /*
>>>   * Sometimes, before we decide whether to proceed or to fail, we must check
>>> - * that an entry was not already brought back from swap by a racing thread.
>>> + * that an entry was not already brought back or split by a racing thread.
>>>   *
>>>   * Checking folio is not enough: by the time a swapcache folio is locked, it
>>>   * might be reused, and again be swapcache, using the same swap as before.
>>> + * Returns the swap entry's order if it is still present, else returns -1.
>>>   */
>>> -static bool shmem_confirm_swap(struct address_space *mapping,
>>> -                            pgoff_t index, swp_entry_t swap)
>>> +static int shmem_swap_check_entry(struct address_space *mapping, pgoff_t index,
>>> +                               swp_entry_t swap)
>>>  {
>>> -     return xa_load(&mapping->i_pages, index) == swp_to_radix_entry(swap);
>>> +     XA_STATE(xas, &mapping->i_pages, index);
>>> +     int ret = -1;
>>> +     void *entry;
>>> +
>>> +     rcu_read_lock();
>>> +     do {
>>> +             entry = xas_load(&xas);
>>> +             if (entry == swp_to_radix_entry(swap))
>>> +                     ret = xas_get_order(&xas);
>>> +     } while (xas_retry(&xas, entry));
>>> +     rcu_read_unlock();
>>> +     return ret;
>>>  }
>>>
>>>  /*
>>> @@ -2256,16 +2268,20 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
>>>               return -EIO;
>>>
>>>       si = get_swap_device(swap);
>>> -     if (!si) {
>>> -             if (!shmem_confirm_swap(mapping, index, swap))
>>> +     order = shmem_swap_check_entry(mapping, index, swap);
>>> +     if (unlikely(!si)) {
>>> +             if (order < 0)
>>>                       return -EEXIST;
>>>               else
>>>                       return -EINVAL;
>>>       }
>>> +     if (unlikely(order < 0)) {
>>> +             put_swap_device(si);
>>> +             return -EEXIST;
>>> +     }
>> Can we re-arrange the code block as follows:
>>         order = shmem_swap_check_entry(mapping, index, swap);
>>         if (unlikely(order < 0))
>>                 return -EEXIST;
>>
>>         si = get_swap_device(swap);
>>         if (!si) {
>>                 return -EINVAL;
>> ...
> 
> Hi, thanks for the suggestion.
> 
> This may lead to a slightly higher chance of getting -EINVAL when it
> should return -EEXIST, leading to user space errors.
> 
> For example, if this CPU gets interrupted after `order =
> shmem_swap_check_entry(mapping, index, swap);`, and another CPU
> swapoff-ed the device, we then get `si = NULL` here. But the entry is
> swapped in already, so it should return -EEXIST, not -EINVAL.
> 
> The chance is really low, so it's fairly minor. We could do a `goto
> failed` on (!si) here, but that would make the logic under `failed:`
> more complex. So I'd prefer not to change the original behaviour,
> which looks more correct.
> 
Right, thanks for the explanation.

Reviewed-by: Kemeng Shi <shikemeng@...weicloud.com>
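
For reference, the ordering argument above can be sketched as a small
userspace toy model. This is not kernel code: `toy_state`, `toy_swapoff`
and the other `toy_*` names are hypothetical stand-ins for the mapping
entry, swapoff, shmem_swap_check_entry() and get_swap_device(), and the
"race" is simply a callback run between the two steps. It only
illustrates why checking the entry after looking up the device keeps
returning -EEXIST in the swapoff case, while the reversed order can
return -EINVAL.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: these fields stand in for kernel state. */
struct toy_state {
	bool entry_present;	/* swap entry is still in the mapping */
	bool device_alive;	/* swap device has not been swapped off */
	int order;		/* order of the swap entry */
};

/* Models swapoff: the entry is swapped in, then the device goes away. */
static void toy_swapoff(struct toy_state *s)
{
	s->entry_present = false;
	s->device_alive = false;
}

/* Stand-in for shmem_swap_check_entry(): order if present, else -1. */
static int toy_check_entry(const struct toy_state *s)
{
	return s->entry_present ? s->order : -1;
}

/* Stand-in for get_swap_device(): true if the device can be used. */
static bool toy_get_device(const struct toy_state *s)
{
	return s->device_alive;
}

typedef void (*toy_race_fn)(struct toy_state *);

/*
 * Ordering used by the patch: look up the device, then check the entry.
 * The toy does not model the device reference; it only shows that the
 * later entry check still observes an entry that was swapped in.
 */
static int toy_swapin_patch_order(struct toy_state *s, toy_race_fn race)
{
	bool si = toy_get_device(s);
	int order;

	if (race)
		race(s);	/* another CPU runs between the two steps */
	order = toy_check_entry(s);
	if (!si)
		return order < 0 ? -EEXIST : -EINVAL;
	if (order < 0)
		return -EEXIST;	/* real code also drops the device ref */
	return order;
}

/* Suggested ordering: check the entry first, then look up the device. */
static int toy_swapin_suggested_order(struct toy_state *s, toy_race_fn race)
{
	int order = toy_check_entry(s);

	if (order < 0)
		return -EEXIST;
	if (race)
		race(s);	/* entry check result is now stale */
	if (!toy_get_device(s))
		return -EINVAL;	/* entry was swapped in, yet -EINVAL */
	return order;
}
```

With toy_swapoff as the race callback, the patch ordering yields
-EEXIST and the suggested ordering yields -EINVAL. Without a race, both
return the entry's order.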

