Message-ID: <bf232555-3653-40c7-bbdc-a8fe58a93a9e@gmail.com>
Date: Thu, 5 Sep 2024 00:23:47 +0100
From: Usama Arif <usamaarif642@...il.com>
To: Barry Song <21cnbao@...il.com>
Cc: Yosry Ahmed <yosryahmed@...gle.com>,
 Andrew Morton <akpm@...ux-foundation.org>, Kairui Song <ryncsn@...il.com>,
 hanchuanhua@...o.com, linux-mm@...ck.org, baolin.wang@...ux.alibaba.com,
 chrisl@...nel.org, david@...hat.com, hannes@...xchg.org, hughd@...gle.com,
 kaleshsingh@...gle.com, linux-kernel@...r.kernel.org, mhocko@...e.com,
 minchan@...nel.org, nphamcs@...il.com, ryan.roberts@....com,
 senozhatsky@...omium.org, shakeel.butt@...ux.dev, shy828301@...il.com,
 surenb@...gle.com, v-songbaohua@...o.com, willy@...radead.org,
 xiang@...nel.org, ying.huang@...el.com, hch@...radead.org
Subject: Re: [PATCH v7 2/2] mm: support large folios swap-in for sync io
 devices



On 05/09/2024 00:10, Barry Song wrote:
> On Thu, Sep 5, 2024 at 9:30 AM Usama Arif <usamaarif642@...il.com> wrote:
>>
>>
>>
>> On 03/09/2024 23:05, Yosry Ahmed wrote:
>>> On Tue, Sep 3, 2024 at 2:36 PM Barry Song <21cnbao@...il.com> wrote:
>>>>
>>>> On Wed, Sep 4, 2024 at 8:08 AM Andrew Morton <akpm@...ux-foundation.org> wrote:
>>>>>
>>>>> On Tue, 3 Sep 2024 11:38:37 -0700 Yosry Ahmed <yosryahmed@...gle.com> wrote:
>>>>>
>>>>>>> [   39.157954] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000007
>>>>>>> [   39.158288] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000001
>>>>>>> [   39.158634] R13: 0000000000002b9a R14: 0000000000000000 R15: 00007ffd619d5518
>>>>>>> [   39.158998]  </TASK>
>>>>>>> [   39.159226] ---[ end trace 0000000000000000 ]---
>>>>>>>
>>>>>>> After reverting this or Usama's "mm: store zero pages to be swapped
>>>>>>> out in a bitmap", the problem is gone. I think these two patches may
>>>>>>> have some conflict that needs to be resolved.
>>>>>>
>>>>>> Yup. I saw this conflict coming and specifically asked for this
>>>>>> warning to be added in Usama's patch to catch it [1]. It served its
>>>>>> purpose.
>>>>>>
>>>>>> Usama's patch does not handle large folio swapin, because at the time
>>>>>> it was written we didn't have it. We expected Usama's series to land
>>>>>> sooner than this one, so the warning was to make sure that this series
>>>>>> handles large folio swapin in the zeromap code. Now that they are both
>>>>>> in mm-unstable, we are gonna have to figure this out.
>>>>>>
>>>>>> I suspect Usama's patches are closer to land so it's better to handle
>>>>>> this in this series, but I will leave it up to Usama and
>>>>>> Chuanhua/Barry to figure this out :)
>>>>
>>>> I believe handling this in swap-in might violate layer separation.
>>>> `swap_read_folio()` should be a reliable API to call, regardless of
>>>> whether `zeromap` is present. Therefore, the fix should likely be
>>>> within `zeromap` but not this `swap-in`. I’ll take a look at this with
>>>> Usama :-)
>>>
>>> I meant handling it within this series to avoid blocking Usama
>>> patches, not within this code. Thanks for taking a look, I am sure you
>>> and Usama will figure out the best way forward :)
>>
>> Hi Barry and Yosry,
>>
>> Is the best (and quickest) way forward to have a v8 of this with
>> https://lore.kernel.org/all/20240904055522.2376-1-21cnbao@gmail.com/
>> as the first patch, and using swap_zeromap_entries_count in alloc_swap_folio
>> in this support large folios swap-in patch?
> 
> Yes, Usama. I can actually do a check:
> 
> zeromap_cnt = swap_zeromap_entries_count(entry, nr);
> 
> /* swap_read_folio() can't handle inconsistent zeromap in multiple entries */
> if (zeromap_cnt > 0 && zeromap_cnt < nr)
>        try next order;
> 
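
To make the check concrete, here is a minimal userspace sketch of the
"all or nothing, else try the next order" idea. It is not the kernel code:
the bitmap layout, the natural alignment of the range, and the names
zeromap_test(), zeromap_entries_count() and pick_swapin_order() are all
assumptions made for illustration, standing in for the real zeromap and
swap_zeromap_entries_count().

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical stand-in for the zeromap: one bit per swap slot. */
static bool zeromap_test(const unsigned long *map, size_t slot)
{
	return (map[slot / BITS_PER_WORD] >> (slot % BITS_PER_WORD)) & 1UL;
}

/* Count how many of the nr slots starting at start are marked zero-filled. */
static size_t zeromap_entries_count(const unsigned long *map, size_t start,
				    size_t nr)
{
	size_t i, cnt = 0;

	assert(nr > 0);
	for (i = 0; i < nr; i++)
		cnt += zeromap_test(map, start + i);
	return cnt;
}

/*
 * Pick the largest order <= max_order whose naturally aligned range around
 * slot is either entirely in the zeromap or entirely outside it. A mixed
 * range is skipped, i.e. "try next order". Order 0 always qualifies.
 */
static unsigned int pick_swapin_order(const unsigned long *map, size_t slot,
				      unsigned int max_order)
{
	unsigned int order;

	for (order = max_order; order > 0; order--) {
		size_t nr = (size_t)1 << order;
		size_t start = slot & ~(nr - 1);	/* assumed alignment */
		size_t cnt = zeromap_entries_count(map, start, nr);

		if (cnt == 0 || cnt == nr)
			return order;
	}
	return 0;
}
```

With slots 0-3 and 5 marked, a fault on slot 1 with max_order 3 would fall
back from 8 entries (mixed) to 4 entries (all zero), while a fault on slot 5
would fall all the way back to a single entry.
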
> On the other hand, if you read the zRAM code, you will find that zRAM has
> exactly the same mechanism as zeromap, and it can do even more by
> detecting same-filled pages. Since zRAM does the job in the swapfile layer,
> it has no consistency issue of the kind zeromap has.
> 
> So I feel that for the zRAM case we don't need zeromap at all, since the
> effort is duplicated, though I really appreciate your work, which can benefit
> all swapfiles. I mean, zRAM is able to check for "zero" (and also for non-zero
> but same-filled content). After zeromap does its check, zRAM will check again:
> 

Yes, so there is a reason for having the zeromap patches, which I have outlined
in the cover letter.

https://lore.kernel.org/all/20240627105730.3110705-1-usamaarif642@gmail.com/

There are use cases where zswap/zram might not be used in production.
We can reduce I/O and flash wear in those cases by a large amount.

Also, running in Meta production, we found that the number of non-zero
same-filled pages was less than 1%, so essentially it's only the zero-filled
pages that matter.

I believe that after zeromap, it might be a good idea to remove the
page_same_filled check from the zram code? It's not really a problem if it's
kept as well, as I don't believe any zero-filled pages should reach
zram_write_page?
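
For reference, the distinction between "same-filled" and "zero-filled" can be
sketched in userspace as below. This is only an illustration of the idea, not
the zram or zeromap implementation: the fixed 4096-byte page size and the
names sketch_page_same_filled() and sketch_page_zero_filled() are assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096	/* assumed page size for this sketch */

/*
 * Return true if every word of the page equals the first word, and report
 * that repeated word through *element.
 */
static bool sketch_page_same_filled(const void *mem, unsigned long *element)
{
	const unsigned long *page = mem;
	size_t i, nwords = SKETCH_PAGE_SIZE / sizeof(*page);
	unsigned long val;

	assert(mem && element);
	val = page[0];
	for (i = 1; i < nwords; i++)
		if (page[i] != val)
			return false;

	*element = val;
	return true;
}

/* Zero-filled is the special case where the repeated word is 0. */
static bool sketch_page_zero_filled(const void *mem)
{
	unsigned long element;

	return sketch_page_same_filled(mem, &element) && element == 0;
}
```

If zero-filled pages are already filtered out before the swap backend sees
them, only the element != 0 case would be left for a same-filled check in the
backend to catch.
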

> static int zram_write_page(struct zram *zram, struct page *page, u32 index)
> {
>        ...
> 
>         if (page_same_filled(mem, &element)) {
>                 kunmap_local(mem);
>                 /* Free memory associated with this sector now. */
>                 flags = ZRAM_SAME;
>                 atomic64_inc(&zram->stats.same_pages);
>                 goto out;
>         }
>         ...
> }
> 
> So it seems that zeromap might slightly impact my zRAM use case. I'm not
> blaming you, just pointing out that there might be some overlap in effort
> here :-)
> 
>>
>> Thanks,
>> Usama
> 
> Thanks
> Barry

