linux-kernel - Re: [PATCH v4] mm: shmem: implement POSIX_FADV

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Message-ID: <cc194110-2467-47d3-9868-89e092a542e9@quicinc.com>
Date:   Mon, 7 Mar 2022 20:20:27 +0530
From:   Charan Teja Kalla <quic_charante@...cinc.com>
To:     Suren Baghdasaryan <surenb@...gle.com>
CC:     Hugh Dickins <hughd@...gle.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Matthew Wilcox <willy@...radead.org>,
        <markhemm@...glemail.com>, Vlastimil Babka <vbabka@...e.cz>,
        David Rientjes <rientjes@...gle.com>,
        Michal Hocko <mhocko@...e.com>,
        Shakeel Butt <shakeelb@...gle.com>,
        linux-mm <linux-mm@...ck.org>,
        LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v4] mm: shmem: implement POSIX_FADV_[WILL|DONT]NEED for
 shmem

Thanks Suren for your comments!!

On 3/5/2022 2:00 AM, Suren Baghdasaryan wrote:
>> +               test_and_clear_page_young(page);
>> +               SetPageDirty(page);
> I asked Hugh about how clean shmem pages are handled during normal
> reclaim and his reply is:
> 
> Clean shmem pages are rare: they correspond to where a hole in a
> sparse file has been mapped read-only to userspace. Those get dropped
> from pagecache without being written to swap, when memory pressure
> comes to reclaim them. Otherwise, shmem pages are dirty: as soon as
> they've been read from swap and identified as shmem (moved from swap
> cache to page cache), that swap is freed so they're no longer clean
> representations of anything on swap. (Our use of "swap cache" and/or
> "page cache" may have inconsistencies in what I've written: sometimes
> we use them interchangeably, sometimes we're making a distinction.)
> 
> So, IIUC his explanation, you don't really need to mark clean shmem
> pages dirty for FADV_DONTNEED since normal reclaim does not do that
> either.

Thanks for the details here. Will change it in the next patch.


> 
>> +               list_add(&page->lru, list);
>> +               if (need_resched()) {
>> +                       xas_pause(&xas);
>> +                       cond_resched_rcu();
>> +               }
>> +       }
>> +       rcu_read_unlock();
>> +}
>> +
>> +static int shmem_fadvise_dontneed(struct address_space *mapping, loff_t start,
>> +                               loff_t end)
>> +{
>> +       LIST_HEAD(list);
>> +
>> +       if (!shmem_mapping(mapping))
>> +               return -EINVAL;
>> +
>> +       if (!total_swap_pages)
>> +               return 0;
>> +
>> +       lru_add_drain();
>> +       shmem_isolate_pages_range(mapping, start, end, &list);
>> +       reclaim_pages(&list);
>> +
>> +       return 0;
>> +}
>> +
>> +static int shmem_fadvise_willneed(struct address_space *mapping,
>> +                                pgoff_t start, pgoff_t long end)
>> +{
>> +       struct page *page;
>> +       pgoff_t index;
>> +
>> +       xa_for_each_range(&mapping->i_pages, index, page, start, end) {
>> +               if (!xa_is_value(page))
>> +                       continue;
>> +               page = shmem_read_mapping_page(mapping, index);
>> +               if (!IS_ERR(page))
>> +                       put_page(page);
>> +       }
>> +
>> +       return 0;
>> +}
>> +
>> +static int shmem_fadvise(struct file *file, loff_t offset, loff_t len, int advice)
>> +{
>> +       loff_t endbyte;
>> +       pgoff_t start_index;.
>> +       pgoff_t end_index;
>> +       struct address_space *mapping;
>> +       int ret = 0;
>> +
>> +       mapping = file->f_mapping;
>> +       if (!mapping || len < 0)
>> +               return -EINVAL;
>> +
>> +       endbyte = (u64)offset + (u64)len;
>> +       if (!len || endbyte < len)
>> +               endbyte = -1;
>> +       else
>> +               endbyte--;
> The above block is exactly the same as in
> https://elixir.bootlin.com/linux/latest/source/mm/fadvise.c#L73 with
> the exception that generic_fadvise has comments with explanations of
> this math. You might consider consolidating them into a helper
> function to calculate the endbyte.


Yes, I simply copy pasted these. Will try to consolidate them into one.

>