Message-ID: <87h6ivinpp.fsf@yhuang6-desk2.ccr.corp.intel.com>
Date: Tue, 30 Jan 2024 10:01:22 +0800
From: "Huang, Ying" <ying.huang@...el.com>
To: Kairui Song <ryncsn@...il.com>
Cc: linux-mm <linux-mm@...ck.org>, Andrew Morton <akpm@...ux-foundation.org>,
 Chris Li <chrisl@...nel.org>, Hugh Dickins <hughd@...gle.com>,
 Johannes Weiner <hannes@...xchg.org>, Matthew Wilcox <willy@...radead.org>,
 Michal Hocko <mhocko@...e.com>, Yosry Ahmed <yosryahmed@...gle.com>,
 David Hildenbrand <david@...hat.com>, LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v2 9/9] mm/swap, shmem: use new swapin helper to skip
readahead conditionally

Kairui Song <ryncsn@...il.com> writes:

> On Wed, Jan 10, 2024 at 11:35 AM Kairui Song <ryncsn@...il.com> wrote:
>>
>> Huang, Ying <ying.huang@...el.com> 于2024年1月9日周二 10:05写道:
>> >
>> > Kairui Song <ryncsn@...il.com> writes:
>> >
>> > > From: Kairui Song <kasong@...cent.com>
>> > >
>> > > Currently, shmem uses cluster readahead for all swap backends. Cluster
>> > > readahead is not a good solution for ramdisk based devices (ZRAM)
>> > > at all.
>> > >
>> > > After switching to the new helper, most benchmarks showed a good
>> > > result:
>> > >
>> > > - Single file sequential read:
>> > >   perf stat --repeat 20 dd if=/tmpfs/test of=/dev/null bs=1M count=8192
>> > >   (/tmpfs/test is a zero-filled file, using brd as swap, 4G memcg limit)
>> > >   Before: 22.248 +- 0.549 seconds
>> > >   After:  22.021 +- 0.684 seconds (-1.1%)
>> > >
>> > > - Random read stress test:
>> > >   fio -name=tmpfs --numjobs=16 --directory=/tmpfs \
>> > >     --size=256m --ioengine=mmap --rw=randread --random_distribution=random \
>> > >     --time_based --ramp_time=1m --runtime=5m --group_reporting
>> > >   (using brd as swap, 2G memcg limit)
>> > >
>> > >   Before: 1818MiB/s
>> > >   After:  1888MiB/s (+3.85%)
>> > >
>> > > - Zipf biased random read stress test:
>> > >   fio -name=tmpfs --numjobs=16 --directory=/tmpfs \
>> > >     --size=256m --ioengine=mmap --rw=randread --random_distribution=zipf:1.2 \
>> > >     --time_based --ramp_time=1m --runtime=5m --group_reporting
>> > >   (using brd as swap, 2G memcg limit)
>> > >
>> > >   Before: 31.1GiB/s
>> > >   After:  32.3GiB/s (+3.86%)
>> > >
>> > > So cluster readahead doesn't help much even for single file sequential
>> > > read, and for the random stress tests, performance is better without it.
>> > >
>> > > Considering that both memory and the swap device will slowly get more
>> > > fragmented, and that the commonly used ZRAM consumes much more CPU than
>> > > a plain ramdisk, false readahead could occur more frequently and waste
>> > > more CPU. Direct swap-in is cheaper, so use the new helper and skip
>> > > readahead for SWP_SYNCHRONOUS_IO devices.
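>> > >
>> > > The choice between the two paths is roughly the following (a
>> > > simplified sketch; the helper names and arguments here are
>> > > illustrative, not the exact code in this series):
>> > >
>> > >   struct swap_info_struct *si = swp_swap_info(entry);
>> > >   struct folio *folio;
>> > >
>> > >   /* Synchronous devices (zram, brd) complete reads immediately,
>> > >    * so speculative readahead mostly wastes CPU and memory; read
>> > >    * only the single faulting page directly instead. */
>> > >   if (data_race(si->flags & SWP_SYNCHRONOUS_IO) &&
>> > >       __swap_count(entry) == 1)
>> > >           folio = swapin_direct(entry, gfp, mpol, ilx);
>> > >   else
>> > >           folio = swap_cluster_readahead(entry, gfp, mpol, ilx);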
>> >
>> > It's good to take advantage of swap_direct (no readahead). I also hope
>> > we can take advantage of VMA based swapin if shmem is accessed via mmap.
>> > That appears possible.
>>
>> Good idea, that should be doable, will update the series.
>
> Hi Ying,
>
> It turns out to be quite complex to do VMA-based swapin readahead for
> shmem: the VMA / page tables don't contain swap entries for shmem. For
> anon pages, simply reading nearby page table entries is easy and good
> enough, but for shmem the swap entries are stored in the inode mapping,
> so the readahead needs to walk the inode mapping instead. That's doable,
> but requires more work to make it actually usable. I've sent v3 without
> this feature; this readahead extension is worth a separate series.
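>
> To illustrate, such a walk could look roughly like this (a sketch only;
> the helper name and locking details are illustrative):
>
>   /*
>    * Collect swap entries around @index from the shmem mapping, the
>    * rough equivalent of scanning nearby PTEs for anon memory.
>    */
>   static unsigned int shmem_ra_candidates(struct address_space *mapping,
>                   pgoff_t index, swp_entry_t *entries, unsigned int nr)
>   {
>           XA_STATE(xas, &mapping->i_pages, index);
>           unsigned int found = 0;
>           void *entry;
>
>           rcu_read_lock();
>           xas_for_each(&xas, entry, index + nr - 1) {
>                   /* Swapped-out shmem pages are stored as value entries */
>                   if (!xa_is_value(entry))
>                           continue;
>                   entries[found++] = radix_to_swp_entry(entry);
>                   if (found == nr)
>                           break;
>           }
>           rcu_read_unlock();
>           return found;
>   }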

Got it. Thanks for looking at this.
--
Best Regards,
Huang, Ying