Message-ID: <3enjvepoexpm567kfyz3bxwr4md7xvsrehgt4hoc54pynuhisu@75nxt6b5cmkb>
Date: Wed, 12 Nov 2025 14:16:20 +0900
From: Sergey Senozhatsky <senozhatsky@...omium.org>
To: Yuwen Chen <ywen.chen@...mail.com>
Cc: senozhatsky@...omium.org, akpm@...ux-foundation.org, axboe@...nel.dk,
bgeffon@...gle.com, licayy@...look.com, linux-block@...r.kernel.org,
linux-kernel@...r.kernel.org, linux-mm@...ck.org, liumartin@...gle.com, minchan@...nel.org,
richardycc@...gle.com
Subject: Re: [PATCH v4] zram: Implement multi-page write-back
On (25/11/10 15:16), Yuwen Chen wrote:
> On 10 Nov 2025 13:49:26 +0900, Sergey Senozhatsky wrote:
> > As a side note:
> > You almost never do sequential writes to the backing device. The
> > thing is, e.g. when zram is used as swap, page faults happen randomly
> > and free up (slot-free) random page-size chunks (so random bits in
> > zram->bitmap become clear), which then get overwritten (zram simply
> > picks the first available bit from zram->bitmap) during the next
> > writeback. There is nothing sequential about that; on systems with
> > sufficiently long uptime and sufficiently frequent writeback/readback
> > events the writeback bitmap becomes sparse, which results in random
> > IO, so your test covers an ideal case that almost never happens in
> > practice.
>
> Thank you very much for your reply.
> As you mentioned, the current test data was measured with purely
> sequential writes, while a normal user environment produces a large
> number of random writes. However, the concurrent multi-page submission
> implemented in this patch still has a performance advantage on the
> storage device. I artificially created the worst-case scenario (all
> writes are random) with the following code:
>
> for (int i = 0; i < nr_pages; i++)
> alloc_block_bdev(zram);
>
> for (int i = 0; i < nr_pages; i += 2)
> free_block_bdev(zram, i);
Well, technically, I guess that's not the worst case. The worst case
is when writeback races with page-fault/slot-free events that clear
->bitmap bits at opposite ends of the writeback device, so writeback
keeps alternating between head and tail slots (->bitmap bits). But you
don't need to test it, it's just a note.
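
For context, slot selection for writeback is basically a first-fit scan
of ->bitmap. A simplified sketch (not the exact zram code; stats and
the reserved bit 0 handling are condensed) looks roughly like this:

	static unsigned long alloc_block_bdev(struct zram *zram)
	{
		unsigned long blk_idx = 1;
	retry:
		/* first-fit: grab the first clear bit (bit 0 is reserved
		 * so that a return value of 0 can mean "no free slot") */
		blk_idx = find_next_zero_bit(zram->bitmap, zram->nr_pages,
					     blk_idx);
		if (blk_idx == zram->nr_pages)
			return 0;
		/* lost a race with a concurrent allocation, rescan */
		if (test_and_set_bit(blk_idx, zram->bitmap))
			goto retry;
		return blk_idx;
	}

Once random slot-free events have punched holes all over ->bitmap,
consecutive allocations return indices that are far apart, so the
writeback IO pattern ends up random rather than sequential.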
The thing I'm curious about is why this helps on flash storage. It's
not a spinning disk, where seek time dominates the IO time.
> On the physical machine, the measured data is as follows:
> before modification:
> real 0m0.624s
> user 0m0.000s
> sys 0m0.347s
>
> real 0m0.663s
> user 0m0.001s
> sys 0m0.354s
>
> real 0m0.635s
> user 0m0.000s
> sys 0m0.335s
>
> after modification:
> real 0m0.340s
> user 0m0.000s
> sys 0m0.239s
>
> real 0m0.326s
> user 0m0.000s
> sys 0m0.230s
>
> real 0m0.313s
> user 0m0.000s
> sys 0m0.223s
Thanks for testing.
My next question is: what problem are you solving with this? I mean,
do you use it in production (somewhere)? If so, do you have rough
numbers for how many MiBs you write back and how often, and what the
performance impact of this patch is? Again, only if you use it in
production.