Message-ID: <3bdad772-9710-2763-c9a3-fefb3723fdf6@redhat.com>
Date: Tue, 22 Apr 2025 12:29:27 +0200 (CEST)
From: Mikulas Patocka <mpatocka@...hat.com>
To: Dongsheng Yang <dongsheng.yang@...ux.dev>
cc: Jens Axboe <axboe@...nel.dk>, Dan Williams <dan.j.williams@...el.com>,
hch@....de, gregory.price@...verge.com, John@...ves.net,
Jonathan.Cameron@...wei.com, bbhushan2@...vell.com, chaitanyak@...dia.com,
rdunlap@...radead.org, agk@...hat.com, snitzer@...nel.org,
linux-block@...r.kernel.org, linux-kernel@...r.kernel.org,
linux-cxl@...r.kernel.org, linux-bcache@...r.kernel.org,
nvdimm@...ts.linux.dev, dm-devel@...ts.linux.dev
Subject: Re: [RFC PATCH 00/11] pcache: Persistent Memory Cache for Block
Devices
Hi
On Thu, 17 Apr 2025, Dongsheng Yang wrote:
> +ccing md-devel
>
> On 2025/4/16 23:10, Jens Axboe wrote:
> > On 4/16/25 12:08 AM, Dongsheng Yang wrote:
> > > On 2025/4/16 9:04, Jens Axboe wrote:
> > > > On 4/15/25 12:00 PM, Dan Williams wrote:
> > > > > Thanks for making the comparison chart. The immediate question this
> > > > > raises is why not add "multi-tree per backend", "log structured
> > > > > writeback", "readcache", and "CRC" support to dm-writecache?
> > > > > device-mapper is everywhere, has a long track record, and enhancing it
> > > > > immediately engages a community of folks in this space.
> > > > Strongly agree.
> > >
> > > Hi Dan and Jens,
> > > Thanks for your reply, that's a good question.
> > >
> > > 1. Why not optimize within dm-writecache?
> > > From my perspective, the design goal of dm-writecache is to be a
> > > minimal write cache. It achieves caching by dividing the cache device
> > > into n blocks, each managed by a wc_entry, using a very simple
> > > management mechanism. On top of this design, it's quite difficult to
> > > implement features like multi-tree structures, CRC, or log-structured
> > > > writeback. Moreover, adding such optimizations - especially a read
> > > > cache - would deviate from the original semantics of dm-writecache. So,
> > > we didn't consider optimizing dm-writecache to meet our goals.
> > >
> > > 2. Why not optimize within bcache or dm-cache?
> > > As mentioned above, dm-writecache is essentially a minimal write
> > > cache. So, why not build on bcache or dm-cache, which are more
> > > complete caching systems? The truth is, it's also quite difficult.
> > > These systems were designed with traditional SSDs/NVMe in mind, and
> > > many of their design assumptions no longer hold true in the context of
> > > PMEM. Every design targets a specific scenario, which is why, even
> > > with dm-cache available, dm-writecache emerged to support DAX-capable
> > > PMEM devices.
> > >
> > > 3. Then why not implement a full PMEM cache within the dm framework?
> > > > In high-performance IO scenarios - especially with PMEM hardware - adding
> > > an extra DM layer in the IO stack is often unnecessary. For example,
> > > DM performs a bio clone before calling __map_bio(clone) to invoke the
> > > target operation, which introduces overhead.
Device mapper performs (in the common fast case) one allocation per
incoming bio - the allocation contains the outgoing bio and a structure
that may be used for any purpose by the target driver. For interlocking,
it uses RCU, so there is no synchronizing instruction. So, DM overhead is
not big.
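
To illustrate the single-allocation point, here is a minimal userspace
sketch in C. It is not the device mapper code: model_bio, model_io,
model_io_alloc, model_per_io_data and model_alloc_count are hypothetical
names invented for this example, and malloc stands in for whatever
allocator the kernel actually uses. It only shows the shape of the idea:
one object carries both the outgoing bio and a scratch area that the
target driver may use for any purpose.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a block request; NOT the kernel's struct bio. */
struct model_bio {
    uint64_t sector;   /* start sector of the request */
    uint32_t len;      /* length in bytes */
};

/*
 * One object per incoming bio: it embeds the outgoing (cloned) bio and is
 * followed by a per-target scratch area of caller-chosen size.
 */
struct model_io {
    struct model_bio *orig;        /* the incoming bio */
    struct model_bio clone;        /* the outgoing bio, embedded */
    size_t per_io_size;            /* size of the scratch area */
    unsigned char per_io_data[];   /* target-private bytes */
};

/* Counts allocations so the "one per incoming bio" property is visible. */
static unsigned long model_alloc_count;

/* Allocate the clone and the target's scratch area in a single call. */
static struct model_io *model_io_alloc(struct model_bio *orig,
                                       size_t per_io_size)
{
    struct model_io *io;

    assert(orig != NULL);
    io = malloc(sizeof(*io) + per_io_size);
    if (!io)
        return NULL;

    model_alloc_count++;
    io->orig = orig;
    io->clone = *orig;             /* outgoing bio starts as a copy */
    io->per_io_size = per_io_size;
    memset(io->per_io_data, 0, per_io_size);
    return io;
}

/* Return the scratch area; callers should copy in and out with memcpy. */
static void *model_per_io_data(struct model_io *io)
{
    return io->per_io_data;
}

/* A single free releases the clone and the scratch area together. */
static void model_io_free(struct model_io *io)
{
    free(io);
}
```

In this model a target that needs, say, a 64-bit tag per request would
call model_io_alloc(&bio, sizeof(uint64_t)) and store the tag through
model_per_io_data(), without a second allocation.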
> > > Thank you again for the suggestion. I absolutely agree that leveraging
> > > existing frameworks would be helpful in terms of code review, and
> > > merging. I, more than anyone, hope more people can help review the
> > > code or join in this work. However, I believe that in the long run,
> > > building a standalone pcache module is a better choice.
> > I think we'd need much stronger reasons for NOT adopting some kind of dm
> > approach for this, this is really the place to do it. If dm-writecache
> > etc aren't a good fit, add a dm-whatevercache for it? If dm is
> > unnecessarily cloning bios when it doesn't need to, then that seems like
> > something that would be worthwhile fixing in the first place, or at
> > least eliminate for cases that don't need it. That'd benefit everyone,
> > and we would not be stuck with a new stack to manage.
> >
> > Would certainly be worth exploring with the dm folks.
>
> well, introducing dm-pcache (assuming we use this name) could, on one hand,
> attract more users and developers from the device-mapper community to pay
> attention to this project, and on the other hand, serve as a way to validate
> or improve the dm framework’s performance in high-performance I/O scenarios.
> If necessary, we can enhance the dm framework instead of bypassing it
> entirely. This indeed sounds like something that would “benefit everyone.”
>
> Hmm, I will seriously consider this approach.
>
> Hi Alasdair, Mike, Mikulas, Do you have any suggestions?
>
> Thanx
If you create a new self-contained target that doesn't need changes in the
generic dm or block code, it's OK and I would accept that.
Improving dm-writecache is also possible.
Mikulas