Message-ID: <3b9a2b25-9476-4806-be91-254dda385f38@linux.dev>
Date: Tue, 22 Apr 2025 21:23:52 +0800
From: Dongsheng Yang <dongsheng.yang@...ux.dev>
To: Mikulas Patocka <mpatocka@...hat.com>
Cc: Jens Axboe <axboe@...nel.dk>, Dan Williams <dan.j.williams@...el.com>,
hch@....de, gregory.price@...verge.com, John@...ves.net,
Jonathan.Cameron@...wei.com, bbhushan2@...vell.com, chaitanyak@...dia.com,
rdunlap@...radead.org, agk@...hat.com, snitzer@...nel.org,
linux-block@...r.kernel.org, linux-kernel@...r.kernel.org,
linux-cxl@...r.kernel.org, linux-bcache@...r.kernel.org,
nvdimm@...ts.linux.dev, dm-devel@...ts.linux.dev
Subject: Re: [RFC PATCH 00/11] pcache: Persistent Memory Cache for Block Devices

On 2025/4/22 18:29, Mikulas Patocka wrote:
> Hi
>
>
> On Thu, 17 Apr 2025, Dongsheng Yang wrote:
>
>> +ccing dm-devel
>>
>> On 2025/4/16 23:10, Jens Axboe wrote:
>>> On 4/16/25 12:08 AM, Dongsheng Yang wrote:
>>>> On 2025/4/16 9:04, Jens Axboe wrote:
>>>>> On 4/15/25 12:00 PM, Dan Williams wrote:
>>>>>> Thanks for making the comparison chart. The immediate question this
>>>>>> raises is why not add "multi-tree per backend", "log structured
>>>>>> writeback", "readcache", and "CRC" support to dm-writecache?
>>>>>> device-mapper is everywhere, has a long track record, and enhancing it
>>>>>> immediately engages a community of folks in this space.
>>>>> Strongly agree.
>>>> Hi Dan and Jens,
>>>> Thanks for your reply, that's a good question.
>>>>
>>>> 1. Why not optimize within dm-writecache?
>>>> From my perspective, the design goal of dm-writecache is to be a
>>>> minimal write cache. It achieves caching by dividing the cache device
>>>> into n blocks, each managed by a wc_entry, using a very simple
>>>> management mechanism. On top of this design, it's quite difficult to
>>>> implement features like multi-tree structures, CRC, or log-structured
>>>> writeback. Moreover, adding such optimizations, especially a read
>>>> cache, would deviate from the original semantics of dm-writecache. So,
>>>> we didn't consider optimizing dm-writecache to meet our goals.
>>>>
>>>> 2. Why not optimize within bcache or dm-cache?
>>>> As mentioned above, dm-writecache is essentially a minimal write
>>>> cache. So, why not build on bcache or dm-cache, which are more
>>>> complete caching systems? The truth is, it's also quite difficult.
>>>> These systems were designed with traditional SSDs/NVMe in mind, and
>>>> many of their design assumptions no longer hold true in the context of
>>>> PMEM. Every design targets a specific scenario, which is why, even
>>>> with dm-cache available, dm-writecache emerged to support DAX-capable
>>>> PMEM devices.
>>>>
>>>> 3. Then why not implement a full PMEM cache within the dm framework?
>>>> In high-performance IO scenarios, especially with PMEM hardware, adding
>>>> an extra DM layer in the IO stack is often unnecessary. For example,
>>>> DM performs a bio clone before calling __map_bio(clone) to invoke the
>>>> target operation, which introduces overhead.
> Device mapper performs (in the common fast case) one allocation per
> incoming bio - the allocation contains the outgoing bio and a structure
> that may be used for any purpose by the target driver. For interlocking,
> it uses RCU, so there is no synchronizing instruction. So, DM overhead is
> not big.
>
>>>> Thank you again for the suggestion. I absolutely agree that leveraging
>>>> existing frameworks would be helpful in terms of code review and
>>>> merging. I, more than anyone, hope more people can help review the
>>>> code or join in this work. However, I believe that in the long run,
>>>> building a standalone pcache module is a better choice.
>>> I think we'd need much stronger reasons for NOT adopting some kind of dm
>>> approach for this, this is really the place to do it. If dm-writecache
>>> etc aren't a good fit, add a dm-whatevercache for it? If dm is
>>> unnecessarily cloning bios when it doesn't need to, then that seems like
>>> something that would be worthwhile fixing in the first place, or at
>>> least eliminate for cases that don't need it. That'd benefit everyone,
>>> and we would not be stuck with a new stack to manage.
>>>
>>> Would certainly be worth exploring with the dm folks.
>> Well, introducing dm-pcache (assuming we use this name) could, on one hand,
>> attract more users and developers from the device-mapper community to pay
>> attention to this project, and on the other hand, serve as a way to validate
>> or improve the dm framework’s performance in high-performance I/O scenarios.
>> If necessary, we can enhance the dm framework instead of bypassing it
>> entirely. This indeed sounds like something that would “benefit everyone.”
>>
>> Hmm, I will seriously consider this approach.
>>
>> Hi Alasdair, Mike, Mikulas, do you have any suggestions?
>>
>> Thanx
> If you create a new self-contained target that doesn't need changes in the
> generic dm or block code, it's OK and I would accept that.
I will try to port pcache to dm as a new self-contained target.
Thanx
Dongsheng
>
> Improving dm-writecache is also possible.
>
> Mikulas