Message-ID: <8d383dc6-819b-2c7f-bab5-2cd113ed9ece@redhat.com>
Date: Mon, 30 Jun 2025 15:30:48 +0200 (CEST)
From: Mikulas Patocka <mpatocka@...hat.com>
To: Dongsheng Yang <dongsheng.yang@...ux.dev>
cc: agk@...hat.com, snitzer@...nel.org, axboe@...nel.dk, hch@....de,
dan.j.williams@...el.com, Jonathan.Cameron@...wei.com,
linux-block@...r.kernel.org, linux-kernel@...r.kernel.org,
linux-cxl@...r.kernel.org, nvdimm@...ts.linux.dev,
dm-devel@...ts.linux.dev
Subject: Re: [PATCH v1 00/11] dm-pcache – persistent-memory cache for block devices
On Tue, 24 Jun 2025, Dongsheng Yang wrote:
> Hi Mikulas,
> This is V1 for dm-pcache, please take a look.
>
> Code:
> https://github.com/DataTravelGuide/linux tags/pcache_v1
>
> Changelogs from RFC-V2:
> - use crc32c instead of crc32.
> - only retry pcache_req when the cache is full: add pcache_req to
> defer_list and wait for cache invalidation to happen.
> - new format for the pcache table; it is more easily extended with
> new parameters later.
> - remove __packed.
> - use spin_lock_irq instead of spin_lock_irqsave in req_complete_fn.
> - fix a bug in backing_dev_bio_end with spin_lock_irqsave.
> - call queue_work() inside the spinlock.
> - introduce inline_bvecs in backing_dev_req.
> - use kmalloc_array for the bvecs allocation.
> - calculate ->off with dm_target_offset() before using it.
Hi
The out-of-memory handling still doesn't seem right.

If the GFP_NOWAIT allocation fails (which may happen at any time - for
example, when the machine is receiving network packets faster than the
swapper can swap out data), create_cache_miss_req returns NULL, the
caller converts the NULL to -ENOMEM, cache_read returns -ENOMEM, the
-ENOMEM is propagated up to end_req, and end_req sets the status to
BLK_STS_RESOURCE. So, I/Os may randomly fail with an error.
The proper fix is to use mempools. A mempool allocation waits until some
other process frees an object back into the pool, so it cannot fail.
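I mean something like this (untested sketch - miss_req_cache,
miss_req_pool, the reserve size and the structure layout are made-up
names for illustration, not taken from your patch):

	#include <linux/mempool.h>
	#include <linux/slab.h>
	#include <linux/list.h>

	/* hypothetical stand-in for the real miss-request structure */
	struct cache_miss_req {
		struct list_head list;
		/* ... */
	};

	static struct kmem_cache *miss_req_cache;
	static mempool_t *miss_req_pool;

	static int miss_req_pool_init(void)
	{
		miss_req_cache = KMEM_CACHE(cache_miss_req, 0);
		if (!miss_req_cache)
			return -ENOMEM;
		/* reserve 16 objects; mempool_alloc falls back to the
		   reserve when the slab allocation fails */
		miss_req_pool = mempool_create_slab_pool(16, miss_req_cache);
		if (!miss_req_pool) {
			kmem_cache_destroy(miss_req_cache);
			return -ENOMEM;
		}
		return 0;
	}

	static struct cache_miss_req *alloc_miss_req(void)
	{
		/* GFP_NOIO: may sleep, never fails while the pool exists */
		return mempool_alloc(miss_req_pool, GFP_NOIO);
	}
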
If you need to allocate memory inside a spinlock, you can't do it
reliably (you can't sleep inside a spinlock, and a non-sleeping memory
allocation may fail at any time). So, in this case, you should drop the
spinlock, allocate the memory from a mempool with GFP_NOIO and jump back
to grab the spinlock - now you are holding the allocated object, so you
can use it while you hold the spinlock.
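For example (untested; need_miss_req(), queue_miss_req() and the lock
placement are placeholders for whatever your code actually checks):

	struct cache_miss_req *req = NULL;

	spin_lock(&cache->lock);
	while (need_miss_req(cache) && !req) {
		/* can't sleep under the lock, so drop it for the allocation */
		spin_unlock(&cache->lock);
		req = mempool_alloc(miss_req_pool, GFP_NOIO); /* sleeps, never fails */
		spin_lock(&cache->lock);
		/* state may have changed while unlocked - the loop re-checks */
	}
	if (need_miss_req(cache)) {
		queue_miss_req(cache, req);	/* consumes req under the lock */
		req = NULL;
	}
	spin_unlock(&cache->lock);
	if (req)
		mempool_free(req, miss_req_pool); /* raced, no longer needed */
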
Another comment:
set_bit/clear_bit use atomic instructions, which are slow. Since you
already hold a spinlock when calling them, you don't need the atomicity,
so you can replace them with __set_bit and __clear_bit.
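I.e. something like this (SEG_DIRTY and the surrounding names are just
an example):

	spin_lock(&cache->lock);
	/* the non-atomic variants are fine here - the spinlock already
	   serializes all writers of seg->flags, so the locked
	   read-modify-write cycle of set_bit/clear_bit is wasted */
	__set_bit(SEG_DIRTY, &seg->flags);
	/* ... */
	__clear_bit(SEG_DIRTY, &seg->flags);
	spin_unlock(&cache->lock);
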
Mikulas