Message-ID: <8d383dc6-819b-2c7f-bab5-2cd113ed9ece@redhat.com>
Date: Mon, 30 Jun 2025 15:30:48 +0200 (CEST)
From: Mikulas Patocka <mpatocka@...hat.com>
To: Dongsheng Yang <dongsheng.yang@...ux.dev>
cc: agk@...hat.com, snitzer@...nel.org, axboe@...nel.dk, hch@....de,
dan.j.williams@...el.com, Jonathan.Cameron@...wei.com,
linux-block@...r.kernel.org, linux-kernel@...r.kernel.org,
linux-cxl@...r.kernel.org, nvdimm@...ts.linux.dev,
dm-devel@...ts.linux.dev
Subject: Re: [PATCH v1 00/11] dm-pcache – persistent-memory cache for block devices
On Tue, 24 Jun 2025, Dongsheng Yang wrote:
> Hi Mikulas,
> This is V1 for dm-pcache, please take a look.
>
> Code:
> https://github.com/DataTravelGuide/linux tags/pcache_v1
>
> Changelogs from RFC-V2:
> - use crc32c instead of crc32.
> - only retry pcache_req when the cache is full: add pcache_req to
> defer_list and wait for cache invalidation to happen.
> - new format for the pcache table; it is more easily extended with
> new parameters later.
> - remove __packed.
> - use spin_lock_irq instead of spin_lock_irqsave in req_complete_fn.
> - fix a bug in backing_dev_bio_end with spin_lock_irqsave.
> - call queue_work() inside the spinlock.
> - introduce inline_bvecs in backing_dev_req.
> - use kmalloc_array for the bvecs allocation.
> - calculate ->off with dm_target_offset() before using it.
Hi
The out-of-memory handling still doesn't seem right.

If the GFP_NOWAIT allocation fails (which may happen at any time - for
example, when the machine is receiving network packets faster than the
swapper can swap out data), create_cache_miss_req returns NULL, the
caller converts the NULL to -ENOMEM, cache_read returns -ENOMEM, the
-ENOMEM is propagated up to end_req, and end_req sets the status to
BLK_STS_RESOURCE. So, I/Os may randomly fail with an error.
The proper fix is to use mempools. A mempool allocation waits until some
other process frees an object back into the pool, so it cannot fail.
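I mean something like this (untested sketch - miss_req_cache,
miss_req_pool, the reserve size and the structure layout are made-up
names for illustration, not taken from your patch):

	#include <linux/mempool.h>
	#include <linux/slab.h>
	#include <linux/list.h>

	/* hypothetical stand-in for the real miss-request structure */
	struct cache_miss_req {
		struct list_head list;
		/* ... */
	};

	static struct kmem_cache *miss_req_cache;
	static mempool_t *miss_req_pool;

	static int miss_req_pool_init(void)
	{
		miss_req_cache = KMEM_CACHE(cache_miss_req, 0);
		if (!miss_req_cache)
			return -ENOMEM;
		/* reserve 16 objects; mempool_alloc falls back to the
		   reserve when the slab allocation fails */
		miss_req_pool = mempool_create_slab_pool(16, miss_req_cache);
		if (!miss_req_pool) {
			kmem_cache_destroy(miss_req_cache);
			return -ENOMEM;
		}
		return 0;
	}

	static struct cache_miss_req *alloc_miss_req(void)
	{
		/* GFP_NOIO: may sleep, never fails while the pool exists */
		return mempool_alloc(miss_req_pool, GFP_NOIO);
	}
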
If you need to allocate memory inside a spinlock, you can't do it
reliably (you can't sleep inside a spinlock, and a non-sleeping memory
allocation may fail at any time). So, in this case, you should drop the
spinlock, allocate the memory from a mempool with GFP_NOIO and jump back
to grab the spinlock - now you are holding the allocated object, so you
can use it while you hold the spinlock.
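For example (untested; need_miss_req(), queue_miss_req() and the lock
placement are placeholders for whatever your code actually checks):

	struct cache_miss_req *req = NULL;

	spin_lock(&cache->lock);
	while (need_miss_req(cache) && !req) {
		/* can't sleep under the lock, so drop it for the allocation */
		spin_unlock(&cache->lock);
		req = mempool_alloc(miss_req_pool, GFP_NOIO); /* sleeps, never fails */
		spin_lock(&cache->lock);
		/* state may have changed while unlocked - the loop re-checks */
	}
	if (need_miss_req(cache)) {
		queue_miss_req(cache, req);	/* consumes req under the lock */
		req = NULL;
	}
	spin_unlock(&cache->lock);
	if (req)
		mempool_free(req, miss_req_pool); /* raced, no longer needed */
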
Another comment:
set_bit/clear_bit use atomic instructions, which are slow. Since you
already hold a spinlock when calling them, you don't need the atomicity,
so you can replace them with __set_bit and __clear_bit.
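I.e. something like this (SEG_DIRTY and the surrounding names are just
an example):

	spin_lock(&cache->lock);
	/* the non-atomic variants are fine here - the spinlock already
	   serializes all writers of seg->flags, so the locked
	   read-modify-write cycle of set_bit/clear_bit is wasted */
	__set_bit(SEG_DIRTY, &seg->flags);
	/* ... */
	__clear_bit(SEG_DIRTY, &seg->flags);
	spin_unlock(&cache->lock);
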
Mikulas