Message-ID: <aXdjqueZ8d8Se61A@infradead.org>
Date: Mon, 26 Jan 2026 04:52:58 -0800
From: Christoph Hellwig <hch@...radead.org>
To: zhangshida2026@....com
Cc: colyli@...as.com, kent.overstreet@...ux.dev, hch@...radead.org,
axboe@...nel.dk, osandov@...com, bvanassche@....org,
linux-bcache@...r.kernel.org, linux-kernel@...r.kernel.org,
zhangshida@...inos.cn, starzhangzsd@...il.com
Subject: Re: [PATCH v2] bcache: fix I/O accounting leak in detached_dev_do_request
On Mon, Jan 26, 2026 at 05:28:54PM +0800, zhangshida2026@....com wrote:
> From: Shida Zhang <zhangshida@...inos.cn>
>
> When a bcache device is in a detached state, iostat can show 100%
> utilization even after I/O workload completion.
>
> This happens because the caller, cached_dev_make_request(), calls
> bio_start_io_acct() to begin accounting. However, if the bio hits an
> early exit path in detached_dev_do_request()—either due to an
> unsupported discard request or a bio_alloc_clone() failure—the
> corresponding bio_end_io_acct() is never called. This leaves the
> in-flight counter permanently incremented, causing the kernel to
> report the device as 100% busy.
>
> Add the missing bio_end_io_acct() calls to these error/early-exit
> paths to ensure proper I/O accounting.
>
> Fixes: d62e26b3ffd28 ("block: pass in queue to inflight accounting")
I don't think that is correct. This was just a trivial calling
convention change.
From doing a quick git-blame chain, this looks like the culprit:
bc082a55d25c837341709accaf11311c3a9af727
Author: Tang Junhui <tang.junhui@....com.cn>
Date: Sun Mar 18 17:36:19 2018 -0700
bcache: fix inaccurate io state for detached bcache devices
> + bio_end_io_acct(orig_bio, start_time);
> bio_endio(orig_bio);
> return;
> }
> @@ -1114,6 +1115,7 @@ static void detached_dev_do_request(struct bcache_device *d,
> clone_bio = bio_alloc_clone(dc->bdev, orig_bio, GFP_NOIO,
> &d->bio_detached);
> if (!clone_bio) {
> + bio_end_io_acct(orig_bio, start_time);
> orig_bio->bi_status = BLK_STS_RESOURCE;
> bio_endio(orig_bio);
> return;
This is begging to use a goto label to share code, if it weren't for the
fact that bio_alloc_clone with GFP_NOIO will never return NULL, because
both the bio itself and the crypt or integrity information are backed by
mempools.
So this second copy of the code is actually dead and should be removed
in a prep patch before this one. Sorry for not catching this earlier.
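For illustration only, here is a minimal user-space sketch of the "goto label
to share code" idea mentioned above. It is not the bcache code: fake_dev,
fake_bio and every fake_* function are hypothetical stand-ins, and the
clone_ok flag exists only to model a second early exit (which, per the note
above, cannot actually trigger in the kernel with GFP_NOIO). The point is
that every early exit ends the accounting in one place, so the in-flight
counter cannot leak.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel objects; not the real bcache API. */
struct fake_dev { int in_flight; };
struct fake_bio { bool is_discard; int status; bool ended; };

/* Models bio_start_io_acct(): one more request in flight. */
static void fake_start_acct(struct fake_dev *d) { d->in_flight++; }

/* Models bio_end_io_acct(): must pair with exactly one start. */
static void fake_end_acct(struct fake_dev *d)
{
	assert(d->in_flight > 0);	/* catches an unbalanced end */
	d->in_flight--;
}

/* Models bio_endio(): the request is completed back to the submitter. */
static void fake_endio(struct fake_bio *bio) { bio->ended = true; }

/*
 * The caller has already called fake_start_acct(). Returns true if the
 * request was passed on, in which case the completion path is expected
 * to end the accounting later. On any early exit the shared label ends
 * the accounting and completes the bio.
 */
static bool fake_detached_request(struct fake_dev *d, struct fake_bio *bio,
				  bool discard_supported, bool clone_ok)
{
	if (bio->is_discard && !discard_supported)
		goto out_end;		/* unsupported discard: complete now */

	if (!clone_ok) {		/* assumed failure path, for illustration */
		bio->status = -1;
		goto out_end;
	}

	return true;			/* normal path: accounting stays open */

out_end:
	fake_end_acct(d);
	fake_endio(bio);
	return false;
}
```

With only one early exit left after the dead branch is removed, the label
buys nothing and a plain inline bio_end_io_acct() call is just as readable.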