Message-ID: <aYx1xoY_yeNvTtF2@kbusch-mbp>
Date: Wed, 11 Feb 2026 05:27:50 -0700
From: Keith Busch <kbusch@...nel.org>
To: Junnan Zhang <zhangjn_dev@....com>
Cc: axboe@...nel.dk, hch@....de, linux-kernel@...r.kernel.org,
linux-nvme@...ts.infradead.org, liuyx92@...natelecom.cn,
sagi@...mberg.me, sunshx@...natelecom.cn, yuanql9@...natelecom.cn,
zhangjn11@...natelecom.cn, zhangzl68@...natelecom.cn
Subject: Re: [PATCH] nvme-pci: fix potential I/O hang when CQ is full

On Wed, Feb 11, 2026 at 05:47:44PM +0800, Junnan Zhang wrote:
> On Tue, 10 Feb 2026 16:57:12 +0100, Christoph Hellwig wrote:
>
> > We can't update the CQ head before consuming the CQEs, otherwise
> > the device can reuse them. And devices must not discard completions
> > when there is no free completion queue entry; NVMe does allow SQs
> > and CQs to be smaller than the number of outstanding commands.
>
> Updating the CQ head before consuming the CQE would not cause the
> device to reuse those entries: the driver only submits new commands
> after the CQE has been consumed, so the device never gets the
> opportunity to reuse them.

That's just an artifact of how this host implementation constrains its
tag space. It's not a reflection of how the NVMe protocol fundamentally
works.
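
To make the ordering concrete, here's a minimal sketch of a CQ
consumer in C (illustrative types and names only -- sketch_cq,
process_cq and friends are not the actual nvme-pci code, and bit 0 of
"status" stands in for the phase tag): the CQE has to be consumed
before the head moves, because the doorbell write is what hands the
slot back to the device.

#include <stdint.h>

struct sketch_cqe {
	uint16_t command_id;
	uint16_t status;		/* bit 0: phase tag */
};

struct sketch_cq {
	struct sketch_cqe *entries;
	volatile uint32_t *doorbell;
	uint16_t head, depth;
	uint8_t phase;
};

static void complete_command(uint16_t command_id)
{
	/* hand the result back to the submitter; details elided */
	(void)command_id;
}

static void process_cq(struct sketch_cq *cq)
{
	while ((cq->entries[cq->head].status & 1) == cq->phase) {
		/* 1) consume the CQE while the host still owns the slot */
		complete_command(cq->entries[cq->head].command_id);

		/* 2) only then advance the head; a wrap flips the phase */
		if (++cq->head == cq->depth) {
			cq->head = 0;
			cq->phase ^= 1;
		}
	}
	/* 3) the doorbell write frees the consumed slots for reuse */
	*cq->doorbell = cq->head;
}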

A full queue is not an error. It's a spec-defined condition that the
submitter simply has to deal with. The protocol was specifically
designed to allow dispatching more outstanding commands than the
queues can hold. Your controller is broken.
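
As a sketch of what "deal with" means on the submission side (again
made-up names, not driver code): a full SQ is back-pressure, so the
caller backs off and retries after the next completion instead of
failing the command.

#include <errno.h>
#include <stdint.h>

struct sketch_sqe {
	uint8_t bytes[64];		/* an SQE is 64 bytes */
};

struct sketch_sq {
	struct sketch_sqe *entries;
	volatile uint32_t *doorbell;
	uint16_t head, tail, depth;	/* head as last reported in CQEs */
};

static int sq_is_full(const struct sketch_sq *sq)
{
	return (uint16_t)((sq->tail + 1) % sq->depth) == sq->head;
}

static int submit_command(struct sketch_sq *sq,
			  const struct sketch_sqe *sqe)
{
	if (sq_is_full(sq))
		return -EBUSY;	/* caller requeues; not a failure */

	sq->entries[sq->tail] = *sqe;
	sq->tail = (sq->tail + 1) % sq->depth;
	*sq->doorbell = sq->tail;	/* publish the new tail */
	return 0;
}

In blk-mq terms that's the same idea as returning BLK_STS_RESOURCE
from ->queue_rq and letting the block layer requeue the request.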