Message-ID: <ba009e79-e82a-478d-bf2b-52b964141c11@nvidia.com>
Date: Wed, 16 Jul 2025 21:56:47 +0000
From: Chaitanya Kulkarni <chaitanyak@...dia.com>
To: Rick Wertenbroek <rick.wertenbroek@...il.com>
CC: "rick.wertenbroek@...g-vd.ch" <rick.wertenbroek@...g-vd.ch>,
"dlemoal@...nel.org" <dlemoal@...nel.org>, "alberto.dassatti@...g-vd.ch"
<alberto.dassatti@...g-vd.ch>, "stable@...r.kernel.org"
<stable@...r.kernel.org>, Christoph Hellwig <hch@....de>, Sagi Grimberg
<sagi@...mberg.me>, Chaitanya Kulkarni <chaitanyak@...dia.com>,
Krzysztof Wilczyński <kwilczynski@...nel.org>, Manivannan
Sadhasivam <mani@...nel.org>, Keith Busch <kbusch@...nel.org>,
"linux-nvme@...ts.infradead.org" <linux-nvme@...ts.infradead.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v2 1/1] nvmet: pci-epf: Do not complete commands twice if
nvmet_req_init() fails
On 7/16/25 04:15, Rick Wertenbroek wrote:
> Have nvmet_req_init() and req->execute() complete failed commands.
>
> Description of the problem:
> nvmet_req_init() calls __nvmet_req_complete() internally upon failure,
> e.g., unsupported opcode, which calls the "queue_response" callback,
> this results in nvmet_pci_epf_queue_response() being called, which will
> call nvmet_pci_epf_complete_iod() if data_len is 0 or if dma_dir is
> different from DMA_TO_DEVICE. This results in a double completion as
> nvmet_pci_epf_exec_iod_work() also calls nvmet_pci_epf_complete_iod()
> when nvmet_req_init() fails.
>
> Steps to reproduce:
> On the host, send a command with an unsupported opcode using nvme-cli,
> for example the admin command "security receive":
> $ sudo nvme security-recv /dev/nvme0n1 -n1 -x4096
>
> This triggers a double completion as nvmet_req_init() fails and
> nvmet_pci_epf_queue_response() is called, here iod->dma_dir is still
> in the default state of "DMA_NONE" as set by default in
> nvmet_pci_epf_alloc_iod(), so nvmet_pci_epf_complete_iod() is called.
> Because nvmet_req_init() failed nvmet_pci_epf_complete_iod() is also
> called in nvmet_pci_epf_exec_iod_work() leading to a double completion.
> This not only sends two completions to the host but also corrupts the
> state of the PCI NVMe target leading to kernel oops.
>
> This patch lets nvmet_req_init() and req->execute() complete all failed
> commands, and removes the double completion case in
> nvmet_pci_epf_exec_iod_work() therefore fixing the edge cases where
> double completions occurred.
>
> Signed-off-by: Rick Wertenbroek <rick.wertenbroek@...il.com>
> Reviewed-by: Damien Le Moal <dlemoal@...nel.org>
> Fixes: 0faa0fe6f90e ("nvmet: New NVMe PCI endpoint function target driver")
> Cc: stable@...r.kernel.org
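
For readers following the control flow described in the quoted commit message, here is a minimal userspace sketch of the double completion. It is a toy model, not the driver code: every `toy_*` name, the `patched` flag, and the simplified success path are assumptions made for illustration. Only the completion condition (data_len is 0, or dma_dir differs from DMA_TO_DEVICE) is taken from the description above.

```c
#include <stdbool.h>

/* Toy stand-ins for the DMA directions named in the commit message. */
enum toy_dma_dir { TOY_DMA_NONE, TOY_DMA_TO_DEVICE, TOY_DMA_FROM_DEVICE };

/* Hypothetical, heavily reduced I/O descriptor; not the real iod. */
struct toy_iod {
	enum toy_dma_dir dma_dir;
	unsigned int data_len;
	int completions;	/* how many times the command was completed */
};

/* Models nvmet_pci_epf_complete_iod(): just count completions. */
static void toy_complete_iod(struct toy_iod *iod)
{
	iod->completions++;
}

/*
 * Models the "queue_response" callback as described in the commit
 * message: complete if there is no data or the direction is not
 * TO_DEVICE.
 */
static void toy_queue_response(struct toy_iod *iod)
{
	if (iod->data_len == 0 || iod->dma_dir != TOY_DMA_TO_DEVICE)
		toy_complete_iod(iod);
}

/*
 * Models nvmet_req_init(): on failure (e.g. unsupported opcode) the
 * request is completed internally through the queue_response callback.
 */
static bool toy_req_init(struct toy_iod *iod, bool opcode_supported)
{
	if (!opcode_supported) {
		toy_queue_response(iod);
		return false;
	}
	return true;
}

/*
 * Models nvmet_pci_epf_exec_iod_work(). With patched == false the
 * caller completes the command again after a failed init, which is
 * the double completion. Returns the number of completions seen.
 */
int toy_exec_iod_work(bool opcode_supported, bool patched)
{
	struct toy_iod iod = {
		.dma_dir = TOY_DMA_NONE,	/* default state after allocation */
		.data_len = 4096,
		.completions = 0,
	};

	if (!toy_req_init(&iod, opcode_supported)) {
		if (!patched)
			toy_complete_iod(&iod);	/* second completion */
		return iod.completions;
	}

	/* Simplified success path: the command completes exactly once. */
	toy_complete_iod(&iod);
	return iod.completions;
}
```

In this model, `toy_exec_iod_work(false, false)` counts two completions for an unsupported opcode, while `toy_exec_iod_work(false, true)` counts one, which is the behavior the patch aims for.
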
Good catch, looks good. I wish we had tests for this part of the target
so that it would get tested on a regular basis. Not a requirement, just
a thought.
Reviewed-by: Chaitanya Kulkarni <kch@...dia.com>
-ck