[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20210507204052.GA1485586@dhcp-10-100-145-180.wdc.com>
Date: Fri, 7 May 2021 13:40:52 -0700
From: Keith Busch <kbusch@...nel.org>
To: Sagi Grimberg <sagi@...mberg.me>
Cc: Hannes Reinecke <hare@...e.de>,
"Ewan D. Milne" <emilne@...hat.com>,
Daniel Wagner <dwagner@...e.de>,
linux-nvme@...ts.infradead.org, linux-kernel@...r.kernel.org,
Jens Axboe <axboe@...com>, Christoph Hellwig <hch@....de>
Subject: Re: [PATCH v2] nvme-tcp: Check if request has started before
processing it
On Fri, May 07, 2021 at 01:26:11PM -0700, Sagi Grimberg wrote:
>
> > > > > I would also mention, that a crash is not even the scariest issue that
> > > > > we can see here, because if the request happened to be reused we are
> > > > > in the silent data corruption realm...
> > > >
> > > > If this does happen, I think we have to come up with some way to
> > > > mitigate it. We're not utilizing the full 16 bits of the command_id, so
> > > > maybe we can append something like a generation sequence number that can
> > > > be checked for validity.
> > >
> > > That's actually a great idea. scsi needs unique tags so it encodes the
> > > hwq in the upper 16 bits giving the actual tag the lower 16 bits which
> > > is more than enough for a single queue. We can do the same with
> > > a gencnt that will increment in both submission and completion and we
> > > can validate against it.
> > >
> > > This will be useful for all transports, so maintaining it in
> > > nvme_req(rq)->genctr and introducing a helper like:
> > > rq = nvme_find_tag(tagset, cqe->command_id)
> > > That will filter genctr, locate the request.
> > >
> > > Also:
> > > nvme_validate_request_gen(rq, cqe->command_id) that would
> > > compare against it.
> > >
> > >
> > > And then a helper to set the command_id like:
> > > cmd->common.command_id = nvme_request_command_id(rq)
> > > that will both increment the genctr and build a command_id
> > > from it.
> > >
> > > Thoughts?
> > >
> >
> > Well, that would require a modification to the CQE specification, no?
> > fmds was not amused when I proposed that :-(
>
> Why would that require a modification to the CQE? it's just using say
> 4 msbits of the command_id to a running sequence...
I think Hannes was under the impression that the counter proposal wasn't
part of the "command_id". The host can encode whatever it wants in that
value, and the controller just has to return the same value.
Powered by blists - more mailing lists