[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20210215104020.yyithlo2hkxqvguj@beryllium.lan>
Date: Mon, 15 Feb 2021 11:40:20 +0100
From: Daniel Wagner <dwagner@...e.de>
To: Hannes Reinecke <hare@...e.de>
Cc: Sagi Grimberg <sagi@...mberg.me>, Keith Busch <kbusch@...nel.org>,
Jens Axboe <axboe@...com>, Christoph Hellwig <hch@....de>,
linux-nvme@...ts.infradead.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] nvme-tcp: Check if request has started before processing
it
On Sat, Feb 13, 2021 at 09:46:41AM +0100, Hannes Reinecke wrote:
> On 2/12/21 10:49 PM, Sagi Grimberg wrote:
> >
> > > > > blk_mq_tag_to_rq() will always return a request if the command_id is
> > > > > in the valid range. Check if the request has been started. If we
> > > > > blindly process the request we might double complete a request which
> > > > > can be fatal.
> > > >
> > > > How did you get to this one? did the controller send a completion for
> > > > a completed/bogus request?
> > >
> > > If that is the case, then that must mean it's possible the driver could
> > > have started the command id just before the bogus completion check. Data
> > > iorruption, right?
'during TCP LIF toggles and aggr relocates' testing the host
crashes. TBH, I do not really know what is happening or what the test
does. Still trying to figure out what's going on. I was just very
surprised how much the code trusts the other side to behave correctly.
> > Yes, which is why I don't think this check is very useful..
>
> I actually view that as a valid protection against spoofed frames.
> Without it it's easy to crash the machine by injecting fake completions with
> random command ids.
In this test scenario it's not even a spoofed frame; maybe just a
confused controller.
Powered by blists - more mailing lists