linux-kernel - Re: [PATCH] nvme-tcp: Check if request has started before processing it

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <a2064070-b511-ba6d-bd64-0b3abc208356@grimberg.me>
Date:   Mon, 15 Feb 2021 13:29:45 -0800
From:   Sagi Grimberg <sagi@...mberg.me>
To:     Daniel Wagner <dwagner@...e.de>, Hannes Reinecke <hare@...e.de>
Cc:     Keith Busch <kbusch@...nel.org>, Jens Axboe <axboe@...com>,
        Christoph Hellwig <hch@....de>, linux-nvme@...ts.infradead.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH] nvme-tcp: Check if request has started before processing
 it


>>>>>> blk_mq_tag_to_rq() will always return a request if the command_id is
>>>>>> in the valid range. Check if the request has been started. If we
>>>>>> blindly process the request we might double complete a request which
>>>>>> can be fatal.
>>>>>
>>>>> How did you get to this one? did the controller send a completion for
>>>>> a completed/bogus request?
>>>>
>>>> If that is the case, then that must mean it's possible the driver could
>>>> have started the command id just before the bogus completion check. Data
>>>> iorruption, right?
> 
> 'during TCP LIF toggles and aggr relocates' testing the host
> crashes. TBH, I do not really know what is happening or what the test
> does. Still trying to figure out what's going on.

Well, I think we should probably figure out why that is happening first.

> I was just very
> surprised how much the code trusts the other side to behave correctly./

What does pci/rdma/fc does differently? What does scsi do here?

>>> Yes, which is why I don't think this check is very useful..
>>
>> I actually view that as a valid protection against spoofed frames.
>> Without it it's easy to crash the machine by injecting fake completions with
>> random command ids.
> 
> In this test scenario it's not even a spoofed frame; maybe just a
> confused controller.

Maybe... I am still not sure how this patch helps here though...