linux-kernel - Re: [RFC 6/8] nvmet: Be careful about using iomem accesses when dealing with p2pmem


Date:   Thu, 6 Apr 2017 10:35:21 -0600
From:   Jason Gunthorpe <jgunthorpe@...idianresearch.com>
To:     Sagi Grimberg <sagi@...mberg.me>
Cc:     Logan Gunthorpe <logang@...tatee.com>,
        Christoph Hellwig <hch@....de>,
        "James E.J. Bottomley" <jejb@...ux.vnet.ibm.com>,
        "Martin K. Petersen" <martin.petersen@...cle.com>,
        Jens Axboe <axboe@...nel.dk>,
        Steve Wise <swise@...ngridcomputing.com>,
        Stephen Bates <sbates@...thlin.com>,
        Max Gurtovoy <maxg@...lanox.com>,
        Dan Williams <dan.j.williams@...el.com>,
        Keith Busch <keith.busch@...el.com>, linux-pci@...r.kernel.org,
        linux-scsi@...r.kernel.org, linux-nvme@...ts.infradead.org,
        linux-rdma@...r.kernel.org, linux-nvdimm@...1.01.org,
        linux-kernel@...r.kernel.org
Subject: Re: [RFC 6/8] nvmet: Be careful about using iomem accesses when
 dealing with p2pmem

On Thu, Apr 06, 2017 at 08:33:38AM +0300, Sagi Grimberg wrote:
> 
> >>Note that the nvme completion queues are still on the host memory, so
> >>this means we have lost the ordering between data and completions as
> >>they go to different pcie targets.
> >
> >Hmm, in this simple up/down case with a switch, I think it might
> >actually be OK.
> >
> >Transactions might not complete at the NVMe device before the CPU
> >processes the RDMA completion, however due to the PCI-E ordering rules
> >new TLPs directed to the NVMe will complete after the RDMA TLPs and
> >thus observe the new data. (e.g. order preserving)
> >
> >It would be very hard to use P2P if fabric ordering is not preserved.
> 
> I think it still can race if the p2p device is connected with more than
> a single port to the switch.
> 
> Say it's connected via 2 legs, the bar is accessed from leg A and the
> data from the disk comes via leg B. In this case, the data is heading
> towards the p2p device via leg B (might be congested), the completion
> goes directly to the RC, and then the host issues a read from the
> bar via leg A. I don't understand what can guarantee ordering here.

Right, this is why I qualified my statement with 'simple up/down case'.

Make it any more complex and it clearly stops working sanely, but I
wouldn't worry about unusual PCI-E fabrics at this point.

> Stephen told me that this still guarantees ordering, but I honestly
> can't understand how, perhaps someone can explain to me in a simple
> way that I can understand.

AFAIK PCI-E ordering is explicitly per link, so things that need order
must always traverse the same link.

Jason
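
Editorial illustration (not part of the original message): the two cases
discussed above can be sketched with a toy model. It assumes each link
behaves as a strict FIFO, which is a simplification of the real PCI-E
ordering rules, and that a congested leg simply drains later than an idle
one. The names Link, single_link and two_legs are hypothetical and exist
only for this sketch.

```python
from collections import deque


class Link:
    """Toy one-directional link: TLPs are delivered in the order they were sent."""

    def __init__(self):
        self.queue = deque()

    def send(self, tlp):
        self.queue.append(tlp)

    def deliver_one(self, memory):
        # Pop the oldest TLP and apply it to the target's memory.
        kind, addr, value = self.queue.popleft()
        if kind == "write":
            memory[addr] = value
            return None
        return memory.get(addr)  # a read returns whatever is there right now


def single_link():
    """Simple up/down case: the write and the later read share one link."""
    memory = {0: "stale"}
    link = Link()
    link.send(("write", 0, "new data"))  # data heading to the p2p device
    link.send(("read", 0, None))         # host read issued after the completion
    result = None
    while link.queue:
        result = link.deliver_one(memory)
    return result


def two_legs():
    """Two-leg case: data arrives via leg B, the host read goes via leg A."""
    memory = {0: "stale"}
    leg_a, leg_b = Link(), Link()
    leg_b.send(("write", 0, "new data"))  # assumed congested, drains late
    leg_a.send(("read", 0, None))
    result = leg_a.deliver_one(memory)    # leg A is idle, so the read lands first
    leg_b.deliver_one(memory)             # the write only lands afterwards
    return result


print(single_link())
print(two_legs())
```

In this model the shared link returns the new data because the read queues
behind the write, while the two-leg layout can return the stale value
because nothing orders the two links against each other.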