[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <4df229d8-8124-664a-9bc4-6401bc034be1@grimberg.me>
Date: Thu, 6 Apr 2017 08:33:38 +0300
From: Sagi Grimberg <sagi@...mberg.me>
To: Jason Gunthorpe <jgunthorpe@...idianresearch.com>
Cc: Logan Gunthorpe <logang@...tatee.com>,
Christoph Hellwig <hch@....de>,
"James E.J. Bottomley" <jejb@...ux.vnet.ibm.com>,
"Martin K. Petersen" <martin.petersen@...cle.com>,
Jens Axboe <axboe@...nel.dk>,
Steve Wise <swise@...ngridcomputing.com>,
Stephen Bates <sbates@...thlin.com>,
Max Gurtovoy <maxg@...lanox.com>,
Dan Williams <dan.j.williams@...el.com>,
Keith Busch <keith.busch@...el.com>, linux-pci@...r.kernel.org,
linux-scsi@...r.kernel.org, linux-nvme@...ts.infradead.org,
linux-rdma@...r.kernel.org, linux-nvdimm@...1.01.org,
linux-kernel@...r.kernel.org
Subject: Re: [RFC 6/8] nvmet: Be careful about using iomem accesses when
dealing with p2pmem
>> Note that the nvme completion queues are still on the host memory, so
>> this means we have lost the ordering between data and completions as
>> they go to different pcie targets.
>
> Hmm, in this simple up/down case with a switch, I think it might
> actually be OK.
>
> Transactions might not complete at the NVMe device before the CPU
> processes the RDMA completion, however due to the PCI-E ordering rules
> new TLPs directed to the NVMe will complete after the RMDA TLPs and
> thus observe the new data. (eg order preserving)
>
> It would be very hard to use P2P if fabric ordering is not preserved..
I think it still can race if the p2p device is connected with more than
a single port to the switch.
Say it's connected via 2 legs, the bar is accessed from leg A and the
data from the disk comes via leg B. In this case, the data is heading
towards the p2p device via leg B (might be congested), the completion
goes directly to the RC, and then the host issues a read from the
bar via leg A. I don't understand what can guarantee ordering here.
Stephen told me that this still guarantees ordering, but I honestly
can't understand how, perhaps someone can explain to me in a simple
way that I can understand.
Powered by blists - more mailing lists