Date: Mon, 25 Oct 2021 09:21:58 -0700
From: Keith Busch <kbusch@...nel.org>
To: Bjorn Helgaas <helgaas@...nel.org>
Cc: Li Chen <lchen@...arella.com>,
	"linux-pci@...r.kernel.org" <linux-pci@...r.kernel.org>,
	Lorenzo Pieralisi <lorenzo.pieralisi@....com>,
	Rob Herring <robh@...nel.org>,
	"kw@...ux.com" <kw@...ux.com>,
	Bjorn Helgaas <bhelgaas@...gle.com>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	Tom Joseph <tjoseph@...ence.com>,
	Jens Axboe <axboe@...com>,
	Christoph Hellwig <hch@....de>,
	Sagi Grimberg <sagi@...mberg.me>,
	linux-nvme@...ts.infradead.org
Subject: Re: nvme may get timeout from dd when using different non-prefetch
 mmio outbound/ranges

On Mon, Oct 25, 2021 at 10:47:39AM -0500, Bjorn Helgaas wrote:
> [+cc Tom (Cadence maintainer), NVMe folks]
>
> On Fri, Oct 22, 2021 at 10:08:20AM +0000, Li Chen wrote:
> > 	pciec: pcie-controller@...0000000 {
> > 		compatible = "cdns,cdns-pcie-host";
> > 		device_type = "pci";
> > 		#address-cells = <3>;
> > 		#size-cells = <2>;
> > 		bus-range = <0 5>;
> > 		linux,pci-domain = <0>;
> > 		cdns,no-bar-match-nbits = <38>;
> > 		vendor-id = <0x17cd>;
> > 		device-id = <0x0100>;
> > 		reg-names = "reg", "cfg";
> > 		reg = <0x20 0x40000000 0x0 0x10000000>,
> > 		      <0x20 0x00000000 0x0 0x00001000>; /* RC only */
> >
> > 		/*
> > 		 * type: 0x00000000 cfg space
> > 		 * type: 0x01000000 IO
> > 		 * type: 0x02000000 32bit mem space No prefetch
> > 		 * type: 0x03000000 64bit mem space No prefetch
> > 		 * type: 0x43000000 64bit mem space prefetch
> > 		 * The first 16MB from BUS_DEV_FUNC=0:0:0 for cfg space
> > 		 * <0x00000000 0x00 0x00000000 0x20 0x00000000 0x00 0x01000000>, CFG_SPACE
> > 		 */
> > 		ranges = <0x01000000 0x00 0x00000000 0x20 0x00100000 0x00 0x00100000>,
> > 			 <0x02000000 0x00 0x08000000 0x20 0x08000000 0x00 0x08000000>;
> >
> > 		#interrupt-cells = <0x1>;
> > 		interrupt-map-mask = <0x00 0x0 0x0 0x7>;
> > 		interrupt-map = <0x0 0x0 0x0 0x1 &gic 0 229 0x4>,
> > 				<0x0 0x0 0x0 0x2 &gic 0 230 0x4>,
> > 				<0x0 0x0 0x0 0x3 &gic 0 231 0x4>,
> > 				<0x0 0x0 0x0 0x4 &gic 0 232 0x4>;
> > 		phys = <&pcie_phy>;
> > 		phy-names = "pcie-phy";
> > 		status = "ok";
> > 	};
> >
> > After some digging, I find that if I change the controller's ranges
> > property from
> >
> >   <0x02000000 0x00 0x08000000 0x20 0x08000000 0x00 0x08000000> into
> >   <0x02000000 0x00 0x00400000 0x20 0x00400000 0x00 0x08000000>,
> >
> > then dd will succeed without timeout. IIUC, the range here is only
> > for non-prefetch 32-bit MMIO, but dd will use DMA (maybe the CPU
> > will send commands to the nvme controller via MMIO?).

Generally speaking, the nvme driver notifies the controller of new
commands via an MMIO write to a specific nvme register (the submission
queue doorbell). The nvme controller then fetches those commands from
host memory with a DMA.

One exception to that description is an nvme controller supporting CMB
with SQEs, but those are not very common. If you had such a controller,
the driver would use MMIO to write commands directly into controller
memory instead of letting the controller DMA them from host memory. Do
you know if you have such a controller?

The data transfers associated with your 'dd' command will always use
DMA.

> I don't know how to interpret "ranges". Can you supply the dmesg and
> "lspci -vvs 0000:05:00.0" output both ways, e.g.,
>
>   pci_bus 0000:00: root bus resource [mem 0x7f800000-0xefffffff window]
>   pci_bus 0000:00: root bus resource [mem 0xfd000000-0xfe7fffff window]
>   pci 0000:05:00.0: [vvvv:dddd] type 00 class 0x...
>   pci 0000:05:00.0: reg 0x10: [mem 0x.....000-0x.....fff ...]
>
> > Question:
> > 1. Why can dd cause an nvme timeout? Are there more ways to debug this?

A timeout means the nvme controller didn't provide a response to a
posted command within the driver's latency tolerance.

> > 2. How can this mmio range affect nvme timeout?

Let's see how those ranges affect what the kernel sees in the PCI
topology, as Bjorn suggested.