Message-ID: <569825cd-c98f-4399-ad25-d4e62fba4255@kernel.dk>
Date: Thu, 13 Nov 2025 13:43:09 -0700
From: Jens Axboe <axboe@...nel.dk>
To: Leon Romanovsky <leon@...nel.org>
Cc: Keith Busch <kbusch@...nel.org>, Christoph Hellwig <hch@....de>,
Sagi Grimberg <sagi@...mberg.me>, linux-block@...r.kernel.org,
linux-kernel@...r.kernel.org, linux-nvme@...ts.infradead.org
Subject: Re: [PATCH v4 0/2] block: Enable proper MMIO memory handling for P2P DMA

On 11/13/25 12:50 PM, Leon Romanovsky wrote:
> On Thu, Nov 13, 2025 at 10:45:53AM -0700, Jens Axboe wrote:
>> On 11/13/25 10:12 AM, Jens Axboe wrote:
>>> On 11/13/25 9:39 AM, Jens Axboe wrote:
>>>>
>>>> On Wed, 12 Nov 2025 21:48:03 +0200, Leon Romanovsky wrote:
>>>>> Changelog:
>>>>> v4:
>>>>> * Changed double "if" to be "else if".
>>>>> * Added missed PCI_P2PDMA_MAP_NONE case.
>>>>> v3: https://patch.msgid.link/20251027-block-with-mmio-v3-0-ac3370e1f7b7@nvidia.com
>>>>> * Encoded p2p map type in IOD flags instead of DMA attributes.
>>>>> * Removed REQ_P2PDMA flag from block layer.
>>>>> * Simplified map_phys conversion patch.
>>>>> v2: https://lore.kernel.org/all/20251020-block-with-mmio-v2-0-147e9f93d8d4@nvidia.com/
>>>>> * Added Christoph's Reviewed-by tag for first patch.
>>>>> * Squashed patches
>>>>> * Stored DMA MMIO attribute in NVMe IOD flags variable instead of block layer.
>>>>> v1: https://patch.msgid.link/20251017-block-with-mmio-v1-0-3f486904db5e@nvidia.com
>>>>> * Reordered patches.
>>>>> * Dropped patch which tried to unify unmap flow.
>>>>> * Set MMIO flag separately for data and integrity payloads.
>>>>> v0: https://lore.kernel.org/all/cover.1760369219.git.leon@kernel.org/
>>>>>
>>>>> [...]
>>>>
>>>> Applied, thanks!
>>>>
>>>> [1/2] nvme-pci: migrate to dma_map_phys instead of map_page
>>>> commit: f10000db2f7cf29d8c2ade69266bed7b51c772cb
>>>> [2/2] block-dma: properly take MMIO path
>>>> commit: 8df2745e8b23fdbe34c5b0a24607f5aaf10ed7eb
>>>
>>> And now dropped again - this boots on neither my big test box
>>> with 33 nvme drives nor my local test vm. Two different archs,
>>> and very different setups. Which begs the question, how on earth was
>>> this tested, if it doesn't boot on anything I have here?!
>>
>> I took a look, and what happens here is that iter.p2pdma.map is 0 as it
>> never got set to anything. That is the same as PCI_P2PDMA_MAP_UNKNOWN,
>> and hence we just end up in a BLK_STS_RESOURCE. First of all, returning
>> BLK_STS_RESOURCE for that seems... highly suspicious. That should surely
>> be a fatal error. And secondly, this just further backs up that there's
>> ZERO testing done on this patchset at all. WTF?
>>
>> FWIW, the below makes it boot just fine, as expected, as a default zero
>> filled iter then matches the UNKNOWN case.
>>
>>
>> diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
>> index e5ca8301bb8b..4cce69226773 100644
>> --- a/drivers/nvme/host/pci.c
>> +++ b/drivers/nvme/host/pci.c
>> @@ -1087,6 +1087,7 @@ static blk_status_t nvme_map_data(struct request *req)
>>  	case PCI_P2PDMA_MAP_THRU_HOST_BRIDGE:
>>  		iod->flags |= IOD_DATA_MMIO;
>>  		break;
>> +	case PCI_P2PDMA_MAP_UNKNOWN:
>>  	case PCI_P2PDMA_MAP_NONE:
>>  		break;
>>  	default:
>> @@ -1122,6 +1123,7 @@ static blk_status_t nvme_pci_setup_meta_iter(struct request *req)
>>  	case PCI_P2PDMA_MAP_THRU_HOST_BRIDGE:
>>  		iod->flags |= IOD_META_MMIO;
>>  		break;
>> +	case PCI_P2PDMA_MAP_UNKNOWN:
>>  	case PCI_P2PDMA_MAP_NONE:
>>  		break;
>>  	default:
>
> Sorry for the trouble.
>
> Can you please squash this fixup instead?
> diff --git a/block/blk-mq-dma.c b/block/blk-mq-dma.c
> index 98554929507a..807048644f2e 100644
> --- a/block/blk-mq-dma.c
> +++ b/block/blk-mq-dma.c
> @@ -172,6 +172,7 @@ static bool blk_dma_map_iter_start(struct request *req, struct device *dma_dev,
>
>  	memset(&iter->p2pdma, 0, sizeof(iter->p2pdma));
>  	iter->status = BLK_STS_OK;
> +	iter->p2pdma.map = PCI_P2PDMA_MAP_NONE;
>
>  	/*
>  	 * Grab the first segment ASAP because we'll need it to check for P2P

Please send out a v5, and base it on the current tree. I had to
hand-apply one hunk of v4 because it didn't apply directly: another
patch from 9 days ago had modified that code.

I do agree that this should go elsewhere, but I don't think there's much
of an issue doing it on the block side for now. That can then get killed
when PCI does it.

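For anyone following along, here is a rough userspace sketch of the
failure mode discussed above. It is not the kernel code: every mock_*
and MOCK_* name is a made-up stand-in. The only facts taken from the
thread are that the iter is zero-filled, that zero matches the UNKNOWN
map type, and that UNKNOWN fell through to the default case that
returns BLK_STS_RESOURCE. The value given to NONE and the other
enumerators here is an assumption.

```c
#include <string.h>

/* Hypothetical stand-ins for the kernel types; only UNKNOWN == 0 comes from the thread. */
enum mock_p2pdma_map_type {
	MOCK_MAP_UNKNOWN = 0,		/* what a zero-filled iter contains */
	MOCK_MAP_NONE,			/* assumed value: regular, non-P2P memory */
	MOCK_MAP_THRU_HOST_BRIDGE,	/* assumed value: P2P memory, MMIO path */
};

enum mock_status { MOCK_STS_OK = 0, MOCK_STS_RESOURCE };

#define MOCK_IOD_DATA_MMIO 0x1u

struct mock_iter {
	enum mock_p2pdma_map_type map;
};

/* Mimics iterator setup: zero the state, optionally apply the proposed fixup. */
static void mock_iter_start(struct mock_iter *iter, int with_fixup)
{
	memset(iter, 0, sizeof(*iter));		/* map == MOCK_MAP_UNKNOWN */
	if (with_fixup)
		iter->map = MOCK_MAP_NONE;	/* explicit non-P2P default */
}

/* Mimics the v4 switch: UNKNOWN has no case of its own, so it hits default. */
static enum mock_status mock_map_data(const struct mock_iter *iter,
				      unsigned int *flags)
{
	switch (iter->map) {
	case MOCK_MAP_THRU_HOST_BRIDGE:
		*flags |= MOCK_IOD_DATA_MMIO;
		break;
	case MOCK_MAP_NONE:
		break;
	default:
		return MOCK_STS_RESOURCE;
	}
	return MOCK_STS_OK;
}
```

Without the fixup, every ordinary request takes the default branch and
fails. With the map set to NONE at iterator start, ordinary requests
succeed and only the host-bridge case sets the MMIO flag.
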
--
Jens Axboe