[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <a3d8cd82-91ac-0d46-d7d2-c444062a199c@linux.ibm.com>
Date: Wed, 27 Sep 2023 16:25:28 -0400
From: Matthew Rosato <mjrosato@...ux.ibm.com>
To: Jason Gunthorpe <jgg@...pe.ca>, Niklas Schnelle <schnelle@...ux.ibm.com>
Cc: Joerg Roedel <joro@...tes.org>, Will Deacon <will@...nel.org>,
Wenjia Zhang <wenjia@...ux.ibm.com>,
Robin Murphy <robin.murphy@....com>, Gerd Bayer <gbayer@...ux.ibm.com>,
Julian Ruess <julianr@...ux.ibm.com>,
Pierre Morel <pmorel@...ux.ibm.com>,
Alexandra Winter
<wintera@...ux.ibm.com>,
Heiko Carstens <hca@...ux.ibm.com>, Vasily Gorbik <gor@...ux.ibm.com>,
Alexander Gordeev
<agordeev@...ux.ibm.com>,
Christian Borntraeger <borntraeger@...ux.ibm.com>,
Sven Schnelle <svens@...ux.ibm.com>,
Suravee Suthikulpanit <suravee.suthikulpanit@....com>,
Hector Martin <marcan@...can.st>, Sven Peter <sven@...npeter.dev>,
Alyssa Rosenzweig <alyssa@...enzweig.io>,
David Woodhouse <dwmw2@...radead.org>,
Lu Baolu <baolu.lu@...ux.intel.com>, Andy Gross <agross@...nel.org>,
Bjorn Andersson <andersson@...nel.org>,
Konrad Dybcio <konrad.dybcio@...aro.org>,
Yong Wu <yong.wu@...iatek.com>,
Matthias Brugger <matthias.bgg@...il.com>,
AngeloGioacchino Del Regno <angelogioacchino.delregno@...labora.com>,
Gerald Schaefer <gerald.schaefer@...ux.ibm.com>,
Orson Zhai <orsonzhai@...il.com>,
Baolin Wang
<baolin.wang@...ux.alibaba.com>,
Chunyan Zhang <zhang.lyra@...il.com>, Chen-Yu Tsai <wens@...e.org>,
Jernej Skrabec <jernej.skrabec@...il.com>,
Samuel Holland <samuel@...lland.org>,
Thierry Reding <thierry.reding@...il.com>,
Krishna Reddy
<vdumpa@...dia.com>,
Jonathan Hunter <jonathanh@...dia.com>,
Jonathan Corbet <corbet@....net>, linux-s390@...r.kernel.org,
netdev@...r.kernel.org, linux-kernel@...r.kernel.org,
iommu@...ts.linux.dev, asahi@...ts.linux.dev,
linux-arm-kernel@...ts.infradead.org, linux-arm-msm@...r.kernel.org,
linux-mediatek@...ts.infradead.org, linux-sunxi@...ts.linux.dev,
linux-tegra@...r.kernel.org, linux-doc@...r.kernel.org
Subject: Re: [PATCH v12 0/6] iommu/dma: s390 DMA API conversion and optimized
IOTLB flushing
On 9/27/23 11:40 AM, Jason Gunthorpe wrote:
> On Wed, Sep 27, 2023 at 05:24:20PM +0200, Niklas Schnelle wrote:
>
>> Ok, another update. On trying it out again this problem actually also
>> occurs when applying this v12 on top of v6.6-rc3 too. Also I guess
>> unlike my prior thinking it probably doesn't occur with
>> iommu.forcedac=1 since that still allows IOVAs below 4 GiB and we might
>> be the only ones who don't support those. From my point of view this
>> sounds like a mlx5_core issue they really should call
>> dma_set_mask_and_coherent() before their first call to
>> dma_alloc_coherent() not after. So I guess I'll send a v13 of this
>> series rebased on iommu/core and with an additional mlx5 patch and then
>> let's hope we can get that merged in a way that doesn't leave us with
>> broken ConnectX VFs for too long.
>
> Yes, OK. It definitely sounds wrong that mlx5 is doing dma allocations before
> setting it's dma_set_mask_and_coherent(). Please link to this thread
> and we can get Leon or Saeed to ack it for Joerg.
>
Hi Niklas,
I bisected the start of this issue to the following commit (only noticeable on s390 when you apply this subject series on top):
06cd555f73caec515a14d42ef052221fa2587ff9 ("net/mlx5: split mlx5_cmd_init() to probe and reload routines")
Which went in during the merge window. Please include with your fix and/or report to the mlx5 maintainers. Looks like the changes in this patch match what you and Jason describe; it splits up mlx5_cmd_init() and moves part of the call earlier. The net result is we first call mlx5_mdev_init>mlx5_cmd_init->alloc_cmd_page->dma_alloc_coherent and then sometime later call mlx5_pci_init->set_dma_caps->dma_set_mask_and_coherent.
Prior to this patch, we would not drive mlx5_cmd_init (and thus that first dma_alloc_coherent) until mlx5_init_one which happens _after_ mlx5_pci_init->set_dma_caps->dma_set_mask_and_coherent.
Thanks,
Matt
Powered by blists - more mailing lists