Message-ID: <0bc1a170-a643-a9d4-4b3b-2bdd2bb63759@linux.ibm.com>
Date: Mon, 6 Jul 2020 18:12:49 +0200
From: Niklas Schnelle <schnelle@...ux.ibm.com>
To: Shay Drory <shayd@...lanox.com>
Cc: "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
Stefan Raspl <raspl@...ibm.com>
Subject: mlx5 hot unplug regression on z/VM
Hi Mr. Drory, Hi Netdev List,
I'm the PCI Subsystem maintainer for Linux on IBM Z and since v5.8-rc1
we've been seeing a regression with hot unplug of ConnectX-4 VFs
from z/VM guests. In -rc1 this still looked like a simple issue and
I wrote the following mail:
https://lkml.org/lkml/2020/6/12/376
Sadly, since -rc2 (I think), I have not been able to get this working consistently
anymore, although it did work consistently on -rc1 with the change described in that mail.
In his answer, Saeed Mahameed pointed me to your commits as dealing with
similar issues, so I wanted to get some input on how to debug this
further.
The resulting dmesg is attached to this mail. I used the following commands to
test this, on a z/VM guest running a vanilla v5.8-rc4 kernel (debug_defconfig)
installed on Fedora 31:
# vmcp q pcif // query for available PCI devices
# vmcp attach pcif <FID> to \* // where <FID> is one of the ones listed by the above command
# vmcp detach pcif <FID> // This does a hot unplug and is where things start going wrong
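For anyone who wants to repeat the sequence above, it can be wrapped in a small shell helper. This is only a sketch: the function names `run` and `repro` and the `DRY_RUN` variable are illustrative assumptions and not part of the original report. The only real commands it issues are the three `vmcp` invocations listed above.

```shell
# Hypothetical helper: run a command, or just print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical wrapper around the three vmcp commands from the report.
# $1 is the <FID> of the PCI function to attach and then hot unplug.
repro() {
    fid="$1"
    if [ -z "$fid" ]; then
        echo "usage: repro <FID>" >&2
        return 1
    fi
    run vmcp q pcif                     # query for available PCI devices
    run vmcp attach pcif "$fid" to '*'  # attach the function to this guest
    run vmcp detach pcif "$fid"         # hot unplug; the regression shows up here
}
```

Calling `DRY_RUN=1 repro <FID>` only prints the commands, which allows checking the sequence on a machine without z/VM. Calling `repro <FID>` as root on the guest runs them.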
I guess you don't have access to the hardware, but I'll be happy to assist
as best I can. Digging on my own, I sadly don't know enough about the
mlx5_core driver to make more progress.
Best regards,
Niklas Schnelle
View attachment "dmesg_mlx5_detach_zvm.txt" of type "text/plain" (7690 bytes)