Message-ID: <20241205091705.GW1245331@unreal>
Date: Thu, 5 Dec 2024 11:17:05 +0200
From: Leon Romanovsky <leon@...nel.org>
To: Francesco Poli <invernomuto@...anoici.org>
Cc: Uwe Kleine-König <ukleinek@...ian.org>,
1086520-done@...s.debian.org, Mark Zhang <markzhang@...dia.com>,
linux-rdma@...r.kernel.org, netdev@...r.kernel.org
Subject: Re: Bug#1086520: linux-image-6.11.2-amd64: makes opensm fail to start
On Wed, Dec 04, 2024 at 06:13:56PM +0100, Francesco Poli wrote:
> On Wed, 4 Dec 2024 17:37:05 +0100 Uwe Kleine-König wrote:
>
> > Hello Francesco,
>
> Hello Uwe,
>
> [...]
> > I wonder if you could test a firmware upgrade or the above patch. Would
> > be nice to know if there are still some things to do for us (= Debian
> > kernel team) here.
>
> Yes, I've finally got around to upgrading the firmware.
>
> And today I had a time window, where I could reboot the cluster head
> node.
> After the reboot, the InfiniBand network works correctly:
>
> $ uname -v
> #1 SMP PREEMPT_DYNAMIC Debian 6.11.10-1 (2024-11-23)
> $ ls -altrF /sys/class/infiniband_mad/
> total 0
> lrwxrwxrwx 1 root root 0 Dec 4 10:15 umad0 -> ../../devices/pci0000:80/0000:80:01.1/0000:81:00.0/infiniband_mad/umad0/
> lrwxrwxrwx 1 root root 0 Dec 4 10:15 umad1 -> ../../devices/pci0000:80/0000:80:01.1/0000:81:00.1/infiniband_mad/umad1/
> drwxr-xr-x 2 root root 0 Dec 4 10:17 ./
> drwxr-xr-x 73 root root 0 Dec 4 10:17 ../
> -r--r--r-- 1 root root 4096 Dec 4 10:17 abi_version
> lrwxrwxrwx 1 root root 0 Dec 4 18:08 issm1 -> ../../devices/pci0000:80/0000:80:01.1/0000:81:00.1/infiniband_mad/issm1/
> lrwxrwxrwx 1 root root 0 Dec 4 18:08 issm0 -> ../../devices/pci0000:80/0000:80:01.1/0000:81:00.0/infiniband_mad/issm0/
> # ethtool -i ibp129s0f0
> driver: mlx5_core[ib_ipoib]
> version: 6.11.10-amd64
> firmware-version: 20.43.1014 (MT_0000000224)
> expansion-rom-version:
> bus-info: 0000:81:00.0
> supports-statistics: yes
> supports-test: yes
> supports-eeprom-access: no
> supports-register-dump: no
> supports-priv-flags: yes
> # ethtool -i ibp129s0f1
> driver: mlx5_core[ib_ipoib]
> version: 6.11.10-amd64
> firmware-version: 20.43.1014 (MT_0000000224)
> expansion-rom-version:
> bus-info: 0000:81:00.1
> supports-statistics: yes
> supports-test: yes
> supports-eeprom-access: no
> supports-register-dump: no
> supports-priv-flags: yes
> $ ps aux | grep opens[m]
> root 1150 0.0 0.0 1560776 3636 ? Ssl 10:15 0:00 /usr/sbin/opensm --guid 0x9c63c00300033240 --log_file /var/log/opensm.0x9c63c00300033240.log
>
>
> >
> > If everything is fine for you, I'd like to close this bug.
>
> I am closing the Debian bug report right now.
> Thanks to everyone who has been involved for the great and kind help!

Thanks a lot for your help.

BTW, we have an official fix [1], but it hasn't been sent yet because we
want to finish all of our tests first (E2E, QA, etc.).

[1] https://git.kernel.org/pub/scm/linux/kernel/git/leon/linux-rdma.git/commit/?h=rdma-next&id=09754c1e5d0d204747928290cc8c6f4371fd4c6a

>
> >
> > Best regards
>
> Have a nice evening. :-)
>
> --
> http://www.inventati.org/frx/
> There's not a second to spare! To the laboratory!
> ..................................................... Francesco Poli .
> GnuPG key fpr == CA01 1147 9CD2 EFDF FB82 3925 3E1C 27E1 1F69 BFFE