Date:   Mon, 14 Mar 2022 16:07:08 +0100
From:   Paul Menzel <pmenzel@...gen.mpg.de>
To:     Manish Chopra <manishc@...vell.com>
Cc:     Donald Buczek <buczek@...gen.mpg.de>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Jakub Kicinski <kuba@...nel.org>, netdev@...r.kernel.org,
        Ariel Elior <aelior@...vell.com>,
        Alok Prasad <palok@...vell.com>,
        Prabhakar Kushwaha <pkushwaha@...vell.com>,
        "David S. Miller" <davem@...emloft.net>,
        Greg KH <gregkh@...uxfoundation.org>, stable@...r.kernel.org,
        it+netdev@...gen.mpg.de, regressions@...ts.linux.dev
Subject: Re: [EXT] Re: [PATCH v2 net-next 1/2] bnx2x: Utilize firmware
 7.13.21.0

[Use Jakub’s current address]

Dear Manish,


On 14.03.22 at 15:36, Donald Buczek wrote:

> On 3/11/22 1:11 PM, Manish Chopra wrote:
>>> -----Original Message-----
>>> From: Linus Torvalds <torvalds@...ux-foundation.org>
>>> Sent: Thursday, March 10, 2022 3:48 AM

[…]

>>> On Wed, Mar 9, 2022 at 11:46 AM Manish Chopra wrote:
>>>>
>>>> This has not changed anything functionally from driver/device
>>>> perspective, FW is still being loaded only when device is opened.
>>>> bnx2x_init_firmware() [I guess, perhaps the name is misleading] just
>>>> request_firmware() to prepare the metadata to be used when device
>>>> will be opened.
>>>
>>> So how do you explain the report by Paul Menzel that things used to 
>>> work and no longer work now?
>>>
>>
>> The issue which Paul mentioned had to do with "/lib/firmware/bnx2x/* 
>> file not found" when driver probes, which was introduced by the patch 
>> in subject,
>> And the commit e13ad1443684 ("bnx2x: fix driver load from initrd") 
>> fixes this issue. So things should work as it is with the mentioned 
>> fixed commit.
>> The only discussion led by this problem now is why the 
>> request_firmware() was moved early on [from open() to probe()] by the 
>> patch in subject.
>> I explained the intention to do this in my earlier emails and let me 
>> add more details below -
>>
>> Note that we have just moved request_firmware() logic, *not* something 
>> significant which has to do with actual FW loading or device 
>> initialization from the
>> FW file data which could cause significant functional change for this 
>> device/driver, FW load/init part still stays in open flow.
>>
>> Before the patch in subject, the driver worked only with one fixed FW
>> file whose version was statically known at probe() time. That let
>> probe() fail a function early if it was supposed to run with a FW
>> version different from the one already loaded on the device by
>> another PF (different ENV).
>> When we sent this new FW patch (in subject), we got feedback from the
>> community to maintain backward compatibility with older FW versions
>> as well, and we did that in the same v2 patch. Since the driver can
>> now work with either an older or a newer FW file, it needs to cache
>> the run-time FW version (based on whether request_firmware() succeeds
>> for the old or the new FW file). Follow-up probe() flows use that
>> cached version to fail the function probe early if it would mismatch
>> the FW already loaded on the device by other PFs.
> 
> There might be something more wrong with the patch in the subject: The 
> usability of the ports from a single card (with older firmware?) now 
> depends on the order the ports are enabled (first port enabled is 
> working, second port enabled is not working, driver complaining about a 
> firmware mismatch).
> 
> In the following examples, the driver was not built-in to the kernel but 
> loaded from the root filesystem instead, so there is no initramfs 
> related problem here.
> 
> For the record:
> 
> root@ira:~# dmesg|grep bnx2x
> [   18.749871] bnx2x 0000:45:00.0: msix capability found
> [   18.766534] bnx2x 0000:45:00.0: part number 394D4342-31373735-31314131-473331
> [   18.799198] bnx2x 0000:45:00.0: 32.000 Gb/s available PCIe bandwidth (5.0 GT/s PCIe x8 link)
> [   18.807638] bnx2x 0000:45:00.1: msix capability found
> [   18.824509] bnx2x 0000:45:00.1: part number 394D4342-31373735-31314131-473331
> [   18.857171] bnx2x 0000:45:00.1: 32.000 Gb/s available PCIe bandwidth (5.0 GT/s PCIe x8 link)
> [   18.865619] bnx2x 0000:46:00.0: msix capability found
> [   18.882636] bnx2x 0000:46:00.0: part number 394D4342-31373735-31314131-473331
> [   18.915196] bnx2x 0000:46:00.0: 32.000 Gb/s available PCIe bandwidth (5.0 GT/s PCIe x8 link)
> [   18.923636] bnx2x 0000:46:00.1: msix capability found
> [   18.940505] bnx2x 0000:46:00.1: part number 394D4342-31373735-31314131-473331
> [   18.973167] bnx2x 0000:46:00.1: 32.000 Gb/s available PCIe bandwidth (5.0 GT/s PCIe x8 link)
> [   46.480660] bnx2x 0000:45:00.0 net04: renamed from eth4
> [   46.494677] bnx2x 0000:45:00.1 net05: renamed from eth5
> [   46.508544] bnx2x 0000:46:00.0 net06: renamed from eth6
> [   46.524641] bnx2x 0000:46:00.1 net07: renamed from eth7
> root@ira:~# ls /lib/firmware/bnx2x/
> bnx2x-e1-6.0.34.0.fw   bnx2x-e1-7.13.1.0.fw   bnx2x-e1-7.8.2.0.fw     
> bnx2x-e1h-7.12.30.0.fw  bnx2x-e1h-7.8.19.0.fw  bnx2x-e2-7.10.51.0.fw  bnx2x-e2-7.8.17.0.fw
> bnx2x-e1-6.2.5.0.fw    bnx2x-e1-7.13.11.0.fw  bnx2x-e1h-6.0.34.0.fw   
> bnx2x-e1h-7.13.1.0.fw   bnx2x-e1h-7.8.2.0.fw   bnx2x-e2-7.12.30.0.fw  bnx2x-e2-7.8.19.0.fw
> bnx2x-e1-6.2.9.0.fw    bnx2x-e1-7.13.15.0.fw  bnx2x-e1h-6.2.5.0.fw    
> bnx2x-e1h-7.13.11.0.fw  bnx2x-e2-6.0.34.0.fw   bnx2x-e2-7.13.1.0.fw   bnx2x-e2-7.8.2.0.fw
> bnx2x-e1-7.0.20.0.fw   bnx2x-e1-7.13.21.0.fw  bnx2x-e1h-6.2.9.0.fw    
> bnx2x-e1h-7.13.15.0.fw  bnx2x-e2-6.2.5.0.fw    bnx2x-e2-7.13.11.0.fw
> bnx2x-e1-7.0.23.0.fw   bnx2x-e1-7.2.16.0.fw   bnx2x-e1h-7.0.20.0.fw   
> bnx2x-e1h-7.13.21.0.fw  bnx2x-e2-6.2.9.0.fw    bnx2x-e2-7.13.15.0.fw
> bnx2x-e1-7.0.29.0.fw   bnx2x-e1-7.2.51.0.fw   bnx2x-e1h-7.0.23.0.fw   
> bnx2x-e1h-7.2.16.0.fw   bnx2x-e2-7.0.20.0.fw   bnx2x-e2-7.13.21.0.fw
> bnx2x-e1-7.10.51.0.fw  bnx2x-e1-7.8.17.0.fw   bnx2x-e1h-7.0.29.0.fw   
> bnx2x-e1h-7.2.51.0.fw   bnx2x-e2-7.0.23.0.fw   bnx2x-e2-7.2.16.0.fw
> bnx2x-e1-7.12.30.0.fw  bnx2x-e1-7.8.19.0.fw   bnx2x-e1h-7.10.51.0.fw  
> bnx2x-e1h-7.8.17.0.fw   bnx2x-e2-7.0.29.0.fw   bnx2x-e2-7.2.51.0.fw
> 
> Now with v5.10.95, the first kernel of the series which includes 
> fdcfabd0952d ("bnx2x: Utilize firmware 7.13.21.0") and later:
> 
> root@ira:~# dmesg -w &
> [...]
> root@ira:~# ip link set net04 up
> [   88.504536] bnx2x 0000:45:00.0 net04: using MSI-X  IRQs: sp 47  fp[0] 49 ... fp[7] 56
> root@ira:~# ip link set net05 up
> [   90.825820] bnx2x: [bnx2x_compare_fw_ver:2380(net05)]bnx2x with FW 120d07 was already loaded which mismatches my 150d07 FW. Aborting
> RTNETLINK answers: Device or resource busy
> root@ira:~# ip link set net04 down
> root@ira:~# ip link set net05 down
> root@ira:~# ip link set net05 up
> [  114.462448] bnx2x 0000:45:00.1 net05: using MSI-X  IRQs: sp 58  fp[0] 60 ... fp[7] 67
> root@ira:~# ip link set net04 up
> [  117.247763] bnx2x: [bnx2x_compare_fw_ver:2380(net04)]bnx2x with FW 120d07 was already loaded which mismatches my 150d07 FW. Aborting
> RTNETLINK answers: Device or resource busy
> 
> With v5.10.94, both ports work fine:
> 
> root@ira:~# dmesg -w &
> [...]
> root@ira:~# ip link set net04 up
> [  133.126647] bnx2x 0000:45:00.0 net04: using MSI-X  IRQs: sp 47  fp[0] 49 ... fp[7] 56
> root@ira:~# ip link set net05 up
> [  136.215169] bnx2x 0000:45:00.1 net05: using MSI-X  IRQs: sp 58  fp[0] 60 ... fp[7] 67

One additional note: it is totally unclear to us where FW version
120d07 in the error message comes from. It maps to 7.13.18.0, which is
nowhere to be found and is too new to be in the card’s EEPROM, which
should be from 2013 or so.
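
The mapping can be checked with a short sketch. It assumes the driver
prints the version as one packed hexadecimal word with the major number
in the lowest byte, followed by minor, revision and engineering version
in the higher bytes. That layout is an assumption inferred from the two
values in the log above, and the function name is made up for
illustration.

```python
def decode_bnx2x_fw_version(hex_word: str) -> str:
    """Decode a packed FW version as printed in the bnx2x mismatch message.

    Assumed layout (not confirmed here): byte 0 = major, byte 1 = minor,
    byte 2 = revision, byte 3 = engineering version.
    """
    value = int(hex_word, 16)
    major = value & 0xFF
    minor = (value >> 8) & 0xFF
    revision = (value >> 16) & 0xFF
    engineering = (value >> 24) & 0xFF
    return f"{major}.{minor}.{revision}.{engineering}"


# The two values from the error message quoted above.
for word in ("120d07", "150d07"):
    print(word, "->", decode_bnx2x_fw_version(word))
```

Under that assumption, 150d07 decodes to 7.13.21.0, which matches the
firmware file the new driver requests, and 120d07 decodes to 7.13.18.0.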


Kind regards,

Paul


>> So we need to understand why we should not call request_firmware() in
>> probe(), or at least what is really harmful in doing so, when some of
>> the follow-up probe flows need metadata (like the run-time FW version
>> in this case, which we get from the request_firmware() return value).
>> We could avoid this, but we don't want to add ugly/unsuitable file API
>> checks to find out which FW version file is available on the file
>> system when request_firmware() already exists for that purpose.
>>
>> Please let us know. Thanks.
>>
>>> You can't do request_firmware() early. When you actually then push the
>>> firmware to the device is immaterial - but request_firmware() has to 
>>> be done
>>> after the system is up and running.
>>>
>>>                   Linus
