Message-ID: <BY3PR18MB4612DEDE441A89EEE0470850AB119@BY3PR18MB4612.namprd18.prod.outlook.com>
Date: Wed, 16 Mar 2022 18:25:55 +0000
From: Manish Chopra <manishc@...vell.com>
To: Paul Menzel <pmenzel@...gen.mpg.de>
CC: "buczek@...gen.mpg.de" <buczek@...gen.mpg.de>,
"kuba@...nel.org" <kuba@...nel.org>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
Ariel Elior <aelior@...vell.com>,
"it+netdev@...gen.mpg.de" <it+netdev@...gen.mpg.de>,
"regressions@...ts.linux.dev" <regressions@...ts.linux.dev>
Subject: RE: [RFC net] bnx2x: fix built-in kernel driver load failure
Hi Paul,
> -----Original Message-----
> From: Paul Menzel <pmenzel@...gen.mpg.de>
> Sent: Wednesday, March 16, 2022 10:40 PM
> To: Manish Chopra <manishc@...vell.com>
> Cc: buczek@...gen.mpg.de; kuba@...nel.org; netdev@...r.kernel.org; Ariel
> Elior <aelior@...vell.com>; it+netdev@...gen.mpg.de;
> regressions@...ts.linux.dev
> Subject: Re: [RFC net] bnx2x: fix built-in kernel driver load failure
>
> Dear Manish,
>
>
> Thank you for the patch.
>
> On 16.03.22 at 12:18, Manish Chopra wrote:
> > commit b7a49f73059f ("bnx2x: Utilize firmware 7.13.21.0") added
> > request_firmware() logic in probe() which caused built-in kernel
> > driver load failure as access to firmware file is not feasible during
> > the probe time.
>
> … for example, when the initrd does not provide the firmware files.
>
> Please also paste one example error message.
>
> > This patch fixes this issue by -
> >
> > 1. Removing request_firmware() logic from the probe() such
> > that open() handle it as it used to handle it earlier
> >
> > 2. Relaxing a bit FW version comparisons against the loaded FW
> > (to allow many close/compatible FWs to run together now)
>
> I’d prefer if you also pasted one error message, and even split this out into a
> separate commit with elaborate problem description.
Both changes need to go in the same commit. Moving request_firmware() back to open() means that at probe time we don't know which FW file
the driver will be working with, so the FW version comparison in probe has to be relaxed at the same time.
>
> Style note: For the commit message, it’d be great if you used 75 characters
> per line.
>
> > Reported-by: Paul Menzel <pmenzel@...gen.mpg.de>
> > Fixes: b7a49f73059f ("bnx2x: Utilize firmware 7.13.21.0")
>
> The regzbot also asks to add the tag below [1].
>
> Link: https://lore.kernel.org/r/46f2d9d9-ae7f-b332-ddeb-b59802be2bab@molgen.mpg.de
>
> > Signed-off-by: Manish Chopra <manishc@...vell.com>
> > Signed-off-by: Ariel Elior <aelior@...vell.com>
> > ---
> >
> > Note that this patch is just for test and get feedback from Paul
> > Menzel about the issue reported by him on built-in driver probe
> > failure due to firmware files not found
>
> Tested-by: Paul Menzel <pmenzel@...gen.mpg.de>
>
> Dell PowerEdge R910/0KYD3D, BIOS 2.10.0 08/29/2013 with patch on top of
> Linux 5.10.103 with 7.13.15.0 on the root partition:
>
> $ lspci -nn -s 45:00.1
> 45:00.1 Ethernet controller [0200]: Broadcom Inc. and subsidiaries NetXtreme II BCM57711 10-Gigabit PCIe [14e4:164f]
> $ ethtool -i net05
> driver: bnx2x
> version: 5.10.103.mx64.429-00016-g597b02
> firmware-version: 7.8.16 bc 6.2.26 phy aa0.406
> expansion-rom-version:
> bus-info: 0000:45:00.1
> supports-statistics: yes
> supports-test: yes
> supports-eeprom-access: yes
> supports-register-dump: yes
> supports-priv-flags: yes
>
> > drivers/net/ethernet/broadcom/bnx2x/bnx2x.h | 2 --
> > .../net/ethernet/broadcom/bnx2x/bnx2x_cmn.c | 28 +++++++++++--------
> > .../net/ethernet/broadcom/bnx2x/bnx2x_main.c | 15 ++--------
> > 3 files changed, 19 insertions(+), 26 deletions(-)
> >
> > diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x.h
> > b/drivers/net/ethernet/broadcom/bnx2x/bnx2x.h
> > index a19dd6797070..2209d99b3404 100644
> > --- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x.h
> > +++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x.h
> > @@ -2533,6 +2533,4 @@ void bnx2x_register_phc(struct bnx2x *bp);
> > * Meant for implicit re-load flows.
> > */
> > int bnx2x_vlan_reconfigure_vid(struct bnx2x *bp);
> > -int bnx2x_init_firmware(struct bnx2x *bp);
> > -void bnx2x_release_firmware(struct bnx2x *bp);
> > #endif /* bnx2x.h */
> > diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
> > b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
> > index 8d36ebbf08e1..5729a5ab059d 100644
> > --- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
> > +++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
> > @@ -2364,24 +2364,30 @@ int bnx2x_compare_fw_ver(struct bnx2x *bp, u32 load_code, bool print_err)
> > /* is another pf loaded on this engine? */
> > if (load_code != FW_MSG_CODE_DRV_LOAD_COMMON_CHIP &&
> > load_code != FW_MSG_CODE_DRV_LOAD_COMMON) {
> > - /* build my FW version dword */
> > - u32 my_fw = (bp->fw_major) + (bp->fw_minor << 8) +
> > - (bp->fw_rev << 16) + (bp->fw_eng << 24);
> > + u8 loaded_fw_major, loaded_fw_minor, loaded_fw_rev, loaded_fw_eng;
> > + u32 loaded_fw;
> >
> > /* read loaded FW from chip */
> > - u32 loaded_fw = REG_RD(bp, XSEM_REG_PRAM);
> > + loaded_fw = REG_RD(bp, XSEM_REG_PRAM);
> >
> > - DP(BNX2X_MSG_SP, "loaded fw %x, my fw %x\n",
> > - loaded_fw, my_fw);
> > + loaded_fw_major = loaded_fw & 0xff;
> > + loaded_fw_minor = (loaded_fw >> 8) & 0xff;
> > + loaded_fw_rev = (loaded_fw >> 16) & 0xff;
> > + loaded_fw_eng = (loaded_fw >> 24) & 0xff;
> > +
> > + DP(BNX2X_MSG_SP, "loaded fw 0x%x major 0x%x minor 0x%x rev 0x%x eng 0x%x\n",
> > + loaded_fw, loaded_fw_major, loaded_fw_minor, loaded_fw_rev, loaded_fw_eng);
>
> Hmm, with `CONFIG_BNX2X=y` and `bnx2x.debug=0x0100000`, bringing up
> net05 (.1) and then net04 (.0), I only see:
>
> [ 3333.883697] bnx2x: [bnx2x_compare_fw_ver:2378(net04)]loaded fw f0d07 major 7 minor d rev f eng 0
>
I think this print is probably not a good indicator (that's why it is disabled by default). It does not show the firmware the driver
is going to work with; it shows whatever was already loaded by some other PF, or some residue from earlier loads.
The driver always works with the firmware it gets from request_firmware(). I suggest you enable the prints below
to see which FW the driver is going to work with. Perhaps I will enable them by default.
BNX2X_DEV_INFO("Loading %s\n", fw_file_name);
rc = request_firmware(&bp->firmware, fw_file_name, &bp->pdev->dev);
if (rc) {
BNX2X_DEV_INFO("Trying to load older fw %s\n", fw_file_name_v15);
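
The fallback order behind those two prints can be sketched in plain userspace C. This is only an illustration, not the driver code: fw_pick(), fw_load_fn and stub_only_v15() are hypothetical names, and the callback merely mimics the request_firmware() convention of returning 0 on success.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical loader callback: 0 on success, nonzero on failure,
 * loosely mirroring the return convention of request_firmware(). */
typedef int (*fw_load_fn)(const char *name);

/* Try each candidate in order (newest first) and return the first
 * name that loads, or NULL if none of them could be loaded. */
static const char *fw_pick(const char *const *names, size_t n, fw_load_fn load)
{
	for (size_t i = 0; i < n; i++) {
		if (load(names[i]) == 0)
			return names[i];
	}
	return NULL;
}

/* Stub standing in for a system where only the 7.13.15.0 file exists. */
static int stub_only_v15(const char *name)
{
	return strstr(name, "7.13.15.0") ? 0 : -1;
}
```

With candidates listed newest first, fw_pick() returns the older file on a system that only ships 7.13.15.0, which is the case the second print is meant to report.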
> For another patch, but the currently loaded firmware, and when loading new
> firmware, the version of it, should also be logged by Linux (by default, and not
> with debug level).
>
> Also copying the 7.13.21.0 firmware on the running system, bringing the
> interfaces down and up again, the newer firmware is not loaded, but it stays
> with the 7.13.15.0:
>
> [ 3533.374046] bnx2x: [bnx2x_compare_fw_ver:2378(net04)]loaded fw f0d07 major 7 minor d rev f eng 0
>
> > /* abort nic load if version mismatch */
> > - if (my_fw != loaded_fw) {
> > + if (loaded_fw_major != BCM_5710_FW_MAJOR_VERSION ||
> > + loaded_fw_minor != BCM_5710_FW_MINOR_VERSION ||
> > + loaded_fw_eng != BCM_5710_FW_ENGINEERING_VERSION ||
> > + loaded_fw_rev < BCM_5710_FW_REVISION_VERSION_V15) {
> > if (print_err)
> > - BNX2X_ERR("bnx2x with FW %x was already loaded which mismatches my %x FW. Aborting\n",
> > - loaded_fw, my_fw);
> > + BNX2X_ERR("loaded FW incompatible. Aborting\n");
> > else
> > - BNX2X_DEV_INFO("bnx2x with FW %x was already loaded which mismatches my %x FW, possibly due to MF UNDI\n",
> > - loaded_fw, my_fw);
> > + BNX2X_DEV_INFO("loaded FW incompatible, possibly due to MF UNDI\n");
> > +
> > return -EBUSY;
> > }
> > }
> > diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
> > b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
> > index eedb48d945ed..c19b072f3a23 100644
> > --- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
> > +++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
> > @@ -12319,15 +12319,6 @@ static int bnx2x_init_bp(struct bnx2x *bp)
> >
> > bnx2x_read_fwinfo(bp);
> >
> > - if (IS_PF(bp)) {
> > - rc = bnx2x_init_firmware(bp);
> > -
> > - if (rc) {
> > - bnx2x_free_mem_bp(bp);
> > - return rc;
> > - }
> > - }
> > -
> > func = BP_FUNC(bp);
> >
> > /* need to reset chip if undi was active */
> > @@ -12340,7 +12331,6 @@ static int bnx2x_init_bp(struct bnx2x *bp)
> >
> > rc = bnx2x_prev_unload(bp);
> > if (rc) {
> > - bnx2x_release_firmware(bp);
> > bnx2x_free_mem_bp(bp);
> > return rc;
> > }
> > @@ -13409,7 +13399,7 @@ do { \
> > (u8 *)bp->arr, len); \
> > } while (0)
> >
> > -int bnx2x_init_firmware(struct bnx2x *bp)
> > +static int bnx2x_init_firmware(struct bnx2x *bp)
> > {
> > const char *fw_file_name, *fw_file_name_v15;
> > struct bnx2x_fw_file_hdr *fw_hdr;
> > @@ -13509,7 +13499,7 @@ int bnx2x_init_firmware(struct bnx2x *bp)
> > return rc;
> > }
> >
> > -void bnx2x_release_firmware(struct bnx2x *bp)
> > +static void bnx2x_release_firmware(struct bnx2x *bp)
> > {
> > kfree(bp->init_ops_offsets);
> > kfree(bp->init_ops);
> > @@ -14026,7 +14016,6 @@ static int bnx2x_init_one(struct pci_dev *pdev,
> > return 0;
> >
> > init_one_freemem:
> > - bnx2x_release_firmware(bp);
> > bnx2x_free_mem_bp(bp);
> >
> > init_one_exit:
> > --
> > 2.35.1.273.ge6ebfd0
>
> So why was the earlier firmware version comparison needed in commit
> b7a49f73059f ("bnx2x: Utilize firmware 7.13.21.0")?
>
The FW version comparison was always there (even before this commit). Earlier, it compared a fixed FW version (7.13.15.0) against the already loaded FW.
Now that the driver can work with different FWs, the check has to accept a few compatible FW versions, because at probe time we don't know which FW
the driver is going to get through request_firmware(). The comparison against the already loaded FW (loaded perhaps by another/earlier PF on the device)
decides whether a given PF may come up with the FW it will request through request_firmware() later in open().
[Just consider the scenario of multiple PFs running in different environments with different firmwares.]
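
To make the relaxed check concrete, here is a minimal userspace sketch of how the loaded-FW dword is unpacked and compared. It is not the driver code: fw_unpack(), fw_compatible() and the FW_* constants are hypothetical stand-ins, and their values are assumptions inferred from the 7.13.15.0 and 7.13.21.0 versions discussed in this thread.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed values, inferred from the versions named in this thread. */
#define FW_MAJOR    7
#define FW_MINOR    13
#define FW_ENG      0
#define FW_REV_MIN  15	/* oldest revision still accepted */

struct fw_ver {
	uint8_t major, minor, rev, eng;
};

/* Unpack the dword layout used in the patch:
 * major in bits 0-7, minor in 8-15, rev in 16-23, eng in 24-31. */
static struct fw_ver fw_unpack(uint32_t dword)
{
	struct fw_ver v = {
		.major = (uint8_t)(dword & 0xff),
		.minor = (uint8_t)((dword >> 8) & 0xff),
		.rev   = (uint8_t)((dword >> 16) & 0xff),
		.eng   = (uint8_t)((dword >> 24) & 0xff),
	};
	return v;
}

/* Relaxed check: major, minor and eng must match exactly,
 * while any revision at or above the minimum is accepted. */
static bool fw_compatible(uint32_t dword)
{
	struct fw_ver v = fw_unpack(dword);

	return v.major == FW_MAJOR && v.minor == FW_MINOR &&
	       v.eng == FW_ENG && v.rev >= FW_REV_MIN;
}
```

For the value from your log, 0xf0d07 unpacks to major 7, minor 13, rev 15, eng 0, i.e. 7.13.15.0, so under these assumptions a PF would be allowed to come up next to it whether it later requests 7.13.15.0 or 7.13.21.0.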
> I let the maintainers decide how to best go forward.
>
>
> Kind regards,
>
> Paul
>
>
> [1]: https://linux-regtracking.leemhuis.info/regzbot/mainline/
> (click on the array to expand the information)