[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <7b4904f5-ceb1-9409-dd79-e96abfe35382@linux.vnet.ibm.com>
Date: Mon, 7 Aug 2023 16:08:50 -0500
From: Thinh Tran <thinhtr@...ux.vnet.ibm.com>
To: Jakub Kicinski <kuba@...nel.org>
Cc: aelior@...vell.com, davem@...emloft.net, edumazet@...gle.com,
manishc@...vell.com, netdev@...r.kernel.org, pabeni@...hat.com,
skalluru@...vell.com, drc@...ux.vnet.ibm.com, abdhalee@...ibm.com,
simon.horman@...igine.com
Subject: Re: [PATCH v4] bnx2x: Fix error recovering in switch configuration
Thank you for thorough review.
On 7/31/2023 7:47 PM, Jakub Kicinski wrote:
>
> Could you split the change into two patches - one factoring out the
> code into bnx2x_stop_nic() and the other adding the nic_stopped
> variable? First one should be pure code refactoring with no functional
> changes. That'd make the reviewing process easier.
Sorry, I misunderstood comments in the reviewing of v3 asking to factor
the code.
Should I keep the changes I made, or should I summit a new patch with
factored code?
>
>> + /* Disable HW interrupts, delete NAPIs, Release IRQs */
>> + bnx2x_stop_nic(bp);
>>
>> /* Report UNLOAD_DONE to MCP */
>> bnx2x_send_unload_done(bp, false);
>> @@ -4987,6 +4983,12 @@ void bnx2x_tx_timeout(struct net_device *dev, unsigned int txqueue)
>> {
>> struct bnx2x *bp = netdev_priv(dev);
>>
>> + /* Immediately indicate link as down */
>> + bp->link_vars.link_up = 0;
>> + bp->force_link_down = true;
>> + netif_carrier_off(dev);
>> + BNX2X_ERR("Indicating link is down due to Tx-timeout\n");
>
> Is this code move to make the shutdown more immediate?
> That could also be a separate patch.
Moving the code to here has the effect of disabling the link, preventing
the excessive output of debug information from bnx2x_panic(). While
bnx2x_panic() offers valuable debugging details, the excessive dumping
overwhelms the dmesg buffer, they become unreadable.
I will exclude this part from the patch. A separate patch will be
created to improve providing debug information
>
>> /* We want the information of the dump logged,
>> * but calling bnx2x_panic() would kill all chances of recovery.
>> */
>> diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.h b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.h
>> index d8b1824c334d..f5ecbe8d604a 100644
>> --- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.h
>> +++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.h
>> @@ -1015,6 +1015,9 @@ static inline void bnx2x_free_rx_sge_range(struct bnx2x *bp,
>> {
>> int i;
>>
>> + if (!fp->page_pool.page)
>> + return;
>> +
>> if (fp->mode == TPA_MODE_DISABLED)
>> return;
>>
>> @@ -1399,5 +1402,20 @@ void bnx2x_set_os_driver_state(struct bnx2x *bp, u32 state);
>> */
>> int bnx2x_nvram_read(struct bnx2x *bp, u32 offset, u8 *ret_buf,
>> int buf_size);
>> +static inline void bnx2x_stop_nic(struct bnx2x *bp)
>
> can't it live in bnx2x_cmn.c ?
It's in common header file for also being used by bnx2x_vfpf.c.
Why make it a static inline?
>
Just make it inlined where it is called.
>> +{
>> + if (!bp->nic_stopped) {
>> + /* Disable HW interrupts, NAPI */
>> + bnx2x_netif_stop(bp, 1);
>> + /* Delete all NAPI objects */
>> + bnx2x_del_all_napi(bp);
>> + if (CNIC_LOADED(bp))
>> + bnx2x_del_all_napi_cnic(bp);
>> + /* Release IRQs */
>> + bnx2x_free_irq(bp);
>> + bp->nic_stopped = true;
>> + }
>> +}
>> +
>>
>
> nit: double new line
>
>> --- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.c
>> +++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.c
>> @@ -529,13 +529,8 @@ void bnx2x_vfpf_close_vf(struct bnx2x *bp)
>> bnx2x_vfpf_finalize(bp, &req->first_tlv);
>>
>> free_irq:
>> - /* Disable HW interrupts, NAPI */
>> - bnx2x_netif_stop(bp, 0);
>
> This used to say
>
> bnx2x_netif_stop(bp, 0);
>
> but bnx2x_stop_nic() will do:
>
> bnx2x_netif_stop(bp, 1);
>
> is it okay to shut down the HW here ? (whatever that entails)
>
My mistake, I didn't notice this before. The second parameter is for
deciding if the hardware should stop sending interrupts or not. I'm not
familiar with the virtual function code path, but I'll correct it to
make sure things are consistent.
>> - /* Delete all NAPI objects */
>> - bnx2x_del_all_napi(bp);
>> -
>> - /* Release IRQs */
>> - bnx2x_free_irq(bp);
>> + /* Disable HW interrupts, delete NAPIs, Release IRQs */
>> + bnx2x_stop_nic(bp);
Thanks,
Thinh Tran
Powered by blists - more mailing lists