lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <7b4904f5-ceb1-9409-dd79-e96abfe35382@linux.vnet.ibm.com>
Date: Mon, 7 Aug 2023 16:08:50 -0500
From: Thinh Tran <thinhtr@...ux.vnet.ibm.com>
To: Jakub Kicinski <kuba@...nel.org>
Cc: aelior@...vell.com, davem@...emloft.net, edumazet@...gle.com,
        manishc@...vell.com, netdev@...r.kernel.org, pabeni@...hat.com,
        skalluru@...vell.com, drc@...ux.vnet.ibm.com, abdhalee@...ibm.com,
        simon.horman@...igine.com
Subject: Re: [PATCH v4] bnx2x: Fix error recovering in switch configuration

Thank you for thorough review.

On 7/31/2023 7:47 PM, Jakub Kicinski wrote:

> 
> Could you split the change into two patches - one factoring out the
> code into bnx2x_stop_nic() and the other adding the nic_stopped
> variable? First one should be pure code refactoring with no functional
> changes. That'd make the reviewing process easier.

Sorry, I misunderstood comments in the reviewing of v3 asking to factor 
the code.
Should I keep the changes I made, or should I summit a new patch with 
factored code?

> 
>> +		/* Disable HW interrupts, delete NAPIs, Release IRQs */
>> +		bnx2x_stop_nic(bp);
>>   
>>   		/* Report UNLOAD_DONE to MCP */
>>   		bnx2x_send_unload_done(bp, false);
>> @@ -4987,6 +4983,12 @@ void bnx2x_tx_timeout(struct net_device *dev, unsigned int txqueue)
>>   {
>>   	struct bnx2x *bp = netdev_priv(dev);
>>   
>> +	/* Immediately indicate link as down */
>> +	bp->link_vars.link_up = 0;
>> +	bp->force_link_down = true;
>> +	netif_carrier_off(dev);
>> +	BNX2X_ERR("Indicating link is down due to Tx-timeout\n");
> 
> Is this code move to make the shutdown more immediate?
> That could also be a separate patch.

Moving the code to here has the effect of disabling the link, preventing 
the excessive output of debug information from bnx2x_panic(). While 
bnx2x_panic() offers valuable debugging details, the excessive dumping 
overwhelms the dmesg buffer, they become unreadable.
I will exclude this part from the patch. A separate patch will be 
created to improve providing debug information

> 
>>   	/* We want the information of the dump logged,
>>   	 * but calling bnx2x_panic() would kill all chances of recovery.
>>   	 */
>> diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.h b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.h
>> index d8b1824c334d..f5ecbe8d604a 100644
>> --- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.h
>> +++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.h
>> @@ -1015,6 +1015,9 @@ static inline void bnx2x_free_rx_sge_range(struct bnx2x *bp,
>>   {
>>   	int i;
>>   
>> +	if (!fp->page_pool.page)
>> +		return;
>> +
>>   	if (fp->mode == TPA_MODE_DISABLED)
>>   		return;
>>   
>> @@ -1399,5 +1402,20 @@ void bnx2x_set_os_driver_state(struct bnx2x *bp, u32 state);
>>    */
>>   int bnx2x_nvram_read(struct bnx2x *bp, u32 offset, u8 *ret_buf,
>>   		     int buf_size);
>> +static inline void bnx2x_stop_nic(struct bnx2x *bp)
> 
> can't it live in bnx2x_cmn.c ?
It's in common header file for also being used by bnx2x_vfpf.c.

  Why make it a static inline?
> 
Just make it inlined where it is called.
>> +{
>> +	if (!bp->nic_stopped) {
>> +		/* Disable HW interrupts, NAPI */
>> +		bnx2x_netif_stop(bp, 1);
>> +		/* Delete all NAPI objects */
>> +		bnx2x_del_all_napi(bp);
>> +		if (CNIC_LOADED(bp))
>> +			bnx2x_del_all_napi_cnic(bp);
>> +		/* Release IRQs */
>> +		bnx2x_free_irq(bp);
>> +		bp->nic_stopped = true;
>> +	}
>> +}
>> +
>>   
> 
> nit: double new line
> 
>> --- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.c
>> +++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.c
>> @@ -529,13 +529,8 @@ void bnx2x_vfpf_close_vf(struct bnx2x *bp)
>>   	bnx2x_vfpf_finalize(bp, &req->first_tlv);
>>   
>>   free_irq:
>> -	/* Disable HW interrupts, NAPI */
>> -	bnx2x_netif_stop(bp, 0);
> 
> This used to say
> 
> 	bnx2x_netif_stop(bp, 0);
> 
> but bnx2x_stop_nic() will do:
> 
> 	bnx2x_netif_stop(bp, 1);
> 
> is it okay to shut down the HW here ? (whatever that entails)
> 
My mistake, I didn't notice this before. The second parameter is for 
deciding if the hardware should stop sending interrupts or not. I'm not 
familiar with the virtual function code path, but I'll correct it to 
make sure things are consistent.

>> -	/* Delete all NAPI objects */
>> -	bnx2x_del_all_napi(bp);
>> -
>> -	/* Release IRQs */
>> -	bnx2x_free_irq(bp);
>> +	/* Disable HW interrupts, delete NAPIs, Release IRQs */
>> +	bnx2x_stop_nic(bp);

Thanks,
Thinh Tran

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ