netdev - Re: [PATCH 1/1] be2net: Detach interface for avoiding a system crash

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <a6b9bbe1-3f3c-b798-56ab-8193d481e364@suse.com>
Date:   Wed, 24 Apr 2019 13:53:58 +0800
From:   Firo <fyang@...e.com>
To:     saeedm@...lanox.com, sathya.perla@...adcom.com,
        sriharsha.basavapatna@...adcom.com, somnath.kotur@...adcom.com,
        ajit.khaparde@...adcom.com, davem@...emloft.net,
        Firo Yang <firogm@...il.com>
Cc:     netdev@...r.kernel.org
Subject: Re: [PATCH 1/1] be2net: Detach interface for avoiding a system crash



On 4/20/19 6:31 AM, Saeed Mahameed wrote:
> On Fri, 2019-04-19 at 21:07 +0800, Firo wrote:
>>
>> On 4/19/19 2:17 AM, Saeed Mahameed wrote:
>>> On Thu, 2019-04-18 at 15:05 +0800, Firo wrote:
>>>> On 4/2/19 12:25 AM, Saeed Mahameed wrote:
>>>>> On Mon, 2019-04-01 at 20:24 +0800, Firo Yang wrote:
>>>>>> This crash is triggered by a user-after-free since lake of
>>>>>> the synchronization of a race condition between   
>>>>>> be_update_queues() modifying multi-purpose channels of
>>>>>> network device and be_tx_timeout().
>>>>>>
>>>>>> BUG: unable to handle kernel NULL pointer dereference at
>>>>>> (null)
>>>>>> Call Trace:
>>>>>> be_tx_timeout+0xa5/0x360 [be2net]
>>>>>> dev_watchdog+0x1d8/0x210
>>>>>> call_timer_fn+0x32/0x140
>>>>>>
>>>>>> To fix it, detach the interface before modifying
>>>>>> multi-purpose channels of network device.
>>>>>>
>>>>>> Signed-off-by: Firo Yang <fyang@...e.com>
>>>>>> ---
>>>>>>  drivers/net/ethernet/emulex/benet/be_main.c | 12 ++++++++---
>>>>>> -
>>>>>>  1 file changed, 8 insertions(+), 4 deletions(-)
>>>>>>
>>>>>> diff --git a/drivers/net/ethernet/emulex/benet/be_main.c
>>>>>> b/drivers/net/ethernet/emulex/benet/be_main.c
>>>>>> index d5026909dec5..25d0128bf684 100644
>>>>>> --- a/drivers/net/ethernet/emulex/benet/be_main.c
>>>>>> +++ b/drivers/net/ethernet/emulex/benet/be_main.c
>>>>>> @@ -4705,6 +4705,8 @@ int be_update_queues(struct be_adapter
>>>>>> *adapter)
>>>>>>  	struct net_device *netdev = adapter->netdev;
>>>>>>  	int status;
>>>>>>    
>>>>>> +	netif_device_detach(netdev);
>>>>>> +
>>>>>
>>>>> This will reduce the probability, but will not do the trick.
>>>>> since this will not guarantee that the dev_watchdog is
>>>>> disabled.
>>>> Hi Saeed,
>>>>
>>>> What about using dev_watchdog_down/up() to temporarily disable
>>>> the
>>>> dev_watchdog?
>>>>
>>>> +++ b/drivers/net/ethernet/emulex/benet/be_main.c
>>>> @@ -4697,6 +4697,8 @@ int be_update_queues(struct be_adapter
>>>> *adapter)
>>>>         struct net_device *netdev = adapter->netdev;
>>>>         int status;
>>>>
>>>> +       dev_watchdog_down();
>>>> +
>>>   
>>> there is no such API, currently this is a static function, and i
>>> don't
>>> think it is a good idea to mess around with the watchdog.
>>>   
>>> If you want to avoid deferred work and explicit locking, you need
>>> something similar to what you did with the device_detach to flag to
>>> the
>>> watchdog to not look at your tx queues.
>>>   
>>> by looking at be_close() I see that it calls
>>> netif_tx_disable(netdev);
>>> which provides some kind of state synchronization with the
>>> watchdog.
>>>   
>>> so maybe netif_device_detach() will work, but it is a very heavy
>>> gun!
>>
>> Sorry, I cannot fully understand why you think netif_device_detach()
>> is
>> very heavy. Except clearing __LINK_STATE_PRESENT,
>> netif_device_detach()
>> almost does the same thing as  netif_tx_disable(); could you please
>> detail your thought?
>>
> 
> calling netif_tx_disable on its own is not enough to make the watchdog
> back-off, you need the extra flag from carrier_off or detach_device.
I think your first comment is right. Locking mechanism is necessary
here. Consider the following situation. In dev_watchdog(), due to some
reasons[1], netif_xmit_stopped(txq) && time_after(jiffies, (trans_start
+ dev->watchdog_timeo)) is true; and dev_watchdog() tests the outermost
if-statement[2] just before be_update_queues() executes carrier_off or
detach_device; then dev_watchdog() will be possible to hit a kernel oops.

[1]: For example, a real problem causes that NIC stopped and timed out.
[2]: if (!qdisc_tx_is_noop(dev)) {
                if (netif_device_present(dev) &&
                    netif_running(dev) &&
                    netif_carrier_ok(dev))
> 
> 1) the documentation of netif_device_detach says 
> "Mark device as removed from system and therefore no longer available."
> which is not true in your case.
Indeed.
> 
> 2) netif_device_present(dev) check is used very widely in netdev
> control paths, so if you detach the device while doing one config you
> will loose all other configs and they would wrongfully return -ENODEV;
I went through a few *ioctl and netlink code; it seems that they all
under the protection of RTNL lock and won't be able to return -ENODEV.

> 
> calling carrier_off will make the watch-dog back off but still allow
> other configurations.
> 
> the carrier is used to mark that the device is not available for tx,
> which is exactly your case since you are resetting the rings, but still
> available for anything else.
> 
> looking at other drivers, i couldn't find anyone using the
> detach_device approach to disable the watchdog while resetting the
> rings, but almost most of the drivers are doing the carrier_off.
> 
>>> netif_carrier_off(netdev) should do the work, and at the end you
>>> will
>>> need to restore the original carrier state.
>>
>> netif_carrier_off() might trigger a linkwatch event. From this point
>> of
>> view, maybe netif_device_detach() is better.
>>
> 
> Code-wise yes netif_device_detach is better, but the implications can
> be worse, as explained above.
> 
> My 2 cent is to copy the approach used by most drivers, and just use
> netif_carrier_off(), and maybe in the future we will introduce a more
> relaxed version of carrier off to shut-up the watchdog, and fix all
> drivers at once :).>
> I will leave the decision to you and to this driver maintainers.
> 
> Thanks for following up on my comments.
Thank you for such detailed explanation!
// Firo

> 
>> Thanks,
>> Firo
>>
>>>   
>>> carrier_ok = netif_carrier_ok(netdev);
>>> /* must be called before netif_tx_disable() */
>>> netif_carrier_off(netdev);   
>>>   
>>> // do stuff
>>>   
>>> if (carrier_ok)
>>> 	netif_carrier_on(netdev);
>>>   
>>>>         if (netif_running(netdev))
>>>>                 be_close(netdev);
>>>>
>>>> @@ -4711,21 +4713,21 @@ int be_update_queues(struct be_adapter
>>>> *adapter)
>>>>         be_clear_queues(adapter);
>>>>         status = be_cmd_if_destroy(adapter, adapter-
>>>>> if_handle,  0);
>>>>         if (status)
>>>> -               return status;
>>>> +               goto out;
>>>>
>>>>         if (!msix_enabled(adapter)) {
>>>>                 status = be_msix_enable(adapter);
>>>>                 if (status)
>>>> -                       return status;
>>>> +                       goto out;
>>>>         }
>>>>
>>>>         status = be_if_create(adapter);
>>>>         if (status)
>>>> -               return status;
>>>> +               goto out;
>>>>
>>>>         status = be_setup_queues(adapter);
>>>>         if (status)
>>>> -               return status;
>>>> +               goto out;
>>>>
>>>>         be_schedule_worker(adapter);
>>>>
>>>> @@ -4741,6 +4743,9 @@ int be_update_queues(struct be_adapter
>>>> *adapter)
>>>>         if (netif_running(netdev))
>>>>                 status = be_open(netdev);
>>>>
>>>> +out:
>>>> +       dev_watchdog_up();
>>>>
>>>> Thanks,
>>>> Firo
>>>>
>>>>> what you need is proper locking mechanism and/or scheduling the
>>>>> tx_timeout handling out of atomic context if a mutex will be
>>>>> required.
>>>>>
>>>>> netif_device_detach is a too heavy hammer for such
>>>>> synchronizations
>>>>> tasks.
>>>>>
>>>>>>  	if (netif_running(netdev))
>>>>>>  		be_close(netdev);
>>>>>>    
>>>>>> @@ -4719,21 +4721,21 @@ int be_update_queues(struct
>>>>>> be_adapter
>>>>>> *adapter)
>>>>>>  	be_clear_queues(adapter);
>>>>>>  	status = be_cmd_if_destroy(adapter, adapter-
>>>>>>> if_handle,  0);
>>>>>>  	if (status)
>>>>>> -		return status;
>>>>>> +		goto out;
>>>>>>    
>>>>>>  	if (!msix_enabled(adapter)) {
>>>>>>  		status = be_msix_enable(adapter);
>>>>>>  		if (status)
>>>>>> -			return status;
>>>>>> +			goto out;
>>>>>>  	}
>>>>>>    
>>>>>>  	status = be_if_create(adapter);
>>>>>>  	if (status)
>>>>>> -		return status;
>>>>>> +		goto out;
>>>>>>    
>>>>>>  	status = be_setup_queues(adapter);
>>>>>>  	if (status)
>>>>>> -		return status;
>>>>>> +		goto out;
>>>>>>    
>>>>>>  	be_schedule_worker(adapter);
>>>>>>    
>>>>>> @@ -4748,6 +4750,8 @@ int be_update_queues(struct be_adapter
>>>>>> *adapter)
>>>>>>  	if (netif_running(netdev))
>>>>>>  		status = be_open(netdev);
>>>>>>    
>>>>>> +out:
>>>>>> +	netif_device_attach(netdev);
>>>>>>  	return status;
>>>>>>  }
>>>>>>    
>>
>>