[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <a6b9bbe1-3f3c-b798-56ab-8193d481e364@suse.com>
Date: Wed, 24 Apr 2019 13:53:58 +0800
From: Firo <fyang@...e.com>
To: saeedm@...lanox.com, sathya.perla@...adcom.com,
sriharsha.basavapatna@...adcom.com, somnath.kotur@...adcom.com,
ajit.khaparde@...adcom.com, davem@...emloft.net,
Firo Yang <firogm@...il.com>
Cc: netdev@...r.kernel.org
Subject: Re: [PATCH 1/1] be2net: Detach interface for avoiding a system crash
On 4/20/19 6:31 AM, Saeed Mahameed wrote:
> On Fri, 2019-04-19 at 21:07 +0800, Firo wrote:
>>
>> On 4/19/19 2:17 AM, Saeed Mahameed wrote:
>>> On Thu, 2019-04-18 at 15:05 +0800, Firo wrote:
>>>> On 4/2/19 12:25 AM, Saeed Mahameed wrote:
>>>>> On Mon, 2019-04-01 at 20:24 +0800, Firo Yang wrote:
>>>>>> This crash is triggered by a user-after-free since lake of
>>>>>> the synchronization of a race condition between
>>>>>> be_update_queues() modifying multi-purpose channels of
>>>>>> network device and be_tx_timeout().
>>>>>>
>>>>>> BUG: unable to handle kernel NULL pointer dereference at
>>>>>> (null)
>>>>>> Call Trace:
>>>>>> be_tx_timeout+0xa5/0x360 [be2net]
>>>>>> dev_watchdog+0x1d8/0x210
>>>>>> call_timer_fn+0x32/0x140
>>>>>>
>>>>>> To fix it, detach the interface before modifying
>>>>>> multi-purpose channels of network device.
>>>>>>
>>>>>> Signed-off-by: Firo Yang <fyang@...e.com>
>>>>>> ---
>>>>>> drivers/net/ethernet/emulex/benet/be_main.c | 12 ++++++++---
>>>>>> -
>>>>>> 1 file changed, 8 insertions(+), 4 deletions(-)
>>>>>>
>>>>>> diff --git a/drivers/net/ethernet/emulex/benet/be_main.c
>>>>>> b/drivers/net/ethernet/emulex/benet/be_main.c
>>>>>> index d5026909dec5..25d0128bf684 100644
>>>>>> --- a/drivers/net/ethernet/emulex/benet/be_main.c
>>>>>> +++ b/drivers/net/ethernet/emulex/benet/be_main.c
>>>>>> @@ -4705,6 +4705,8 @@ int be_update_queues(struct be_adapter
>>>>>> *adapter)
>>>>>> struct net_device *netdev = adapter->netdev;
>>>>>> int status;
>>>>>>
>>>>>> + netif_device_detach(netdev);
>>>>>> +
>>>>>
>>>>> This will reduce the probability, but will not do the trick.
>>>>> since this will not guarantee that the dev_watchdog is
>>>>> disabled.
>>>> Hi Saeed,
>>>>
>>>> What about using dev_watchdog_down/up() to temporarily disable
>>>> the
>>>> dev_watchdog?
>>>>
>>>> +++ b/drivers/net/ethernet/emulex/benet/be_main.c
>>>> @@ -4697,6 +4697,8 @@ int be_update_queues(struct be_adapter
>>>> *adapter)
>>>> struct net_device *netdev = adapter->netdev;
>>>> int status;
>>>>
>>>> + dev_watchdog_down();
>>>> +
>>>
>>> there is no such API, currently this is a static function, and i
>>> don't
>>> think it is a good idea to mess around with the watchdog.
>>>
>>> If you want to avoid deferred work and explicit locking, you need
>>> something similar to what you did with the device_detach to flag to
>>> the
>>> watchdog to not look at your tx queues.
>>>
>>> by looking at be_close() I see that it calls
>>> netif_tx_disable(netdev);
>>> which provides some kind of state synchronization with the
>>> watchdog.
>>>
>>> so maybe netif_device_detach() will work, but it is a very heavy
>>> gun!
>>
>> Sorry, I cannot fully understand why you think netif_device_detach()
>> is
>> very heavy. Except clearing __LINK_STATE_PRESENT,
>> netif_device_detach()
>> almost does the same thing as netif_tx_disable(); could you please
>> detail your thought?
>>
>
> calling netif_tx_disable on its own is not enough to make the watchdog
> back-off, you need the extra flag from carrier_off or detach_device.
I think your first comment is right. Locking mechanism is necessary
here. Consider the following situation. In dev_watchdog(), due to some
reasons[1], netif_xmit_stopped(txq) && time_after(jiffies, (trans_start
+ dev->watchdog_timeo)) is true; and dev_watchdog() tests the outermost
if-statement[2] just before be_update_queues() executes carrier_off or
detach_device; then dev_watchdog() will be possible to hit a kernel oops.
[1]: For example, a real problem causes that NIC stopped and timed out.
[2]: if (!qdisc_tx_is_noop(dev)) {
if (netif_device_present(dev) &&
netif_running(dev) &&
netif_carrier_ok(dev))
>
> 1) the documentation of netif_device_detach says
> "Mark device as removed from system and therefore no longer available."
> which is not true in your case.
Indeed.
>
> 2) netif_device_present(dev) check is used very widely in netdev
> control paths, so if you detach the device while doing one config you
> will loose all other configs and they would wrongfully return -ENODEV;
I went through a few *ioctl and netlink code; it seems that they all
under the protection of RTNL lock and won't be able to return -ENODEV.
>
> calling carrier_off will make the watch-dog back off but still allow
> other configurations.
>
> the carrier is used to mark that the device is not available for tx,
> which is exactly your case since you are resetting the rings, but still
> available for anything else.
>
> looking at other drivers, i couldn't find anyone using the
> detach_device approach to disable the watchdog while resetting the
> rings, but almost most of the drivers are doing the carrier_off.
>
>>> netif_carrier_off(netdev) should do the work, and at the end you
>>> will
>>> need to restore the original carrier state.
>>
>> netif_carrier_off() might trigger a linkwatch event. From this point
>> of
>> view, maybe netif_device_detach() is better.
>>
>
> Code-wise yes netif_device_detach is better, but the implications can
> be worse, as explained above.
>
> My 2 cent is to copy the approach used by most drivers, and just use
> netif_carrier_off(), and maybe in the future we will introduce a more
> relaxed version of carrier off to shut-up the watchdog, and fix all
> drivers at once :).>
> I will leave the decision to you and to this driver maintainers.
>
> Thanks for following up on my comments.
Thank you for such detailed explanation!
// Firo
>
>> Thanks,
>> Firo
>>
>>>
>>> carrier_ok = netif_carrier_ok(netdev);
>>> /* must be called before netif_tx_disable() */
>>> netif_carrier_off(netdev);
>>>
>>> // do stuff
>>>
>>> if (carrier_ok)
>>> netif_carrier_on(netdev);
>>>
>>>> if (netif_running(netdev))
>>>> be_close(netdev);
>>>>
>>>> @@ -4711,21 +4713,21 @@ int be_update_queues(struct be_adapter
>>>> *adapter)
>>>> be_clear_queues(adapter);
>>>> status = be_cmd_if_destroy(adapter, adapter-
>>>>> if_handle, 0);
>>>> if (status)
>>>> - return status;
>>>> + goto out;
>>>>
>>>> if (!msix_enabled(adapter)) {
>>>> status = be_msix_enable(adapter);
>>>> if (status)
>>>> - return status;
>>>> + goto out;
>>>> }
>>>>
>>>> status = be_if_create(adapter);
>>>> if (status)
>>>> - return status;
>>>> + goto out;
>>>>
>>>> status = be_setup_queues(adapter);
>>>> if (status)
>>>> - return status;
>>>> + goto out;
>>>>
>>>> be_schedule_worker(adapter);
>>>>
>>>> @@ -4741,6 +4743,9 @@ int be_update_queues(struct be_adapter
>>>> *adapter)
>>>> if (netif_running(netdev))
>>>> status = be_open(netdev);
>>>>
>>>> +out:
>>>> + dev_watchdog_up();
>>>>
>>>> Thanks,
>>>> Firo
>>>>
>>>>> what you need is proper locking mechanism and/or scheduling the
>>>>> tx_timeout handling out of atomic context if a mutex will be
>>>>> required.
>>>>>
>>>>> netif_device_detach is a too heavy hammer for such
>>>>> synchronizations
>>>>> tasks.
>>>>>
>>>>>> if (netif_running(netdev))
>>>>>> be_close(netdev);
>>>>>>
>>>>>> @@ -4719,21 +4721,21 @@ int be_update_queues(struct
>>>>>> be_adapter
>>>>>> *adapter)
>>>>>> be_clear_queues(adapter);
>>>>>> status = be_cmd_if_destroy(adapter, adapter-
>>>>>>> if_handle, 0);
>>>>>> if (status)
>>>>>> - return status;
>>>>>> + goto out;
>>>>>>
>>>>>> if (!msix_enabled(adapter)) {
>>>>>> status = be_msix_enable(adapter);
>>>>>> if (status)
>>>>>> - return status;
>>>>>> + goto out;
>>>>>> }
>>>>>>
>>>>>> status = be_if_create(adapter);
>>>>>> if (status)
>>>>>> - return status;
>>>>>> + goto out;
>>>>>>
>>>>>> status = be_setup_queues(adapter);
>>>>>> if (status)
>>>>>> - return status;
>>>>>> + goto out;
>>>>>>
>>>>>> be_schedule_worker(adapter);
>>>>>>
>>>>>> @@ -4748,6 +4750,8 @@ int be_update_queues(struct be_adapter
>>>>>> *adapter)
>>>>>> if (netif_running(netdev))
>>>>>> status = be_open(netdev);
>>>>>>
>>>>>> +out:
>>>>>> + netif_device_attach(netdev);
>>>>>> return status;
>>>>>> }
>>>>>>
>>
>>
Powered by blists - more mailing lists