linux-kernel - RE: [PATCH v2 net-next] net: link_watch: prevent starvation when processing linkwatch wq

Message-ID: <8a93eecf7a7a4ffd81f1b7d08f1a7442@huawei.com>
Date:   Fri, 31 May 2019 11:17:00 +0000
From:   Salil Mehta <salil.mehta@...wei.com>
To:     linyunsheng <linyunsheng@...wei.com>,
        "davem@...emloft.net" <davem@...emloft.net>
CC:     "hkallweit1@...il.com" <hkallweit1@...il.com>,
        "f.fainelli@...il.com" <f.fainelli@...il.com>,
        "stephen@...workplumber.org" <stephen@...workplumber.org>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        Linuxarm <linuxarm@...wei.com>
Subject: RE: [PATCH v2 net-next] net: link_watch: prevent starvation when
 processing linkwatch wq

> From: netdev-owner@...r.kernel.org [mailto:netdev-
> owner@...r.kernel.org] On Behalf Of Yunsheng Lin
> Sent: Friday, May 31, 2019 10:01 AM
> To: davem@...emloft.net
> Cc: hkallweit1@...il.com; f.fainelli@...il.com;
> stephen@...workplumber.org; netdev@...r.kernel.org; linux-
> kernel@...r.kernel.org; Linuxarm <linuxarm@...wei.com>
> Subject: [PATCH v2 net-next] net: link_watch: prevent starvation when
> processing linkwatch wq
> 
> When a user has configured a large number of virtual netdevs, such
> as 4K vlans, a carrier on/off operation on the real netdev
> will also cause its virtual netdevs' link state to be processed
> in linkwatch. Currently, the processing is done in a work queue,
> which may cause cpu and rtnl locking starvation problems.
> 
> This patch releases the cpu and rtnl lock when the link watch worker
> has processed a fixed number of netdev link watch events.
> 
> Currently __linkwatch_run_queue is called with the rtnl lock held, so
> enforce it with ASSERT_RTNL();
> 
> Signed-off-by: Yunsheng Lin <linyunsheng@...wei.com>
> ---
> V2: use cond_resched and rtnl_unlock after processing a fixed
>     number of events
> ---
>  net/core/link_watch.c | 17 +++++++++++++++++
>  1 file changed, 17 insertions(+)
> 
> diff --git a/net/core/link_watch.c b/net/core/link_watch.c
> index 7f51efb..07eebfb 100644
> --- a/net/core/link_watch.c
> +++ b/net/core/link_watch.c
> @@ -168,9 +168,18 @@ static void linkwatch_do_dev(struct net_device *dev)
> 
>  static void __linkwatch_run_queue(int urgent_only)
>  {
> +#define MAX_DO_DEV_PER_LOOP	100
> +
> +	int do_dev = MAX_DO_DEV_PER_LOOP;
>  	struct net_device *dev;
>  	LIST_HEAD(wrk);
> 
> +	ASSERT_RTNL();
> +
> +	/* Give urgent case more budget */
> +	if (urgent_only)
> +		do_dev += MAX_DO_DEV_PER_LOOP;
> +
>  	/*
>  	 * Limit the number of linkwatch events to one
>  	 * per second so that a runaway driver does not
> @@ -200,6 +209,14 @@ static void __linkwatch_run_queue(int urgent_only)
>  		}
>  		spin_unlock_irq(&lweventlist_lock);
>  		linkwatch_do_dev(dev);
> +
> +		if (--do_dev < 0) {
> +			rtnl_unlock();
> +			cond_resched();



Sorry, I missed this in my earlier comments. I see multiple potential
problems here; please correct me if I am wrong:

1. Releasing the rtnl lock here and then rescheduling might not be safe,
   especially if *lweventlist_lock* (which is global, not per-netdev) is
   already held at the point where you reschedule. This could cause the
   worker to *deadlock* with itself.

   Reason: once you release the rtnl lock, a similar code path through
   netdev_wait_allrefs() could be called for some other netdevice, which
   might end up waiting for the same global linkwatch event list lock,
   i.e. *lweventlist_lock*.

2. After releasing the rtnl lock, nothing ensures that all pending RCU
   operations have completed. Perhaps we need to call rcu_barrier()
   before retaking the rtnl lock.




> +			do_dev = MAX_DO_DEV_PER_LOOP;



I think an rcu_barrier() call should be placed here, before the rtnl
lock is retaken.



> +			rtnl_lock();
> +		}
> +
>  		spin_lock_irq(&lweventlist_lock);
>  	}
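
For reference, the batching pattern under discussion can be sketched in
userspace. This is only an illustration and not kernel code: big_lock,
process_item(), run_queue(), BATCH_BUDGET and lock_drops are hypothetical
names, a pthread mutex stands in for the rtnl lock, and sched_yield()
stands in for cond_resched(). The comment inside the loop marks where the
review above suggests a barrier might be needed; the sketch does not
implement one.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stddef.h>

/* Hypothetical stand-in for MAX_DO_DEV_PER_LOOP in the patch. */
#define BATCH_BUDGET 100

/* Hypothetical stand-in for the rtnl lock. */
static pthread_mutex_t big_lock = PTHREAD_MUTEX_INITIALIZER;

/* Counts how often the lock was dropped; for observation only. */
static int lock_drops;

/* Hypothetical stand-in for linkwatch_do_dev(). */
static void process_item(int *item)
{
	*item += 1;
}

/*
 * Process n items. Like __linkwatch_run_queue() in the patch, this is
 * entered with big_lock held and returns with big_lock held.
 */
static void run_queue(int *items, size_t n, int urgent)
{
	int budget = BATCH_BUDGET;
	size_t i;

	assert(items != NULL || n == 0);

	/* Give the urgent case more budget, as the patch does. */
	if (urgent)
		budget += BATCH_BUDGET;

	for (i = 0; i < n; i++) {
		process_item(&items[i]);

		/* Same test as the patch: fires after budget + 1 items. */
		if (--budget < 0) {
			pthread_mutex_unlock(&big_lock);
			sched_yield();	/* userspace analogue of cond_resched() */
			/* The review suggests an rcu_barrier() at this point. */
			budget = BATCH_BUDGET;
			pthread_mutex_lock(&big_lock);
			lock_drops++;
		}
	}
}
```

A caller would take big_lock, call run_queue(), and release big_lock
afterwards. Any state protected by the lock has to be treated as possibly
changed after each drop, which is the concern raised in point 1 above.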