Message-Id: <20160115.144426.47703266552549103.davem@davemloft.net>
Date: Fri, 15 Jan 2016 14:44:26 -0500 (EST)
From: David Miller <davem@...emloft.net>
To: john.fastabend@...il.com
Cc: daniel@...earbox.net, eric.dumazet@...il.com, jhs@...atatu.com,
aduyck@...antis.com, brouer@...hat.com, john.r.fastabend@...el.com,
netdev@...r.kernel.org
Subject: Re: [RFC PATCH 06/12] net: sched: support qdisc_reset on NOLOCK qdisc
From: John Fastabend <john.fastabend@...il.com>
Date: Wed, 13 Jan 2016 10:03:54 -0800
> On 16-01-13 08:20 AM, David Miller wrote:
>> From: John Fastabend <john.fastabend@...il.com>
>> Date: Wed, 30 Dec 2015 09:53:13 -0800
>>
>>> case 2: dev_deactivate sequence. This can come from a user bringing
>>> the interface down which causes the gso_skb list to be flushed
>>> and the qlen zero'd. At the moment this is protected by the
>>> qdisc lock so while we clear the qlen/gso_skb fields we are
>>> guaranteed no new skbs are added. For the lockless case
>>> though this is not true. To resolve this move the qdisc_reset
>>> call after the new qdisc is assigned and a grace period is
>>> exercised to ensure no new skbs can be enqueued. Further
>>> the RTNL lock is held so we can not get another call to
>>> activate the qdisc while the skb lists are being free'd.
>>>
>>> Finally, fix qdisc_reset to handle the per cpu stats and
>>> skb lists.
>>
>> Just wanted to note that some setups are sensitive to device
>> register/deregister costs. This is why we batch register and
>> unregister operations in the core, so that the RCU grace period
>> is consolidated into one when we register/unregister a lot of
>> net devices.
>>
>> If we now will incur a new per-device unregister RCU grace period
>> when the qdisc is destroyed, it could cause a regression.
>>
>
> It adds a synchronize_net in the error case for many users of
> unregister_netdevice(). I think this should be rare, and I believe
> it's OK to add the extra sync net in these cases. For example, this
> may happen when we try to add a tunnel and __dev_get_by_name() fails.
> But if you're worried about bring-up/tear-down performance, I think
> you should be using ifindex numbers and also not fat-fingering dev
> names on the CLI.
Ok, agreed.
> Also, there are a few drivers still doing their own walking of lists
> and calling unregister_netdevice() directly instead of the better
> APIs like unregister_netdevice_queue() and friends. I can patch these
> drivers if that helps; it's a mechanical change, but I'm not super
> excited about testing things like the caif driver ;)
No, that's not necessary. If anyone works on that, I'd rather it be
someone interested in the individual drivers who can therefore test
the changes easily.
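
[Editor's note: for reference, the mechanical conversion John mentions
looks roughly like the sketch below, which uses the real batching API
(unregister_netdevice_queue()/unregister_netdevice_many()); the driver
hook, private struct, and list walk are hypothetical, not taken from
any particular driver, and the fragment is not compilable on its own.]

```
/* Before: each iteration called unregister_netdevice(p->dev),
 * paying an RCU grace period per device. After: queue them all,
 * then pay one grace period for the whole batch. */
static void my_driver_cleanup(void)		/* hypothetical hook */
{
	struct my_priv *p, *tmp;		/* hypothetical state */
	LIST_HEAD(unreg_list);

	list_for_each_entry_safe(p, tmp, &my_dev_list, node)
		unregister_netdevice_queue(p->dev, &unreg_list);

	unregister_netdevice_many(&unreg_list);
}
```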
> Further, just looking at it now, there are three calls to sync net in
> the dev-down paths. It seems we should be able to remove at least one
> of those if we reorganize the teardown a bit better. But that is
> another patch series.
Indeed, there isn't any reason why these can't be consolidated into
two or even just one.