[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Thu, 31 Dec 2015 18:30:04 -0800
From: Alexei Starovoitov <alexei.starovoitov@...il.com>
To: John Fastabend <john.fastabend@...il.com>
Cc: daniel@...earbox.net, eric.dumazet@...il.com, jhs@...atatu.com,
aduyck@...antis.com, brouer@...hat.com, davem@...emloft.net,
john.r.fastabend@...el.com, netdev@...r.kernel.org
Subject: Re: [RFC PATCH 06/12] net: sched: support qdisc_reset on NOLOCK qdisc
On Wed, Dec 30, 2015 at 09:53:13AM -0800, John Fastabend wrote:
> The qdisc_reset operation depends on the qdisc lock at the moment
> to halt any additions to gso_skb and statistics while the list is
> free'd and the stats zeroed.
>
> Without the qdisc lock we can not guarantee another cpu is not in
> the process of adding a skb to one of the "cells". Here are the
> two cases we have to handle.
>
> case 1: qdisc_graft operation. In this case a "new" qdisc is attached
> and the 'qdisc_destroy' operation is called on the old qdisc.
> The destroy operation will wait a rcu grace period and call
> qdisc_rcu_free(). At which point gso_cpu_skb is free'd along
> with all stats so no need to zero stats and gso_cpu_skb from
> the reset operation itself.
>
> Because we can not continue to call qdisc_reset before waiting
> an rcu grace period so that the qdisc is detached from all
> cpus simply do not call qdisc_reset() at all and let the
> qdisc_destroy operation clean up the qdisc. Note, a refcnt
> greater than 1 would cause the destroy operation to be
> aborted however if this ever happened the reference to the
> qdisc would be lost and we would have a memory leak.
>
> case 2: dev_deactivate sequence. This can come from a user bringing
> the interface down which causes the gso_skb list to be flushed
> and the qlen zero'd. At the moment this is protected by the
> qdisc lock so while we clear the qlen/gso_skb fields we are
> guaranteed no new skbs are added. For the lockless case
> though this is not true. To resolve this move the qdisc_reset
> call after the new qdisc is assigned and a grace period is
> exercised to ensure no new skbs can be enqueued. Further
> the RTNL lock is held so we can not get another call to
> activate the qdisc while the skb lists are being free'd.
>
> Finally, fix qdisc_reset to handle the per cpu stats and
> skb lists.
>
> Signed-off-by: John Fastabend <john.r.fastabend@...el.com>
...
> - /* Prune old scheduler */
> - if (oqdisc && atomic_read(&oqdisc->refcnt) <= 1)
> - qdisc_reset(oqdisc);
> -
...
> - sync_needed |= !dev->dismantle;
> + sync_needed = true;
I think killing above <=1 check and forcing synchronize_net() will
make qdisc destruction more reliable than it's right now.
Your commit log sounds too pessimistic :)
btw, sync_needed variable can be removed as well.
All other patches look good. Great stuff overall!
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Powered by blists - more mailing lists