[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAM_iQpWLtr_O25gwwUnJSm6f7cK=VSSo92oL9qKrEXnwaMqjsg@mail.gmail.com>
Date: Fri, 22 Dec 2017 13:41:01 -0800
From: Cong Wang <xiyou.wangcong@...il.com>
To: John Fastabend <john.fastabend@...il.com>
Cc: Linux Kernel Network Developers <netdev@...r.kernel.org>,
Jakub Kicinski <jakub.kicinski@...ronome.com>
Subject: Re: [Patch net-next] net_sched: call qdisc_reset() with qdisc lock
On Thu, Dec 21, 2017 at 7:36 PM, John Fastabend
<john.fastabend@...il.com> wrote:
> On 12/21/2017 04:03 PM, Cong Wang wrote:
>> qdisc_reset() should always be called with qdisc spinlock
>> and with BH disabled, otherwise qdisc ->reset() could race
>> with TX BH.
>>
> hmm I don't see how this fixes the issue. pfifo_fast is no longer
> using the qdisc lock so that doesn't help. And it is only a
> local_bh_disable.
First of all, this is to fix non-pfifo_fast which you seem totally forget.
>
>
>> Fixes: 7bbde83b1860 ("net: sched: drop qdisc_reset from dev_graft_qdisc")
>> Reported-by: Jakub Kicinski <jakub.kicinski@...ronome.com>
>> Cc: John Fastabend <john.fastabend@...il.com>
>> Signed-off-by: Cong Wang <xiyou.wangcong@...il.com>
>> ---
>> net/sched/sch_generic.c | 5 ++++-
>> 1 file changed, 4 insertions(+), 1 deletion(-)
>>
>> diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
>> index 10aaa3b615ce..00ddb5f8f430 100644
>> --- a/net/sched/sch_generic.c
>> +++ b/net/sched/sch_generic.c
>> @@ -1097,8 +1097,11 @@ static void dev_qdisc_reset(struct net_device *dev,
>> {
>> struct Qdisc *qdisc = dev_queue->qdisc_sleeping;
>>
>> - if (qdisc)
>> + if (qdisc) {
>> + spin_lock_bh(qdisc_lock(qdisc));
>> qdisc_reset(qdisc);
>> + spin_unlock_bh(qdisc_lock(qdisc));
>> + }
>> }
>>
>> /**
>>
>
> OK first the cases to get to qdisc_reset that I've tracked
> down are,
>
> dev_shutdown()
> qdisc_destroy()
>
> dev_deactivate_many()
> dev_qdisc_reset() <- for each txq
> qdisc_reset()
>
> chained calls from qdisc_reset ops
>
> At the moment all the lockless qdiscs don't care about chained
> calls so we can ignore that, but would be nice to keep in mind.
>
> Next qdisc_reset() is doing a couple things calling the qdisc
> ops reset call but also walking gso_skb and skb_bad_txq. The
> 'unlink' operations there are not safe to be called while an
> enqueue/dequeue op is in-flight. Also pfifo_fast's reset op
> is not safe to be called with enqueue/dequeue ops in-flight.
This is why I sent two patches instead just this one.
>
> So I've made the assumption that qdisc_reset is _only_ ever
> called after a qdisc is no longer attached on the enqueue
> dev_xmit side and also any in-progress tx_action calls are
> completed. For what its worth this has always been the assumption
> AFAIK.
>
> So those are the assumptions what did I miss?
Speaking of this, qdisc_reset() is also called in dev_deactivate_queue()
even before synchronize_net()...
>
> The biggest gap I see is dev_deactivate_many() is supposed
> to wait for all tx_action calls to complete, this bit:
In some_qdisc_is_busy() you hold qdisc spinlock for non-lockless
ones, I don't understand why you don't want it in dev_qdisc_reset().
Powered by blists - more mailing lists