[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAM_iQpXiNaushoWjma44X_agPosg9AKk4RzFTX93MSVTM6z3Uw@mail.gmail.com>
Date: Thu, 17 Sep 2020 12:52:32 -0700
From: Cong Wang <xiyou.wangcong@...il.com>
To: Yunsheng Lin <linyunsheng@...wei.com>
Cc: Kehuan Feng <kehuan.feng@...il.com>,
Hillf Danton <hdanton@...a.com>,
Paolo Abeni <pabeni@...hat.com>,
Jike Song <albcamus@...il.com>, Josh Hunt <johunt@...mai.com>,
Jonas Bonn <jonas.bonn@...rounds.com>,
Michael Zhivich <mzhivich@...mai.com>,
David Miller <davem@...emloft.net>,
John Fastabend <john.fastabend@...il.com>,
LKML <linux-kernel@...r.kernel.org>,
Netdev <netdev@...r.kernel.org>,
"linuxarm@...wei.com" <linuxarm@...wei.com>
Subject: Re: Packet gets stuck in NOLOCK pfifo_fast qdisc
On Sun, Sep 13, 2020 at 7:10 PM Yunsheng Lin <linyunsheng@...wei.com> wrote:
>
> On 2020/9/11 4:19, Cong Wang wrote:
> > On Thu, Sep 3, 2020 at 8:21 PM Kehuan Feng <kehuan.feng@...il.com> wrote:
> >> I also tried Cong's patch (shown below on my tree) and it could avoid
> >> the issue (stressing for 30 minutus for three times and not jitter
> >> observed).
> >
> > Thanks for verifying it!
> >
> >>
> >> --- ./include/net/sch_generic.h.orig 2020-08-21 15:13:51.787952710 +0800
> >> +++ ./include/net/sch_generic.h 2020-09-03 21:36:11.468383738 +0800
> >> @@ -127,8 +127,7 @@
> >> static inline bool qdisc_run_begin(struct Qdisc *qdisc)
> >> {
> >> if (qdisc->flags & TCQ_F_NOLOCK) {
> >> - if (!spin_trylock(&qdisc->seqlock))
> >> - return false;
> >> + spin_lock(&qdisc->seqlock);
> >> } else if (qdisc_is_running(qdisc)) {
> >> return false;
> >> }
> >>
> >> I am not actually know what you are discussing above. It seems to me
> >> that Cong's patch is similar as disabling lockless feature.
> >
> >>From performance's perspective, yeah. Did you see any performance
> > downgrade with my patch applied? It would be great if you can compare
> > it with removing NOLOCK. And if the performance is as bad as no
> > NOLOCK, then we can remove the NOLOCK bit for pfifo_fast, at least
> > for now.
>
> It seems the lockless qdisc may have below concurrent problem:
> cpu0: cpu1:
> q->enqueue .
> qdisc_run_begin(q) .
> __qdisc_run(q) ->qdisc_restart() -> dequeue_skb() .
> -> sch_direct_xmit() .
> .
> q->enqueue
> qdisc_run_begin(q)
> qdisc_run_end(q)
>
>
> cpu1 enqueue a skb without calling __qdisc_run(), and cpu0 did not see the
> enqueued skb when calling __qdisc_run(q) because cpu1 may enqueue the skb
> after cpu0 called __qdisc_run(q) and before cpu0 called qdisc_run_end(q).
This is the same problem that my patch fixes, I do not know
why you are suggesting another patch despite quoting mine.
Please read the whole thread if you want to participate.
Thanks.
Powered by blists - more mailing lists