Message-ID: <364d994a-9234-fe52-a8ad-ab17934e6205@huawei.com>
Date: Thu, 25 Mar 2021 10:08:47 +0800
From: Yunsheng Lin <linyunsheng@...wei.com>
To: Cong Wang <xiyou.wangcong@...il.com>
CC: David Miller <davem@...emloft.net>,
Jakub Kicinski <kuba@...nel.org>,
Vladimir Oltean <olteanv@...il.com>,
Alexei Starovoitov <ast@...nel.org>,
Daniel Borkmann <daniel@...earbox.net>,
Andrii Nakryiko <andriin@...com>,
Eric Dumazet <edumazet@...gle.com>,
Wei Wang <weiwan@...gle.com>,
"Cong Wang ." <cong.wang@...edance.com>,
Taehee Yoo <ap420073@...il.com>,
"Linux Kernel Network Developers" <netdev@...r.kernel.org>,
LKML <linux-kernel@...r.kernel.org>, <linuxarm@...neuler.org>,
Marc Kleine-Budde <mkl@...gutronix.de>,
<linux-can@...r.kernel.org>, Jamal Hadi Salim <jhs@...atatu.com>,
Jiri Pirko <jiri@...nulli.us>,
Andrii Nakryiko <andrii@...nel.org>,
Martin KaFai Lau <kafai@...com>,
Song Liu <songliubraving@...com>, Yonghong Song <yhs@...com>,
John Fastabend <john.fastabend@...il.com>,
<kpsingh@...nel.org>, bpf <bpf@...r.kernel.org>,
Jonas Bonn <jonas.bonn@...rounds.com>,
Paolo Abeni <pabeni@...hat.com>,
Michael Zhivich <mzhivich@...mai.com>,
Josh Hunt <johunt@...mai.com>,
"Jike Song" <albcamus@...il.com>,
Kehuan Feng <kehuan.feng@...il.com>,
Ahmad Fatoum <a.fatoum@...gutronix.de>, <atenart@...nel.org>,
Alexander Duyck <alexander.duyck@...il.com>
Subject: Re: [PATCH net v2] net: sched: fix packet stuck problem for lockless
qdisc
On 2021/3/25 3:20, Cong Wang wrote:
> On Tue, Mar 23, 2021 at 7:24 PM Yunsheng Lin <linyunsheng@...wei.com> wrote:
>> @@ -176,8 +207,23 @@ static inline bool qdisc_run_begin(struct Qdisc *qdisc)
>>  static inline void qdisc_run_end(struct Qdisc *qdisc)
>>  {
>>         write_seqcount_end(&qdisc->running);
>> -       if (qdisc->flags & TCQ_F_NOLOCK)
>> +       if (qdisc->flags & TCQ_F_NOLOCK) {
>>                 spin_unlock(&qdisc->seqlock);
>> +
>> +               /* qdisc_run_end() is protected by RCU lock, and
>> +                * qdisc reset will do a synchronize_net() after
>> +                * setting __QDISC_STATE_DEACTIVATED, so testing
>> +                * the below two bits separately should be fine.
>
> Hmm, why synchronize_net() after setting this bit is fine? It could
> still be flipped right after you test RESCHEDULE bit.
That depends on when it can be flipped again.
As I see it:
1. __QDISC_STATE_DEACTIVATED is set during the dev_deactivate() process,
   which also waits for every path involved in "test_bit(
   __QDISC_STATE_NEED_RESCHEDULE, &q->state)" to finish, by calling
   synchronize_net() and checking some_qdisc_is_busy().
2. It is cleared during the dev_activate() process.
dev_deactivate() and dev_activate() are protected by the RTNL lock, or
serialized by linkwatch.
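
The two points above can be sketched as a small userspace model. This is only an illustration under stated assumptions, not kernel code: `struct model_qdisc`, the `MODEL_*` bits and the `model_*` helpers are hypothetical stand-ins, and the wait done by the real dev_deactivate() path (synchronize_net() plus some_qdisc_is_busy()) is not modeled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the two state bits discussed in the patch. */
enum model_state_bit {
	MODEL_NEED_RESCHEDULE = 1 << 0, /* models __QDISC_STATE_NEED_RESCHEDULE */
	MODEL_DEACTIVATED     = 1 << 1, /* models __QDISC_STATE_DEACTIVATED */
};

struct model_qdisc {
	unsigned int state;
};

/* Point 1: the deactivate path sets the bit. The real path then waits
 * for in-flight users; that wait is deliberately left out of this model.
 */
static void model_dev_deactivate(struct model_qdisc *q)
{
	assert(q);
	q->state |= MODEL_DEACTIVATED;
}

/* Point 2: the activate path clears the bit again. */
static void model_dev_activate(struct model_qdisc *q)
{
	assert(q);
	q->state &= ~(unsigned int)MODEL_DEACTIVATED;
}

/* The condition tested at the end of qdisc_run_end() in the v2 patch:
 * reschedule only if requested and the qdisc is not deactivated.
 */
static bool model_should_reschedule(const struct model_qdisc *q)
{
	return (q->state & MODEL_NEED_RESCHEDULE) &&
	       !(q->state & MODEL_DEACTIVATED);
}
```

In this model a qdisc with MODEL_NEED_RESCHEDULE set stops being rescheduled once model_dev_deactivate() has run, and is rescheduled again only after model_dev_activate().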
>
>
>> +                * For qdisc_run() in net_tx_action() case, we
>> +                * really should provide rcu protection explicitly
>> +                * for document purposes or PREEMPT_RCU.
>> +                */
>> +               if (unlikely(test_bit(__QDISC_STATE_NEED_RESCHEDULE,
>> +                                     &qdisc->state) &&
>> +                            !test_bit(__QDISC_STATE_DEACTIVATED,
>> +                                      &qdisc->state)))
>
> Why do you want to test __QDISC_STATE_DEACTIVATED bit at all?
> dev_deactivate_many() will wait for those scheduled but being
> deactivated, so what's the problem of scheduling it even with this bit?
The problem I tried to fix is:

CPU0(calling dev_deactivate)      CPU1(calling qdisc_run_end)   CPU2(calling net_tx_action)
             .                        __netif_schedule()                     .
             .                      set __QDISC_STATE_SCHED                  .
             .                                 .                             .
set __QDISC_STATE_DEACTIVATED                  .                             .
     synchronize_net()                         .                             .
             .                                 .                             .
             .                                 .                 clear __QDISC_STATE_SCHED
             .                                 .                             .
some_qdisc_is_busy() return false              .                             .
             .                                 .                             .
             .                                 .                        qdisc_run()
some_qdisc_is_busy() decides whether a lockless qdisc is busy by checking
__QDISC_STATE_SCHED and spin_is_locked(&qdisc->seqlock). It returns false on
CPU0 because CPU2 has cleared __QDISC_STATE_SCHED but has not yet taken
qdisc->seqlock, even though the qdisc is clearly still busy when CPU2 runs
qdisc_run() later.

So you are right: testing __QDISC_STATE_DEACTIVATED does not completely solve
the above data race, and __netif_schedule() is also called by dev_requeue_skb()
and __qdisc_run(), which need the same fix.

So I will remove the __QDISC_STATE_DEACTIVATED test from this patch for now, and
deal with it later.
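
To make the window concrete, the interleaving described above can be replayed in a single-threaded userspace model. This is a rough sketch under stated assumptions, not kernel code: `struct model_qdisc`, `struct race_result`, `model_some_qdisc_is_busy()` and `model_replay_race()` are hypothetical names, the three CPUs are replayed in one fixed order, and synchronize_net() is treated as a no-op because CPU2 has not started its work when it is called.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins for the kernel state; not kernel code. */
struct model_qdisc {
	bool sched;        /* models __QDISC_STATE_SCHED */
	bool deactivated;  /* models __QDISC_STATE_DEACTIVATED */
	bool seqlock_held; /* models spin_is_locked(&qdisc->seqlock) */
};

/* Models the busy check as described in the mail: a lockless qdisc is
 * considered busy if it is scheduled or its seqlock is held.
 */
static bool model_some_qdisc_is_busy(const struct model_qdisc *q)
{
	return q->sched || q->seqlock_held;
}

struct race_result {
	bool cpu0_saw_busy;        /* what CPU0's busy check returned */
	bool ran_after_deactivate; /* CPU2 ran the qdisc after CPU0 saw it idle */
};

/* Replays the three-CPU interleaving step by step. */
static struct race_result model_replay_race(void)
{
	struct model_qdisc q = { false, false, false };
	struct race_result r;

	/* CPU1: __netif_schedule() marks the qdisc as scheduled. */
	q.sched = true;

	/* CPU0: dev_deactivate() marks the qdisc deactivated, then calls
	 * synchronize_net(), modeled here as a no-op (assumption).
	 */
	q.deactivated = true;

	/* CPU2: clears the scheduled bit but has not taken the seqlock yet. */
	q.sched = false;

	/* CPU0: the busy check runs inside the window and sees an idle qdisc. */
	r.cpu0_saw_busy = model_some_qdisc_is_busy(&q);

	/* CPU2: qdisc_run() now takes the seqlock and runs the qdisc. */
	assert(!q.seqlock_held);
	q.seqlock_held = true;
	r.ran_after_deactivate = q.deactivated && !r.cpu0_saw_busy &&
				 q.seqlock_held;
	q.seqlock_held = false;

	return r;
}
```

In this replay CPU0's check reports the qdisc as idle while CPU2 still goes on to run it, which is the gap between clearing __QDISC_STATE_SCHED and taking qdisc->seqlock.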
>
> Thanks.
>
> .
>