netdev - Re: [PATCH net v3] net: sched: fix packet stuck problem for lockless qdisc

Message-ID: <ca1da322-0083-c6e7-d905-c75572b5fdf2@huawei.com>
Date:   Tue, 13 Apr 2021 15:57:29 +0800
From:   Yunsheng Lin <linyunsheng@...wei.com>
To:     Hillf Danton <hdanton@...a.com>
CC:     Juergen Gross <jgross@...e.com>, <netdev@...r.kernel.org>,
        <linux-kernel@...r.kernel.org>, Jiri Kosina <JKosina@...e.com>
Subject: Re: [PATCH net v3] net: sched: fix packet stuck problem for lockless
 qdisc

On 2021/4/13 15:12, Hillf Danton wrote:
> On Tue, 13 Apr 2021 11:34:27 Yunsheng Lin wrote:
>> On 2021/4/13 11:26, Hillf Danton wrote:
>>> On Tue, 13 Apr 2021 10:56:42 Yunsheng Lin wrote:
>>>> On 2021/4/13 10:21, Hillf Danton wrote:
>>>>> On Mon, 12 Apr 2021 20:00:43  Yunsheng Lin wrote:
>>>>>>
>>>>>> Yes, the below patch seems to fix the data race described in
>>>>>> the commit log.
>>>>>> Then what is the difference between my patch and your patch below:)
>>>>>
>>>>> Hehe, this is one of the tough questions over a bunch of weeks.
>>>>>
>>>>> If a seqcount can detect the race between skb enqueue and dequeue then we
>>>>> can't see any excuse for not rolling back to the point without NOLOCK.
>>>>
>>>> I am not sure I understood what you meant above.
>>>>
>>>> As I understand it, the below patch is essentially the same as
>>>> your previous patch; the only difference I see is that it uses
>>>> qdisc->pad instead of __QDISC_STATE_NEED_RESCHEDULE.
>>>>
>>>> So instead of proposing another patch, it would be better if you
>>>> commented on my patch and improved upon it.
>>>>
>>> Happy to do that after you show how it helps revert NOLOCK.
>>
>> Actually I am not going to revert NOLOCK, but add optimization
>> to it if the patch fixes the packet stuck problem.
>>
> Fix is not optimization, right?

For this patch, it is a fix.
In case you missed it, I do have a couple of ideas for optimizing the
lockless qdisc:

1. RFC patch to add lockless qdisc bypass optimization:

https://patchwork.kernel.org/project/netdevbpf/patch/1616404156-11772-1-git-send-email-linyunsheng@huawei.com/

2. Implement lockless enqueuing for lockless qdisc using the idea
   from Jason and Toke. It shows a noticeable performance increase with
   1-4 threads running, using the below prototype based on ptr_ring.

static inline int __ptr_ring_multi_produce(struct ptr_ring *r, void *ptr)
{
        int producer, next_producer;

        /* Reserve a slot by advancing r->producer with cmpxchg. */
        do {
                producer = READ_ONCE(r->producer);
                if (unlikely(!r->size) || r->queue[producer])
                        return -ENOSPC;
                next_producer = producer + 1;
                if (unlikely(next_producer >= r->size))
                        next_producer = 0;
        } while (cmpxchg_relaxed(&r->producer, producer, next_producer) != producer);

        /* Make sure the pointer we are storing points to a valid data. */
        /* Pairs with the dependency ordering in __ptr_ring_consume. */
        smp_wmb();

        WRITE_ONCE(r->queue[producer], ptr);
        return 0;
}

3. Maybe it is possible to remove the netif_tx_lock for lockless qdisc
   too, because dev_hard_start_xmit is also under the protection of
   qdisc_run_begin()/qdisc_run_end() (if there is only one qdisc using
   a netdev queue, which is true for pfifo_fast, I believe).

4. Remove the qdisc->running seqcount operation for lockless qdisc, which
   is mainly used to do heuristic locking on q->busylock for locked qdisc.
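
For anyone who wants to experiment with idea 2 outside the kernel, the
reserve-then-publish pattern of the prototype can be sketched in userspace
with C11 atomics. This is only an illustration under stated assumptions, not
the kernel implementation: struct demo_ring, DEMO_RING_SIZE and the
demo_ring_*() helpers are hypothetical names, the ring is a fixed-size array,
and a single consumer is assumed. Like the prototype, it does not handle a
producer that reserves a slot and is then delayed until the producer index
wraps around.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stddef.h>

#define DEMO_RING_SIZE 4	/* hypothetical size, for illustration only */

struct demo_ring {
	_Atomic int producer;			/* next slot to reserve */
	int consumer;				/* single consumer index */
	int size;
	_Atomic(void *) queue[DEMO_RING_SIZE];	/* NULL marks an empty slot */
};

static void demo_ring_init(struct demo_ring *r)
{
	atomic_init(&r->producer, 0);
	r->consumer = 0;
	r->size = DEMO_RING_SIZE;
	for (int i = 0; i < DEMO_RING_SIZE; i++)
		atomic_init(&r->queue[i], NULL);
}

/* Multi-producer enqueue: reserve a slot with CAS, then publish the pointer. */
static int demo_ring_multi_produce(struct demo_ring *r, void *ptr)
{
	int producer, next;

	assert(ptr != NULL);	/* NULL is reserved for "slot is empty" */

	producer = atomic_load_explicit(&r->producer, memory_order_relaxed);
	do {
		/* A non-NULL slot at the producer index means the ring is full. */
		if (atomic_load_explicit(&r->queue[producer],
					 memory_order_relaxed) != NULL)
			return -ENOSPC;
		next = producer + 1;
		if (next >= r->size)
			next = 0;
		/* On failure, 'producer' is reloaded with the current index. */
	} while (!atomic_compare_exchange_weak_explicit(&r->producer,
							&producer, next,
							memory_order_relaxed,
							memory_order_relaxed));

	/* Release store: the pointed-to data is visible before the pointer. */
	atomic_store_explicit(&r->queue[producer], ptr, memory_order_release);
	return 0;
}

/* Single-consumer dequeue; returns NULL when the current slot is empty. */
static void *demo_ring_consume(struct demo_ring *r)
{
	void *ptr = atomic_load_explicit(&r->queue[r->consumer],
					 memory_order_acquire);

	if (!ptr)
		return NULL;
	/* Clearing the slot hands it back to the producers. */
	atomic_store_explicit(&r->queue[r->consumer], NULL,
			      memory_order_release);
	if (++r->consumer >= r->size)
		r->consumer = 0;
	return ptr;
}
```

Producers only contend on the CAS of the producer index, and the slot store
happens after a slot has been reserved, so no producer lock is needed in
this sketch.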

> 
>> Is there any reason why you want to revert it?
>>
> I think you know Jiri's plan and it would be nice to wait a couple of
> months for it to complete.

I am not sure I am aware of Jiri's plan.
Is there any link referring to the plan?

> 
> .
>