Date: Tue, 13 Apr 2021 15:57:29 +0800
From: Yunsheng Lin <linyunsheng@...wei.com>
To: Hillf Danton <hdanton@...a.com>
CC: Juergen Gross <jgross@...e.com>, <netdev@...r.kernel.org>,
	<linux-kernel@...r.kernel.org>, Jiri Kosina <JKosina@...e.com>
Subject: Re: [PATCH net v3] net: sched: fix packet stuck problem for lockless qdisc

On 2021/4/13 15:12, Hillf Danton wrote:
> On Tue, 13 Apr 2021 11:34:27 Yunsheng Lin wrote:
>> On 2021/4/13 11:26, Hillf Danton wrote:
>>> On Tue, 13 Apr 2021 10:56:42 Yunsheng Lin wrote:
>>>> On 2021/4/13 10:21, Hillf Danton wrote:
>>>>> On Mon, 12 Apr 2021 20:00:43 Yunsheng Lin wrote:
>>>>>>
>>>>>> Yes, the below patch seems to fix the data race described in
>>>>>> the commit log.
>>>>>> Then what is the difference between my patch and your patch below:)
>>>>>
>>>>> Hehe, this is one of the tough questions over a bunch of weeks.
>>>>>
>>>>> If a seqcount can detect the race between skb enqueue and dequeue then we
>>>>> can't see any excuse for not rolling back to the point without NOLOCK.
>>>>
>>>> I am not sure I understood what you meant above.
>>>>
>>>> My understanding is that the below patch is essentially the same as
>>>> your previous patch; the only difference I see is that it uses qdisc->pad
>>>> instead of __QDISC_STATE_NEED_RESCHEDULE.
>>>>
>>>> So instead of proposing another patch, it would be better if you
>>>> commented on my patch and made improvements upon that.
>>>>
>>> Happy to do that after you show how it helps revert NOLOCK.
>>
>> Actually I am not going to revert NOLOCK, but add optimizations
>> to it if the patch fixes the packet stuck problem.
>>
> Fix is not optimization, right?

For this patch, it is a fix. In case you missed it, I do have a couple of
ideas to optimize the lockless qdisc:

1. An RFC patch to add a lockless qdisc bypass optimization:

   https://patchwork.kernel.org/project/netdevbpf/patch/1616404156-11772-1-git-send-email-linyunsheng@huawei.com/

2. Implement lockless enqueuing for the lockless qdisc using the idea from
   Jason and Toke. It shows a noticeable performance increase with 1-4
   threads running, using the below prototype based on ptr_ring:

static inline int __ptr_ring_multi_produce(struct ptr_ring *r, void *ptr)
{
	int producer, next_producer;

	do {
		producer = READ_ONCE(r->producer);
		if (unlikely(!r->size) || r->queue[producer])
			return -ENOSPC;

		next_producer = producer + 1;
		if (unlikely(next_producer >= r->size))
			next_producer = 0;
	} while (cmpxchg_relaxed(&r->producer, producer,
				 next_producer) != producer);

	/* Make sure the pointer we are storing points to valid data.
	 * Pairs with the dependency ordering in __ptr_ring_consume.
	 */
	smp_wmb();

	WRITE_ONCE(r->queue[producer], ptr);
	return 0;
}

3. Maybe it is possible to remove netif_tx_lock for the lockless qdisc too,
   because dev_hard_start_xmit() is also under the protection of
   qdisc_run_begin()/qdisc_run_end() (if there is only one qdisc using a
   netdev queue, which is true for pfifo_fast, I believe).

4. Remove the qdisc->running seqcount operation for the lockless qdisc,
   which is mainly used to do heuristic locking on q->busylock for the
   locked qdisc.

>
>> Is there any reason why you want to revert it?
>>
> I think you know Jiri's plan and it would be nice to wait a couple of
> months for it to complete.

I am not aware of Jiri's plan. Is there any link referring to it?

>
> .
>