Message-ID: <5b02a8df-406a-06c2-3057-d4408ef51057@alibaba-inc.com>
Date: Sat, 11 Jul 2020 02:01:09 +0800
From: "YU, Xiangning" <xiangning.yu@...baba-inc.com>
To: Cong Wang <xiyou.wangcong@...il.com>
Cc: Linux Kernel Network Developers <netdev@...r.kernel.org>
Subject: Re: [PATCH net-next 2/2] net: sched: Lockless Token Bucket (LTB) Qdisc
On 7/9/20 11:21 PM, Cong Wang wrote:
> On Thu, Jul 9, 2020 at 11:07 PM YU, Xiangning
> <xiangning.yu@...baba-inc.com> wrote:
>>
>>
>> On 7/9/20 10:20 PM, Cong Wang wrote:
>>> On Thu, Jul 9, 2020 at 10:04 PM Cong Wang <xiyou.wangcong@...il.com> wrote:
>>>> IOW, without these *additional* efforts, it is broken in terms of
>>>> out-of-order.
>>>>
>>>
>>> Take a look at fq_codel, it provides a hash function for flow classification,
>>> fq_codel_hash(), as default, thus its default configuration does not
>>> have such issues. So, you probably want to provide such a hash
>>> function too instead of a default class.
>>>
>> If I understand this code correctly, this socket hash value identifies a flow. Essentially it serves the same purpose as socket priority. In this patch, we use a classification method similar to htb's, but without filters.
>
> How is it any similar to HTB? HTB does not have a per-cpu queue
> for each class. This is a huge difference.
I said the classification method is similar to htb's, not that the overall design is similar to HTB. :)
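To make it concrete, the classification I have in mind is roughly the sketch below. This is only an illustration, not the exact code in the patch; the ltb_* names and the lookup helper are made up here, and the only real kernel pieces are struct sk_buff and skb->priority (set via SO_PRIORITY / net_prio), which htb also consults before running any filters:

#include <linux/skbuff.h>

/* Hypothetical types, for illustration only. */
struct ltb_class;
struct ltb_sched {
	struct ltb_class *default_class;
};

/* Hypothetical lookup of a class by the id carried in skb->priority. */
struct ltb_class *ltb_find_class(struct ltb_sched *ltb, u32 classid);

/* Pick a class from skb->priority, htb-style but with no filters,
 * and fall back to the default class when nothing matches.
 */
static struct ltb_class *ltb_classify(struct ltb_sched *ltb,
				      struct sk_buff *skb)
{
	struct ltb_class *cl = ltb_find_class(ltb, skb->priority);

	return cl ? cl : ltb->default_class;
}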
>
>>
>> We could provide a hash function, but I'm a bit confused about the problem we are trying to solve.
>
> Probably more than that, you need to ensure the packets in a same flow
> are queued on the same queue.
>
> Let say you have two packets P1 and P2 from the same flow (P1 is before P2),
> you can classify them into the same class of course, but with per-cpu queues
> they can be sent out in a wrong order too:
>
> send(P1) on CPU1 -> classify() returns default class -> P1 is queued on
> the CPU1 queue of default class
>
> (Now process is migrated to CPU2)
>
> send(P2) on CPU2 -> classify() returns default class -> P2 is queued on
> the CPU2 queue of default class
>
> P2 is dequeued on CPU2 before P1 dequeued on CPU1.
>
> Now, out of order. :)
>
> Hope it is clear now.
The assumption is that the packet scheduler is faster than thread migration. If it constantly takes that long to send one packet, that itself is something we need to fix.
Under light load, CPU1 is free enough to trigger aggregation immediately. Under heavy load, some other CPU could also help trigger aggregation in between.
As I responded in my first email, this is possible in theory. We are running a much older kernel version, and the normal case is to have multiple flows/threads running in a large class; so far we haven't seen big trouble with this approach.
But it's been years, and if the above assumption changes, we do need to rethink it.
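If the ordering guarantee does become a real problem, one option along the lines of your fq_codel suggestion would be to pick the per-class queue from a flow hash instead of from the current CPU, so a flow stays on the same queue even when the sending thread migrates. This is only a rough sketch of the idea, with made-up ltb_* names; skb_get_hash() and reciprocal_scale() are the existing kernel helpers:

#include <linux/skbuff.h>
#include <linux/kernel.h>	/* reciprocal_scale() */

/* Hypothetical per-class queue array, for illustration only. */
struct ltb_pcpu_data;
struct ltb_class_queues {
	unsigned int		nr_queues;
	struct ltb_pcpu_data	*queues;	/* nr_queues entries */
};

/* Select a queue index by flow hash rather than by smp_processor_id(),
 * so packets of one flow always land on the same queue and cannot be
 * reordered by thread migration.
 */
static unsigned int ltb_flow_queue(struct sk_buff *skb,
				   const struct ltb_class_queues *cq)
{
	return reciprocal_scale(skb_get_hash(skb), cq->nr_queues);
}

The trade-off is that we would lose the CPU locality the per-cpu queues are there to provide, so I would rather keep this as a fallback than make it the default.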
Thanks,
- Xiangning
>
> Thanks.
>