Message-ID: <CAKgT0Uc=t9sTn47WpJikMgaPdGSTpYxCxQBpBZ1D=iR9mgz9pg@mail.gmail.com>
Date: Mon, 13 Nov 2017 14:47:11 -0800
From: Alexander Duyck <alexander.duyck@...il.com>
To: Michael Ma <make0818@...il.com>
Cc: Stephen Hemminger <stephen@...workplumber.org>,
Linux Kernel Network Developers <netdev@...r.kernel.org>,
jianjun.duan@...baba-inc.com, xiangning.yu@...baba-inc.com
Subject: Re: Per-CPU Queueing for QoS
On Mon, Nov 13, 2017 at 10:17 AM, Michael Ma <make0818@...il.com> wrote:
> 2017-11-12 16:14 GMT-08:00 Stephen Hemminger <stephen@...workplumber.org>:
>> On Sun, 12 Nov 2017 13:43:13 -0800
>> Michael Ma <make0818@...il.com> wrote:
>>
>>> Any comments? We plan to implement this as a qdisc and appreciate any early feedback.
>>>
>>> Thanks,
>>> Michael
>>>
>>> > On Nov 9, 2017, at 5:20 PM, Michael Ma <make0818@...il.com> wrote:
>>> >
>>> > Currently txq/qdisc selection is based on flow hash, so packets from
>>> > the same flow keep their order when they enter the qdisc/txq, which
>>> > avoids the out-of-order problem.
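>>> >
>>> > As a rough illustration (simplified from the kernel's skb_tx_hash(),
>>> > not the exact code), that default selection boils down to:
>>> >
>>> > /* Same flow -> same hash -> same txq, so per-flow order holds. */
>>> > static u16 pick_txq(const struct net_device *dev, struct sk_buff *skb)
>>> > {
>>> >         u32 hash = skb_get_hash(skb);   /* stable per flow */
>>> >
>>> >         return (u16)reciprocal_scale(hash, dev->real_num_tx_queues);
>>> > }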
>>> >
>>> > To improve the concurrency of the QoS algorithm we plan to have
>>> > multiple per-cpu queues for a single TC class and do busy polling
>>> > from a per-class thread to drain these queues. If we can do this
>>> > frequently enough, the out-of-order situation in this polling
>>> > thread should not be that bad.
>>> >
>>> > To give more details - in the send path we introduce per-cpu
>>> > per-class queues so that packets from the same class and same core
>>> > are enqueued to the same place. A per-class thread then polls the
>>> > queues belonging to its class from all the CPUs and aggregates them
>>> > into another per-class queue. This can effectively reduce contention
>>> > but inevitably introduces a potential out-of-order issue.
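>>> >
>>> > A minimal sketch of that drain thread, using the bandwidth_group
>>> > structure shown later in this thread (the skb_list helpers and the
>>> > kthread wiring here are hypothetical, just to make the idea
>>> > concrete):
>>> >
>>> > static int bwg_drain_thread(void *arg)
>>> > {
>>> >         struct bandwidth_group *bwg = arg;
>>> >         int cpu;
>>> >
>>> >         while (!kthread_should_stop()) {
>>> >                 /* Pull every per-cpu staging queue into the
>>> >                  * per-class drain queue. */
>>> >                 for_each_possible_cpu(cpu)
>>> >                         skb_list_splice_tail(&bwg->queues[cpu],
>>> >                                              &bwg->drain);
>>> >
>>> >                 /* ... shape bwg->drain and hand skbs to the txq ... */
>>> >                 cond_resched();
>>> >         }
>>> >         return 0;
>>> > }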
>>> >
>>> > Any concern/suggestion for working towards this direction?
>>
>> In general, there are no meta design discussions in Linux
>> development. Several developers have tried to do lockless qdiscs and
>> similar things in the past.
>>
>> The devil is in the details; show us the code.
>
> Thanks for the response, Stephen. The code is fairly straightforward;
> we have a per-cpu per-class queue defined like this:
>
> struct bandwidth_group {
>         struct skb_list queues[MAX_CPU_COUNT];  /* per-cpu staging queues */
>         struct skb_list drain;                  /* aggregated per-class queue */
> };
>
> "drain" queue is used to aggregate per-cpu queues belonging to the
> same class. In the enqueue function, we determine the cpu where the
> packet is processed and enqueue it to the corresponding per-cpu queue:
>
> int cpu;
> struct bandwidth_group *bwg = &bw_rx_groups[bwgid];
>
> cpu = get_cpu();
> skb_list_append(&bwg->queues[cpu], skb);
>
> Here we don't check the flow of the packet, so if a task migrates
> between CPUs, or multiple threads send packets through the same flow,
> packets can theoretically be enqueued to different per-cpu queues and
> aggregated into the "drain" queue out of order.
>
> Also, AFAIK there is currently no lockless HTB-like qdisc
> implementation; if a similar effort is already ongoing, please let me
> know.
The question I would have is how this would differ from using XPS with
mqprio. Would this be a classful qdisc like HTB, or a classless one
like mqprio?
From what I can tell, XPS would get you your per-cpu functionality,
with the added benefit that it avoids out-of-order issues for sockets
originating on the local system. The only thing I see as an issue
right now is that rate limiting with mqprio is assumed to be handled
in hardware via mechanisms such as DCB.
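
For reference, a typical mqprio + XPS setup looks something like the
following (the device name and the CPU/queue layout are just examples):

    # 3 traffic classes; priorities 0-7 mapped to TCs; txqs split per TC
    tc qdisc add dev eth0 root handle 1: mqprio num_tc 3 \
        map 2 2 1 0 2 2 2 2 queues 1@0 1@1 2@2 hw 0

    # Pin tx-0 to CPU 0 via XPS (the value is a hex cpumask)
    echo 1 > /sys/class/net/eth0/queues/tx-0/xps_cpus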
- Alex