Message-ID: <9322b915-8193-7011-e9ba-e7b22ceb08ba@digirati.com.br>
Date: Tue, 10 Jul 2018 10:03:22 -0400
From: Michel Machado <michel@...irati.com.br>
To: Marcelo Ricardo Leitner <marcelo.leitner@...il.com>
Cc: Nishanth Devarajan <ndev2021@...il.com>, xiyou.wangcong@...il.com,
jhs@...atatu.com, jiri@...nulli.us, davem@...emloft.net,
netdev@...r.kernel.org, doucette@...edu
Subject: Re: [PATCH v3 net-next] net/sched: add skbprio scheduler
On 07/09/2018 05:40 PM, Marcelo Ricardo Leitner wrote:
> On Mon, Jul 09, 2018 at 05:03:31PM -0400, Michel Machado wrote:
>> On 07/09/2018 03:53 PM, Marcelo Ricardo Leitner wrote:
>>> On Mon, Jul 09, 2018 at 02:18:33PM -0400, Michel Machado wrote:
>>>> On 07/09/2018 11:44 AM, Marcelo Ricardo Leitner wrote:
>>>>> On Sat, Jul 07, 2018 at 03:43:55PM +0530, Nishanth Devarajan wrote:
>>>>>> net/sched: add skbprio scheduler
>>>>>>
>>>>>> Skbprio (SKB Priority Queue) is a queueing discipline that prioritizes packets
>>>>>> according to their skb->priority field. Under congestion, already-enqueued lower
>>>>>> priority packets will be dropped to make space available for higher priority
>>>>>> packets. Skbprio was conceived as a solution for denial-of-service defenses that
>>>>>> need to route packets with different priorities as a means to overcome DoS
>>>>>> attacks.
>>>>>
>>>>> Why can't we implement this as a new flag for sch_prio.c?
>>>>>
>>>>> I don't see why this duplication is needed, especially because it will
>>>>> only be "slower" (as in, it will do more work) when qdisc is already
>>>>> full and dropping packets anyway.
>>>>
>>>> sch_prio.c and skbprio diverge on a number of aspects:
>>>>
>>>> 1. sch_prio.c supports up to 16 priorities whereas skbprio 64. This is
>>>> not just a matter of changing a constant since sch_prio.c doesn't use
>>>> skb->priority.
>>>
>>> Yes it does use skb->priority for classifying into a band:
>>>
>>> prio_classify(struct sk_buff *skb, struct Qdisc *sch, int *qerr)
>>> {
>>> struct prio_sched_data *q = qdisc_priv(sch);
>>> u32 band = skb->priority;
>>> ...
>>
>> Changing TC_PRIO_MAX from 15 to 63 risks breaking backward compatibility
>> with applications.
>
> If done, it needs to be done carefully, indeed. I don't know if it's
> doable, nor do I know how hard your requirement for 64 different
> priorities is.
>
> You can get 64 different priorities by stacking sch_prio, btw. And if
> you implement drop_from_tail() as part of Qdisc, you can even get it
> working for this cascading case too.
A solution would be to add another flag to switch between the current
prio_classify() and a new classifier that just uses skb->priority, as
skbprio does. This way we don't risk breaking applications that rely on
tcf_classify() or on this odd behavior that I found in prio_classify():
        band = TC_H_MIN(band) - 1;
        if (band >= q->bands)
                return q->queues[q->prio2band[0]];
        return q->queues[band];
When band is zero, it returns q->queues[q->prio2band[0]] instead of
q->queues[band] as it would for other bands less than q->bands.
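For concreteness, here is a rough sketch of what the skb->priority path
behind such a flag could look like inside sch_prio.c. The flag and the
helper below are hypothetical, not existing code; it only illustrates the
idea of bypassing tcf_classify() and prio2band:

        /* Hypothetical sketch: when a new flag is set, map skb->priority
         * straight to a band, as skbprio does. Nothing below exists in
         * sch_prio.c today.
         */
        static struct Qdisc *prio_classify_by_skb_priority(struct sk_buff *skb,
                                                           struct Qdisc *sch)
        {
                struct prio_sched_data *q = qdisc_priv(sch);
                u32 band = skb->priority;

                /* Clamp out-of-range priorities to the last band instead
                 * of remapping them through prio2band[0].
                 */
                if (band >= q->bands)
                        band = q->bands - 1;

                return q->queues[band];
        }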
>>>> 3. The queues of sch_prio.c are struct Qdisc, which doesn't have a method
>>>> to drop at its tail.
>>>
>>> That can be implemented, most likely as prio_tail_drop() as above.
>>
>> struct Qdisc represents *all* qdiscs. My knowledge of the other qdiscs is
>> limited, but not all qdiscs may have a meaningful method to drop at the
>> tail. For example: a qdisc that works over flows may not know which flow
>> is at
>
> True, but it doesn't mean you have to implement it for all available qdiscs.
If it is not implemented for all available qdiscs and the flag to
drop at the tail is on, sch_prio.c would need to issue a log message
whenever a packet goes into one of the subqueues that cannot drop at the
tail, and fall back to some failsafe behavior.
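To make that failsafe concrete, a rough sketch of what the check could
look like in sch_prio.c's enqueue path follows. The drop_tail operation
in it is hypothetical (no such Qdisc op exists today) and the names are
made up for illustration:

        /* Hypothetical sketch: if the chosen child qdisc cannot drop at
         * its tail, log (rate limited) and fall back to dropping the
         * incoming packet. ops->drop_tail is an assumed, not-yet-existing
         * operation.
         */
        static int prio_enqueue_tail_drop(struct sk_buff *skb, struct Qdisc *child,
                                          struct Qdisc *sch, struct sk_buff **to_free)
        {
                if (!child->ops->drop_tail) {
                        pr_warn_ratelimited("sch_prio: child %s cannot drop at tail\n",
                                            child->ops->id);
                        /* Failsafe: drop the new packet instead. */
                        return qdisc_drop(skb, sch, to_free);
                }

                if (sch->q.qlen >= sch->limit)
                        child->ops->drop_tail(child);

                return qdisc_enqueue(skb, child, to_free);
        }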
>> the tail. Not to mention that this would be a widespread patch to only
>> support this new prio qdisc. It would be prudent to wait for the production
>> success of the proposed, self-contained qdisc before making this commitment.
>
> On the other hand, by adding another qdisc you're adding more work
> that one needs to do when dealing with qdisc infrastructure, such as
> updating enqueue() prototype, for example.
>
> Once this new qdisc is in, it won't be easy to deprecate it.
We need to choose between (1) having skbprio, which duplicates some
code from sch_prio.c, and (2) adding flags to sch_prio.c and making a major
refactoring of the scheduler subsystem to add drop-at-the-tail support to
qdiscs.
I mean major because we are not just talking about adding the method
dequeue_tail() to struct Qdisc and adding dequeue_tail() to all qdiscs.
One will need to come up with definitions of dequeue_tail() for qdiscs
that don't naturally have it and even rewrite the data structures of
qdiscs. To substantiate this last point, consider sch_fifo.c, one of the
simplest qdiscs available. sch_fifo.c keeps its packets in sch->q, which
is of type struct qdisc_skb_head. struct qdisc_skb_head doesn't set
skb->prev, so it cannot drop at the tail without walking through its list.
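Just to illustrate that cost, a tail drop for struct qdisc_skb_head would
have to look roughly like the sketch below (not existing kernel code):
because the list is singly linked through skb->next, finding the skb that
precedes the tail means walking from the head.

        /* Illustrative sketch only: dropping the tail of a struct
         * qdisc_skb_head is O(n) because the list is singly linked; we
         * must walk from the head to find the skb preceding qh->tail.
         */
        static struct sk_buff *qdisc_skb_head_drop_tail(struct qdisc_skb_head *qh)
        {
                struct sk_buff *prev = NULL;
                struct sk_buff *skb = qh->head;

                if (!skb)
                        return NULL;

                while (skb != qh->tail) {
                        prev = skb;
                        skb = skb->next;
                }

                if (prev)
                        prev->next = NULL;
                else
                        qh->head = NULL;
                qh->tail = prev;
                qh->qlen--;

                return skb;
        }

And the caller would still have to free or account for the returned skb;
that walk is the per-drop cost one would be adding to every qdisc built
on struct qdisc_skb_head.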
I do understand the motivation for minimizing duplicate code. But
the small amount of duplicate code that skbprio adds is cheaper than
refactoring the scheduler subsystem just to support this new version of
sch_prio.c.
[ ]'s
Michel Machado