Date:   Tue, 10 Jul 2018 11:33:16 -0300
From:   Marcelo Ricardo Leitner <marcelo.leitner@...il.com>
To:     Michel Machado <michel@...irati.com.br>
Cc:     Nishanth Devarajan <ndev2021@...il.com>, xiyou.wangcong@...il.com,
        jhs@...atatu.com, jiri@...nulli.us, davem@...emloft.net,
        netdev@...r.kernel.org, doucette@...edu
Subject: Re: [PATCH v3 net-next] net/sched: add skbprio scheduler

On Tue, Jul 10, 2018 at 10:03:22AM -0400, Michel Machado wrote:
...
> > You can get 64 different priorities by stacking sch_prio, btw. And if
> > you implement drop_from_tail() as part of Qdisc, you can even get it
> > working for this cascading case too.
> 
>    A solution would be to add another flag to switch between the current
> prio_classify() and a new one to just use skb->priority as in skbprio. This

Sounds promising.

> way we don't risk breaking applications that rely on tcf_classify() and this
> odd behavior that I found in prio_classify():
> 
>         band = TC_H_MIN(band) - 1;
>         if (band >= q->bands)
>                 return q->queues[q->prio2band[0]];
>         return q->queues[band];
> 
>    When band is zero, it returns q->queues[q->prio2band[0]] instead of
> q->queues[band] as it would for other bands less than q->bands.

Agreed, this looks odd. It came from 1d8ae3fdeb00 ("pkt_sched: Remove
RR scheduler."):

        band = TC_H_MIN(band) - 1;
        if (band >= q->bands)
-               band = q->prio2band[0];
-out:
-       if (q->mq)
-               skb_set_queue_mapping(skb, band);
+               return q->queues[q->prio2band[0]];
+
        return q->queues[band];
 }

I can see how it made sense before the change, but not after.
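
For reference, here is a userspace sketch of the tail of prio_classify()
after that commit, assuming band is an unsigned 32-bit value as in
sch_prio.c. struct prio_model and model_pick_band() are made-up names for
illustration only, and the function returns a queue index rather than a
struct Qdisc pointer. With an unsigned band, a minor number of 0 wraps
around on the "- 1", which would explain why it lands in the fallback
branch:

```c
#include <stdint.h>

#define TC_H_MIN_MASK   0x0000FFFFU
#define TC_H_MIN(h)     ((h) & TC_H_MIN_MASK)
#define MODEL_MAX_BANDS 16

/* Hypothetical stand-in for the prio_sched_data fields used here. */
struct prio_model {
	uint32_t bands;                     /* number of configured bands */
	uint8_t prio2band[MODEL_MAX_BANDS]; /* priority to band map */
};

/* Returns the index of the selected queue (model only, not kernel code). */
static uint32_t model_pick_band(const struct prio_model *q, uint32_t band)
{
	band = TC_H_MIN(band) - 1;      /* minor 0 wraps to UINT32_MAX */
	if (band >= q->bands)           /* out of range, including the wrap */
		return q->prio2band[0]; /* fallback band */
	return band;
}
```

With bands = 3 and prio2band[0] = 1, minors 1..3 map to queues 0..2, while
minor 0 and minor 4 both map to queue 1.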

> 
> > > > >      3. The queues of sch_prio.c are struct Qdisc, which don't have a method
> > > > > to drop at its tail.
> > > > 
> > > > That can be implemented, most likely as prio_tail_drop() as above.
> > > 
> > >     struct Qdisc represents *all* qdiscs. My knowledge of the other qdiscs is
> > > limited, but not all qdiscs may have a meaningful method to drop at the
> > > tail. For example: a qdisc that works over flows may not know which flow is
> > 
> > True, but it doesn't mean you have to implement it for all available qdiscs.
> 
>    If it is not implemented for all available qdiscs and the flag to drop at
> the tail is on, sch_prio.c would need to issue a log message whenever a
> packet goes into one of the subqueues that don't drop at the tail and have a
> failsafe behavior.

That's fine. pr_warn_ratelimited() is probably what we need for logging
the error, so that it a) doesn't flood the kernel log and b) still fires
if the sysadmin later tries again with another qdisc (as opposed to
pr_warn_once()).

For the failsafe behavior, it can probably just drop the incoming
packet. It is not what you want, yes, but it's an easy way out of
an unexpected situation and it works well enough.
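
To illustrate the logging difference, here is a small userspace model. It
is not kernel code: struct rl_model, rl_model_allow() and
once_model_allow() are hypothetical names, and the window/burst logic is
only a simplified approximation of what the kernel keeps in struct
ratelimit_state. The point is that a rate-limited warning comes back in a
later window, while a warn-once never fires again:

```c
#include <stdbool.h>

/* Hypothetical model of rate-limit state (ticks are arbitrary units). */
struct rl_model {
	long interval; /* window length */
	int burst;     /* messages allowed per window */
	long begin;    /* start of the current window */
	int printed;   /* messages emitted in the current window */
};

/* Rate-limited: at most `burst` messages per window, then silence
 * until the next window starts.
 */
static bool rl_model_allow(struct rl_model *rl, long now)
{
	if (now - rl->begin >= rl->interval) {
		rl->begin = now;
		rl->printed = 0;
	}
	if (rl->printed >= rl->burst)
		return false;
	rl->printed++;
	return true;
}

/* Warn-once: fires a single time, then stays silent for good. */
static bool once_model_allow(bool *warned)
{
	if (*warned)
		return false;
	*warned = true;
	return true;
}
```

So if the sysadmin attaches an unsupported child qdisc, fixes it, and much
later attaches another unsupported one, the rate-limited variant would log
again and the once variant would not.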

> 
> > > the tail. Not to mention that this would be a widespread patch to only
> > > support this new prio qdisc. It would be prudent to wait for the production
> > > success of the proposed, self-contained qdisc before making this commitment.
> > 
> > On the other hand, by adding another qdisc you're adding more work
> > that one needs to do when dealing with qdisc infrastructure, such as
> > updating enqueue() prototype, for example.
> > 
> > Once this new qdisc is in, it won't be easy to deprecate it.
> 
>    We need to choose between (1) having skbprio that has some duplicate code
> with sch_prio.c and (2) adding flags to sch_prio.c and making a major
> refactoring of the scheduler subsystem to add drop at the tail to qdiscs.

Yes.

> 
>    I mean major because we are not just talking about adding the method
> dequeue_tail() to struct Qdisc and adding dequeue_tail() to all qdiscs. One

I think it is. :-)

> will need to come up with definitions of dequeue_tail() for qdiscs that
> don't naturally have it and even rewrite the data structures of qdiscs. To
> substantiate this last point, consider sch_fifo.c, one of the simplest
> qdiscs available. sch_fifo.c keeps its packets in sch->q, which is of type
> struct qdisc_skb_head. struct qdisc_skb_head doesn't set skb->prev, so it
> cannot drop at the tail without walking through its list.

Yes, but this would only be needed for the qdiscs that you want to
support with this flag. Nobody said you need to implement it on all
the qdiscs that we have...
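
For the sch_fifo.c case specifically, the cost being discussed can be
sketched in userspace like this. It assumes a queue with head and tail
pointers and forward links only, as described above for struct
qdisc_skb_head. struct pkt, struct pkt_queue and the two helpers are
hypothetical names, not kernel API. Without a prev link, finding the new
tail means walking the whole list:

```c
#include <stddef.h>

/* Hypothetical packet node with a forward link only. */
struct pkt {
	struct pkt *next;
	int id;
};

/* Hypothetical queue: head, tail and length, no back links. */
struct pkt_queue {
	struct pkt *head;
	struct pkt *tail;
	unsigned int qlen;
};

/* Append at the tail in O(1). */
static void pkt_enqueue(struct pkt_queue *q, struct pkt *p)
{
	p->next = NULL;
	if (q->tail)
		q->tail->next = p;
	else
		q->head = p;
	q->tail = p;
	q->qlen++;
}

/* Remove and return the last packet, or NULL if the queue is empty.
 * O(qlen): the predecessor of the tail is only reachable from the head.
 */
static struct pkt *pkt_drop_tail(struct pkt_queue *q)
{
	struct pkt *prev = NULL, *cur = q->head;

	if (!cur)
		return NULL;
	while (cur->next) {
		prev = cur;
		cur = cur->next;
	}
	if (prev)
		prev->next = NULL;
	else
		q->head = NULL;
	q->tail = prev;
	q->qlen--;
	return cur;
}
```

A doubly linked list would make the drop O(1), which is the kind of data
structure change mentioned above for qdiscs that opt in.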

> 
>    I do understand the motivation for minimizing duplicate code. But the
> small amount of duplicate code that skbprio adds is cheaper than refactoring
> the scheduler system to only support this new sch_prio.c.

I'm afraid that without the code for option (2) above, this discussion
will become subjective. I'll wait for other opinions here.

Cheers,
  Marcelo
