Message-ID: <1300169749.2649.142.camel@edumazet-laptop>
Date:	Tue, 15 Mar 2011 07:15:49 +0100
From:	Eric Dumazet <eric.dumazet@...il.com>
To:	Dave Täht <d@...t.net>
Cc:	Jonathan Morton <chromatix99@...il.com>,
	David Miller <davem@...emloft.net>,
	netdev <netdev@...r.kernel.org>
Subject: Re: ECN + pfifo_fast borked? (Was Re: [Bloat] shaper team forming
 up)

On Monday 14 March 2011 at 23:27 -0600, Dave Täht wrote:

> Well, that makes 3 of us that think it's wrong. Can we get more? 
> 
> (I'll run through the math again in the morning)
> 
> It's most often not actually "enablement" but "assertion": when, for
> example, an ECN bit is put on an ACK packet (by an application or qdisc),
> it drops that ACK packet into the band 2 queue, leaving all the other
> non-ECN-asserted packets in that flow to flow out ahead of it.
> 

There are two ECN bits, not one.
The low order bit is not taken into account by skb->priority mapping.
The high order bit cannot be changed during flow lifetime.
(So there are no out-of-order (OOO) problems on, say, TCP flows.)
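
For illustration, the bit arithmetic behind the statements above can be
sketched in userspace C. The mask 0x1E and the shift by one reflect my
understanding of how the 16-entry ip_tos2prio[] table is indexed; treat
them, and the helper names tos_to_index(), same_priority_slot() and
check_ce_keeps_slot(), as assumptions and not as a copy of kernel code.

```c
#include <assert.h>
#include <stdint.h>

/* ECN codepoints in the two low-order bits of the IPv4 TOS byte (RFC 3168). */
#define ECN_NOT_ECT 0x00
#define ECN_ECT_1   0x01
#define ECN_ECT_0   0x02
#define ECN_CE      0x03

/* Assumed mask: bits 1..4 of the TOS byte select one of 16 table slots. */
#define TOS_INDEX_MASK 0x1E

/* Hypothetical helper: index into a 16-entry TOS -> priority table. */
static unsigned int tos_to_index(uint8_t tos)
{
	return (tos & TOS_INDEX_MASK) >> 1;
}

/* Hypothetical helper: nonzero if two TOS bytes land in the same slot. */
static int same_priority_slot(uint8_t a, uint8_t b)
{
	return tos_to_index(a) == tos_to_index(b);
}

/* Self-check: CE marking of an ECT(0) packet must not move it to another slot. */
static void check_ce_keeps_slot(void)
{
	assert(same_priority_slot(ECN_ECT_0, ECN_CE));
}
```

Under these assumptions, a flow sent with ECT(0) keeps the same slot when
a router marks a packet CE, because only the low-order bit changes. A
Not-ECT packet (index 0) and an ECT(0) packet (index 1) do land in
different slots, and index 1 is the entry the quoted patch touches.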

> Or so dan siemon & I & now you, think. It's late and I really want to recheck
> the math and the shifts in the morning. However, if true... this would
> explain much ECN related weirdness precisely where it has been hard to
> measure, on heavily loaded systems.
> 
> >
> > Thanks
> >
> > diff --git a/net/ipv4/route.c b/net/ipv4/route.c
> > index 6ed6603..fabfe81 100644
> > --- a/net/ipv4/route.c
> > +++ b/net/ipv4/route.c
> > @@ -171,7 +171,7 @@ static struct dst_ops ipv4_dst_ops = {
> >  
> >  const __u8 ip_tos2prio[16] = {
> >  	TC_PRIO_BESTEFFORT,
> > -	ECN_OR_COST(FILLER),
> > +	ECN_OR_COST(BESTEFFORT),
> >  	TC_PRIO_BESTEFFORT,
> >  	ECN_OR_COST(BESTEFFORT),
> >  	TC_PRIO_BULK,
> >
> 
> I think this is a good short term fix, but it will mildly upset people
> that actually still use minimum cost and don't use ECN. That said, 
> RFC1349 has been obsolete for a decade now, and ECN enabled servers are
> at 12% penetration according to MIT.
> 

If people asked for minimum cost, their packets already stood a chance of
being dropped. Why should they be upset?
ECN should be favored anyway in 2011, now that everybody is ready.

> Still, long term, doing a sch_pfifo_dscp that would be fully compliant
> with the relevant modern RFCs and eventually making that the standard
> would be good.
> 

sch_pfifo_fast is not where we perform the TOS -> priority mapping.

It's done in another layer.

That is of little effect, given that TOS values are meaningful only inside
a domain. Nobody can force everyone to use the same semantics on the
Internet, even with a standard RFC. I doubt people using Linux machines at
home really need DSCP at all.

What we could do instead is favor ECN-enabled connections a bit, using
4 bands instead of 3 for pfifo_fast (the Linux default qdisc, and probably
the most used one):

band 0 : high priority packets (like now)
band 1 : (old band 1, ECN capable flows)
band 2 : (old band 1, no ECN flows)
band 3 : low priority packets (old band 2)
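
A minimal sketch of that four-band split, assuming the current band
(0, 1 or 2) is already known and that a nonzero ECN field in the TOS byte
marks an ECN-capable flow. The function name pfifo4_band() and the macro
ECN_FIELD_MASK are hypothetical; nothing like this exists in the tree.

```c
#include <assert.h>
#include <stdint.h>

#define ECN_FIELD_MASK 0x03	/* two low-order bits of the TOS byte */

/*
 * Hypothetical mapping from today's 3 bands to the proposed 4 bands.
 * old_band: 0 (high priority), 1 (normal), 2 (low priority).
 */
static int pfifo4_band(int old_band, uint8_t tos)
{
	assert(old_band >= 0 && old_band <= 2);

	if (old_band == 0)
		return 0;	/* high priority, unchanged */
	if (old_band == 2)
		return 3;	/* low priority moves to the last band */

	/* Old band 1 is split: ECN-capable flows go ahead of the rest. */
	return (tos & ECN_FIELD_MASK) ? 1 : 2;
}
```

Because the test looks at the whole ECN field, a packet that goes from
ECT(0) to CE stays in band 1, so the split itself would not reorder
packets within such a flow.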

Note: pfifo_fast is mostly used on end hosts, not on routers (where
admins set up non-default qdiscs), and typical end hosts never experience
packet drops on their qdiscs, because they are now plugged into Gigabit
LANs and the device queue length is so large.


