lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20110106205549.0de56de1@nehalam>
Date:	Thu, 6 Jan 2011 20:55:49 -0800
From:	Stephen Hemminger <shemminger@...tta.com>
To:	Eric Dumazet <eric.dumazet@...il.com>
Cc:	David Miller <davem@...emloft.net>, netdev@...r.kernel.org
Subject: Re: [RFC] sched: CHOKe packet scheduler (v0.2)

On Thu, 06 Jan 2011 05:07:30 +0100
Eric Dumazet <eric.dumazet@...il.com> wrote:

> Le mercredi 05 janvier 2011 à 11:21 -0800, Stephen Hemminger a écrit :
> > This implements the CHOKe packet scheduler based on the existing
> > Linux RED scheduler based on the algorithm described in the paper.
> > 
> > The core idea is:
> >   For every packet arrival:
> >   	Calculate Qave
> > 	if (Qave < minth) 
> > 	     Queue the new packet
> > 	else 
> > 	     Select randomly a packet from the queue 
> > 	     if (both packets from same flow)
> > 	     then Drop both the packets
> > 	     else if (Qave > maxth)
> > 	          Drop packet
> > 	     else
> > 	       	  Admit packet with probability p (same as RED)
> > 
> > See also:
> >   Rong Pan, Balaji Prabhakar, Konstantinos Psounis, "CHOKe: a stateless active
> >    queue management scheme for approximating fair bandwidth allocation", 
> >   Proceeding of INFOCOM'2000, March 2000. 
> > 
> > Signed-off-by: Stephen Hemminger <shemminger@...tta.com>
> > 
> 
> To be really useful in a wide range of environments, I believe that :
> 
> - CHOKe should be able to use an external flow classifier (like say...
> SFQ) to compute a token and compare two skbs by this token instead of
> custom rxhash or whatever. (rxhash can be the default in absence of flow
> classifier). Probably you need to store the token in skb->cb[] to avoid
> calling tc_classify() several times for a given packet. 
> 
> http://lwn.net/Articles/236200/
> http://kerneltrap.org/mailarchive/linux-netdev/2008/1/31/667679

Probably should split SFQ flow hash stuff into core code for reuse.


> - Must use a FIFO with O(1) access to Nth skb in queue.
>  
>  A linked list makes this implementation too expensive for big queues.
> 
> For small queues (less than 128 skbs at this moment for SFQ), existing
> schedulers are good enough.
> 
> CHOKe authors dont mention this in their paper, but their experiments
> were done in 1999 with 1Mbs links. minth=100 and maxth=200, limit=300
> 
> We want to try CHOKe with modern links, probably with minth=2000 and
> maxth=4000 or more.
> 
> They said "It is arguably more difficult to drop a randomy chosen packet
> since this means removing from a linked-list. Instead of doing this, we
> propose to add one extra bit to the packet header. The bit is set to one
> if the drop candidate is to be dropped. When a packet advance to the
> head of the FIFO buffer, the status of the bit determines whether it is
> to be immediately discarded or transmitted on the outgoind line"
> 
> If they thought removing a buffer from a linked list was expensive
> (!!!), they certainly assumed the previous access to the randomly chosen
> buffer was faster than the skb unlink !
> 
> Using a circular buffer should be enough, using a similar trick than
> suggested : when droping an skb from the ring, stick a NULL pointer and
> dont memmove() the window to shrink it.
> 
> struct skb_ring {
> 	unsigned int head;
> 	unsigned int tail;
> 	unsigned int size; /* a power of two */
> 	struct sk_buff **table;
> };
> 
> Doing so avoids the cache misses to adjacent skbs prev/next when
> queue/dequeue is done.

The problem is that large tables of pointers in kernel require either
contiguous allocation or some indirect table algorithm.

-- 
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ