linux-kernel - Re: txqueuelen has wrong units; should be time

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20110228134338.1241484mkljbz4w0@hayate.sektori.org>
Date:	Mon, 28 Feb 2011 13:43:38 +0200
From:	Jussi Kivilinna <jussi.kivilinna@...et.fi>
To:	Eric Dumazet <eric.dumazet@...il.com>
Cc:	Albert Cahalan <acahalan@...il.com>,
	Mikael Abrahamsson <swmike@....pp.se>,
	linux-kernel <linux-kernel@...r.kernel.org>,
	netdev@...r.kernel.org
Subject: Re: txqueuelen has wrong units; should be time

Quoting Eric Dumazet <eric.dumazet@...il.com>:

> Le dimanche 27 février 2011 à 12:55 +0200, Jussi Kivilinna a écrit :
>> Quoting Albert Cahalan <acahalan@...il.com>:
>>
>> > On Sun, Feb 27, 2011 at 2:54 AM, Eric Dumazet  
>> <eric.dumazet@...il.com> wrote:
>> >> Le dimanche 27 février 2011 à 08:02 +0100, Mikael Abrahamsson a écrit :
>> >>> On Sun, 27 Feb 2011, Albert Cahalan wrote:
>> >>>
>> >>> > Nanoseconds seems fine; it's unlikely you'd ever want
>> >>> > more than 4.2 seconds (32-bit unsigned) of queue.
>> > ...
>> >> Problem is some machines have slow High Resolution timing services.
>> >>
>> >> _If_ we have a time limit, it will probably use the low resolution (aka
>> >> jiffies), unless high resolution services are cheap.
>> >
>> > As long as that is totally internal to the kernel and never
>> > getting exposed by some API for setting the amount, sure.
>> >
>> >> I was thinking not having an absolute hard limit, but an EWMA based one.
>> >
>> > The whole point is to prevent stale packets, especially to prevent
>> > them from messing with TCP, so I really don't think so. I suppose
>> > you do get this to some extent via early drop.
>>
>> I made simple hack on sch_fifo with per packet time limits
>> (attachment) this weekend and have been doing limited testing on
>> wireless link. I think hardlimit is fine, it's simple and does
>> somewhat same as what packet(-hard)limited buffer does, drops packets
>> when buffer is 'full'. My hack checks for timed out packets on
>> enqueue, might be wrong approach (on other hand might allow some more
>> burstiness).
>>
>
>
> Qdisc should return to caller a good indication packet is queued or
> dropped at enqueue() time... not later (aka : never)
>
> Accepting a packet at t0, and dropping it later at t0+limit without
> giving any indication to caller is a problem.
>
> This is why I suggested using an EWMA plus a probabilist drop or
> congestion indication (NET_XMIT_CN) to caller at enqueue() time.
>
> The absolute time limit you are trying to implement should be checked at
> dequeue time, to cope with enqueue bursts or pauses on wire.
>

Would it be better to implement this as generic feature instead of  
qdisc specific? Have qdisc_enqueue_root do ewma check:

static inline int qdisc_enqueue_root(struct sk_buff *skb, struct Qdisc *sch)
{
	qdisc_skb_cb(skb)->pkt_len = skb->len;
	if (likely(!sch->use_timeout)) {
ewma_ok:
		return qdisc_enqueue(skb, sch) & NET_XMIT_MASK;
	}

	status = qdisc_check_ewma_status()
	if (status == ok)
		goto ewma_ok;

	if (status == overlimits)
		...drop...

	if (status == congestion) {
		ret = qdisc_enqueue(skb, sch) & NET_XMIT_MASK;
		return (ret == success) ? NET_XMIT_CN : ret;
	}
}

And add qdisc_dequeue_root:

static inline struct sk_buff *qdisc_dequeue_root(struct Qdisc *sch)
{
	skb = sch->dequeue(sch);

	if (skb && unlikely(sch->use_timeout))
		qdisc_update_ewma(skb);

	return skb;
}

Then user could specify any qdisc to use timeout or not with tc. Maybe  
go even as far as have some default timeout for default qdisc(?)

-Jussi


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/