Message-ID: <AANLkTikZ7b_nvmm1kjHgP9n-mh6J1XH0QZQvCn-eaojz@mail.gmail.com>
Date:	Sun, 27 Mar 2011 21:43:33 -0700
From:	Maciej Żenczykowski <zenczykowski@...il.com>
To:	Eric Dumazet <eric.dumazet@...il.com>
Cc:	Linux NetDev <netdev@...r.kernel.org>,
	David Miller <davem@...emloft.net>
Subject: Re: On Linux rate limiting and the magic value of 34.36 Gbps...
2011/3/25 Eric Dumazet <eric.dumazet@...il.com>:
> On Friday, 25 March 2011 at 00:14 -0700, Maciej Żenczykowski wrote:
>> Hey,
>>
>> The Linux rate limiting code relies on the rate field of struct tc_ratespec.
>> This field is a __u32 and measures rate in "bytes per second".
>>
>> This basically means the maximum representable rate is just under 4 GiB per second.
>> This is equivalent to 34.36 Gbps and I just ran across that limit with
>> 40 Gbps (which behaves like 5.64 Gbps because of overflow/truncation).
>> Seeing as this structure is exposed to userspace for both read and
>> write via various netlink paths (in cbq, htb, tbf, etc...) there
>> doesn't seem to be a particularly clean way to increase the size of
>> this field.  While there is a __reserved field that could
>> theoretically be repurposed as some sort of rate bit shift register, I
>> don't think we can rely on __reserved having been set to zero by
>> userspace (by older programs), and we will definitely see problems
>> with output by programs (tc) that don't expect to have to parse this
>> field to output an understandable rate limit...
>>
>> Anybody have any bright ideas?
>
> Well, netlink is extensible, so we can easily add a new structure, with
> 64bit fields if necessary.
>
> We did that for 64bit stats already.
I assume you are referring to:
commit 10708f37ae729baba9b67bd134c3720709d4ae62
Author: Jan Engelhardt <jengelh@...ozas.de>
Date:   Thu Mar 11 09:57:29 2010 +0000
    net: core: add IFLA_STATS64 support
    `ip -s link` shows interface counters truncated to 32 bit. This is
    because interface statistics are transported only in 32-bit quantity
    to userspace. This commit adds a new IFLA_STATS64 attribute that
    exports them in full 64 bit.
    References: http://lkml.indiana.edu/hypermail/linux/kernel/0307.3/0215.html
    Signed-off-by: Jan Engelhardt <jengelh@...ozas.de>
    Signed-off-by: David S. Miller <davem@...emloft.net>
include/linux/if_link.h
net/core/rtnetlink.c
That commit is relatively simple, since the statistics structure is
only ever exported by the kernel and is only exported in one location.
Here the situation is significantly more complex: the kernel both
exports this structure to userspace and imports it from userspace,
and it does so in many different locations.
((v2.6.38))$ egrep -r 'qdisc_(get|put)_rtab' . | wc -l
36
It looks like the amount of backward compatibility code would have
to be quite large.
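
For reference, the figures quoted at the top of the thread (a ceiling of
34.36 Gbps, and 40 Gbps behaving like 5.64 Gbps) can be reproduced with a
small userspace sketch. This is only an illustration of the arithmetic,
assuming the rate is simply truncated to 32 bits when stored; the helper
names below are hypothetical and are not kernel functions.

```c
#include <stdint.h>

/* Hypothetical helper: convert a link rate in bits per second to the
 * "bytes per second" unit used by the rate field of struct tc_ratespec. */
static uint64_t bits_to_bytes_per_sec(uint64_t bits_per_sec)
{
    return bits_per_sec / 8;
}

/* Hypothetical helper: model what happens when a 64-bit rate is stored
 * in a 32-bit field. The upper 32 bits are silently dropped. */
static uint32_t truncate_rate(uint64_t bytes_per_sec)
{
    return (uint32_t)bytes_per_sec;
}

/* Convert a stored 32-bit byte rate back to Gbps for display. */
static double stored_rate_gbps(uint32_t bytes_per_sec)
{
    return (double)bytes_per_sec * 8.0 / 1e9;
}
```

Calling truncate_rate(bits_to_bytes_per_sec(40000000000ULL)) gives
705032704 bytes per second, which stored_rate_gbps() reports as roughly
5.64 Gbps, and stored_rate_gbps(UINT32_MAX) is roughly 34.36 Gbps.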