Message-ID: <b30d1c3b0911031945l1e46aa53wee51579ef908c744@mail.gmail.com>
Date: Wed, 4 Nov 2009 12:45:13 +0900
From: Ryousei Takano <ryousei@...il.com>
To: Stephen Hemminger <shemminger@...tta.com>
Cc: Patrick McHardy <kaber@...sh.net>,
Linux Netdev List <netdev@...r.kernel.org>,
takano-ryousei@...t.go.jp
Subject: Re: HTB accuracy on 10GbE

On Wed, Nov 4, 2009 at 12:13 PM, Ryousei Takano <ryousei@...il.com> wrote:
> Hi Patrick and Stephen,
>
> Thanks for your comments.
>
> I retried on the newer kernel and iproute2, and added the experimental result
> on my page. Please see 'Experimental result 2':
> http://code.google.com/p/pspacer/wiki/HTBon10GbE
>
> The accuracy is better than in the previous experiment:
> the difference drops from +810 Mbps to +430 Mbps.
> This is because the timer resolution improved from 1 usec to 1/64 usec.
> However, it is still not perfect.
>
Oops, not 1/64 usec but 1/16 usec.
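
To illustrate why the timer resolution matters here, the following is a
minimal sketch of the quantization effect. It is not the HTB code path and
does not reproduce the numbers on the wiki page. It assumes the per-packet
transmission time is truncated to a whole number of timer ticks, and the
function name and the 1500-byte packet size are made up for the example.

```python
def quantized_rate_bps(rate_bps, packet_bytes, ticks_per_usec):
    """Effective rate when the per-packet time is truncated to whole ticks.

    Assumption (not taken from the kernel source): the shaper stores the
    time to send one packet as an integer tick count, rounded down.
    """
    packet_bits = packet_bytes * 8
    ticks_per_sec = ticks_per_usec * 1_000_000
    # Ideal time in ticks, truncated to an integer.
    ticks = (packet_bits * ticks_per_sec) // rate_bps
    if ticks == 0:
        return float("inf")  # packet time is below one tick: no shaping at all
    return packet_bits * ticks_per_sec / ticks


# Hypothetical configurations: (target rate, ticks per microsecond)
for rate, res in [(10**9, 1), (10**10, 1), (10**10, 16)]:
    eff = quantized_rate_bps(rate, 1500, res)
    print(f"target {rate / 1e9:.0f} Gbit/s, {res} tick(s)/usec: "
          f"effective {eff / 1e9:.3f} Gbit/s")
```

Under these assumptions, a 1500-byte packet takes 12 usec at 1 Gbit/s,
which a 1 usec tick represents exactly. At 10 Gbit/s it takes 1.2 usec,
which truncates to 1 tick (an effective 12 Gbit/s) at 1 usec resolution
and to 19 ticks (about 10.1 Gbit/s) at 1/16 usec resolution.
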
> Best regards,
> Ryousei Takano
>
>
> On Tue, Nov 3, 2009 at 5:53 AM, Stephen Hemminger <shemminger@...tta.com> wrote:
>> On Mon, 02 Nov 2009 16:43:42 +0100
>> Patrick McHardy <kaber@...sh.net> wrote:
>>
>>> Ryousei Takano wrote:
>>> > Hi Stephen and all,
>>> >
>>> > I have observed an HTB accuracy problem with Linux kernel 2.6.30 and
>>> > the Myri-10G 10 GbE NIC.
>>> > HTB can control the transmission rate at Gigabit speeds, but it does
>>> > not work well at 10 Gigabit speeds.
>>> >
>>> > I asked Stephen about this problem at the Japan Linux Symposium. He
>>> > mentioned an HTB bug related to the timer granularity.
>>> > I want to know what is happening, and what should be done to fix it.
>>> >
>>> > Any comments and suggestions will be welcome.
>>> >
>>> > For more detail, please see the following page:
>>> > http://code.google.com/p/pspacer/wiki/HTBon10GbE
>>>
>>> This is not an easy problem to fix. Userspace, the kernel and the
>>> netlink API use 32 bits for timing-related values, which is too small
>>> for anything finer than microsecond resolution. All of them need to be
>>> converted to use bigger types; additionally, some kind of compatibility
>>> handling to deal with old iproute versions still using microsecond
>>> resolution is required.
>>
>> The existing API is a legacy mish-mash. The field is limited to 32 bits,
>> but it might be possible to use a finer scale.
>>
>> Maybe if the kernel advertised a finer resolution through
>> /proc/net/psched, then the table could be finer grained. This would
>> maintain compatibility between kernel and user space. You would need
>> both a new kernel and a new iproute to get nanosecond resolution, but
>> older combinations would still work.
>>
>> The downside is that with nanosecond resolution the time per packet is
>> upper bounded at about 4.3 seconds (2^32 ns), which limits how low a
>> rate can be expressed.
>>
>>
>
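
The range limit of a 32-bit time field mentioned above can be checked with
a few lines of arithmetic. This is only a sketch: it assumes an unsigned
32-bit tick counter, and the helper names and the 1500-byte packet size
are hypothetical.

```python
UINT32_MAX = 2**32 - 1  # largest value of an unsigned 32-bit field


def max_interval_seconds(tick_ns):
    """Longest per-packet time a 32-bit tick count can hold."""
    return UINT32_MAX * tick_ns / 1e9


def min_rate_bps(packet_bytes, tick_ns):
    """Slowest rate whose per-packet time still fits in 32 bits."""
    return packet_bytes * 8 / max_interval_seconds(tick_ns)


# Compare microsecond ticks (1000 ns) with nanosecond ticks (1 ns).
for tick_ns in (1000, 1):
    print(f"tick = {tick_ns} ns: "
          f"max interval {max_interval_seconds(tick_ns):.3f} s, "
          f"min rate for 1500 bytes {min_rate_bps(1500, tick_ns):.2f} bit/s")
```

With 1 ns ticks the longest representable interval is about 4.29 seconds,
so a 1500-byte packet cannot be paced below roughly 2.8 kbit/s. With 1 usec
ticks the same field covers about 71 minutes.
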