Message-ID: <46541DC4.4090501@trash.net>
Date: Wed, 23 May 2007 12:56:04 +0200
From: Patrick McHardy <kaber@...sh.net>
To: Ingo Molnar <mingo@...e.hu>
CC: Anant Nitya <kernel@...chanda.info>, linux-kernel@...r.kernel.org,
Linus Torvalds <torvalds@...ux-foundation.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Thomas Gleixner <tglx@...utronix.de>,
"David S. Miller" <davem@...emloft.net>,
Linux Netdev List <netdev@...r.kernel.org>,
Herbert Xu <herbert@...dor.apana.org.au>
Subject: Re: bad networking related lag in v2.6.22-rc2
Ingo Molnar wrote:
> if you feel inclined to try the git-bisection then by all means please
> do it (it will certainly be helpful and educative), but it's optional: i
> dont think you should 'need' to go through extra debugging chores, my
> analysis based on the excellent trace you provided still holds and
> whoever modified htb_dequeue()'s logic recently ought to be able to
> figure that out (or send you a debug patch to further narrow the problem
> down).
>
> The trace shows a _clearly_ anomalous loop: for example there's 56396
> (!) calls to rb_first() in htb_dequeue() [without the kernel ever
> exiting that function]:
>
> earth4:~/s> grep rb_first trace-to-ingo.txt | wc -l
> 56396
How is this trace to be understood? Is it simply a call trace in
execution order? If that's the case, then we are exiting htb_dequeue:
each call to qdisc_watchdog_schedule happens at the very end of
that function, which would imply a bug in __qdisc_run.

Looking at the recent changes to __qdisc_run, this indeed seems
to be the case: when the qdisc is throttled and has packets queued,
we return a value != 0, causing __qdisc_run to loop until all
packets have been sent, which may take a long time.
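
To illustrate the suspected failure mode, here is a minimal user-space
sketch. It is a toy model, not kernel code and not the attached patch:
every name in it (toy_qdisc, toy_dequeue, toy_restart_buggy,
toy_restart_fixed, toy_run) is made up for illustration, and the fake
clock simply advances by one tick per dequeue attempt. It only assumes
what is described above: a throttled qdisc with a backlog reports a
non-zero value, so the run loop keeps calling it.

```c
/* Toy model only: none of these names are kernel APIs. */
struct toy_qdisc {
    int  qlen;             /* packets still queued */
    long now;              /* fake clock, +1 per dequeue attempt */
    long throttled_until;  /* dequeue yields nothing before this tick */
};

/* Returns 1 if a packet was handed out, 0 if throttled or empty. */
static int toy_dequeue(struct toy_qdisc *q)
{
    q->now++;
    if (q->qlen == 0 || q->now < q->throttled_until)
        return 0;
    q->qlen--;
    return 1;
}

/* Assumed buggy behaviour: report the backlog even though
 * nothing could be dequeued while throttled. */
static int toy_restart_buggy(struct toy_qdisc *q)
{
    toy_dequeue(q);
    return q->qlen;
}

/* Assumed fixed behaviour: report 0 when dequeue yields nothing,
 * so the caller stops and waits for the watchdog instead. */
static int toy_restart_fixed(struct toy_qdisc *q)
{
    if (!toy_dequeue(q))
        return 0;
    return q->qlen;
}

/* Stand-in for the run loop: keep restarting while the return
 * value is non-zero. Returns the number of restart calls made. */
static long toy_run(struct toy_qdisc *q,
                    int (*restart)(struct toy_qdisc *))
{
    long calls = 1;

    while (restart(q) != 0)
        calls++;
    return calls;
}
```

With 3 packets queued and throttled_until set to 1000, toy_run() with
toy_restart_buggy calls restart 1002 times before returning, spinning
through the whole throttle period. With toy_restart_fixed it returns
after a single call and leaves the 3 packets queued for later.
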
Anant, can you please verify by testing the attached patch? Thanks.
View attachment "x" of type "text/plain" (319 bytes)