Message-ID: <AANLkTikwnRPzGE2+Kw=4+TbWm7JknM4qPdz9BDCUuSnA@mail.gmail.com>
Date: Wed, 29 Sep 2010 13:33:39 +0200
From: Miguel Ojeda <miguel.ojeda.sandonis@...il.com>
To: tmhikaru@...il.com
Cc: Florian Mickler <florian@...kler.org>,
linux-kernel@...r.kernel.org, Greg KH <gregkh@...e.de>
Subject: Re: Linux 2.6.35.6
On Wed, Sep 29, 2010 at 1:02 PM, <tmhikaru@...il.com> wrote:
> On Wed, Sep 29, 2010 at 09:29:24AM +0200, Florian Mickler wrote:
>> Do you know which load average conky is showing you? If I
>> type 'uptime' on a console, I get three load numbers: the 1-minute,
>> 5-minute and 15-minute averages.
>> If there is a systematic bias it should be visible in the
>> 15-minute average. If there are only bursts of load it should be
>> visible in the 1-minute average.
>
> It is giving the same averages that uptime does, in the same format, and
> the problem is consistent: the load stays high on all three averages on the
> kernels that do not work properly, and eventually drops to zero if I leave
> the machine alone long enough on kernels that do work properly. When I
> discovered X was somehow part of the problem, it was because I was testing
> in X with mrxvt running bash and uptime, and at the console without X using
> bash with uptime. uptime consistently gives the same numbers that conky
> does, so I don't think I need to worry about conky confusing the issue.
>
>>
>> But it doesn't really matter for now what kind of load disturbance you
>> are seeing, because you actually have a better way to distinguish a good
>> kernel from a bad one:
>
> You may think a timed kernel compile is a better way to determine whether
> there is a fault with the kernel, but it takes my machine around two hours
> (WITH ccache) to build my kernel. Since ccache speeds up the builds
> dramatically and would give misleading timings if I compiled the exact same
> kernel source twice, I'd have to disable it for the test to be worthwhile,
> so a build would take even *longer* than normal. This is not something I'm
> willing to use as a 'better' test - especially since the loadavg numbers
> are consistently high on a bad kernel and consistently zero, or very close
> to it, on a good one.
>
> Here's an uptime sample from a working version:
>
> 06:20:31 up 21 min, 4 users, load average: 0.00, 0.02, 0.06
>
> I've been typing up this email while waiting for the load to settle after
> the initial boot. I think it's pretty obvious here that this one is working
> properly, so I'm going to mark it 'git bisect good'...
>
> Bisecting: 27 revisions left to test after this (roughly 5 steps)
>
> I'm getting fairly close at least.
>
> Here's an uptime output from a version of the kernel that was NOT working
> properly, 2.6.35.6:
> 14:30:12 up 3:46, 4 users, load average: 0.85, 0.93, 0.89
>
> And it probably doesn't give you any useful information, but here's 2.6.35.1's
> reaction to building 2.6.35:
> 22:01:22 up 15 min, 4 users, load average: 1.84, 1.38, 0.83
>
> whereas on a working kernel this is what the load average looks like when
> building a kernel:
> 06:33:13 up 34 min, 4 users, load average: 1.01, 0.92, 0.52
>
> This is not a multiprocessor or multicore system; it's an Athlon XP 2800
> with 1.5GB of RAM. Before the question is asked: no, I'm not being silly
> and using make -j2.
>
> I think simply letting the machine idle is just as good a test for
> determining whether any particular kernel is good or bad, since the
> readings are like night and day. I only brought up that the timed kernel
> builds were taking longer on the kernel with the higher load average because
> it meant that this isn't simply a broken statistic giving false readings;
> something *is* wrong, and I can't simply ignore it.
>
> It's taken me several days to bisect this far. If Greg insists, I'll restart
> the bisection from scratch using a kernel compile as the test, but I implore
> you not to ask me to do so; it will more than likely give me the same
> results I'm getting now for more than double the time invested.
>
>> Yes, the sample rate was one of the things I wanted to know, but also which of
>> the 3 load figures you were graphing.
> To be honest, I actually don't know. I'm *terrible* at regex; this is what
> the bash script is doing:
>
> cat /proc/loadavg | perl -p -e 's/^([^ ]+) .+$/$1/'
>
> If you can explain what that's doing, I'd appreciate it. If it's not to your
> liking, I can change it to something else.
That regex keeps only the first whitespace-separated field of /proc/loadavg,
so you are graphing the load average over the last minute.
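
If you ever want something simpler than the perl one-liner, either of these
should print the same number (just a sketch of equivalents, not tested on
your setup):

cut -d' ' -f1 /proc/loadavg
awk '{print $1}' /proc/loadavg

For reference, the fields in /proc/loadavg are the 1-minute, 5-minute and
15-minute load averages, the runnable/total scheduling entities, and the PID
of the most recently created process.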
>
>
> Tim McGrath
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/