Message-ID: <4D9D8FAA.9080405@suse.cz>
Date: Thu, 07 Apr 2011 12:19:22 +0200
From: Jiri Slaby <jslaby@...e.cz>
To: azurIt <azurit@...ox.sk>
CC: linux-kernel@...r.kernel.org, Changli Gao <xiaosuo@...il.com>,
Andrew Morton <akpm@...ux-foundation.org>, linux-mm@...ck.org,
Eric Dumazet <eric.dumazet@...il.com>,
linux-fsdevel@...r.kernel.org, Jiri Slaby <jirislaby@...il.com>
Subject: Re: Regression from 2.6.36
Cc'ed a few people.
Also, the series that introduced this was discussed at:
http://lkml.org/lkml/2010/5/3/53
On 04/07/2011 12:01 PM, azurIt wrote:
>
> I have finally completed bisection, here are the results:
>
>
>
> a892e2d7dcdfa6c76e60c50a8c7385c65587a2a6 is first bad commit
> commit a892e2d7dcdfa6c76e60c50a8c7385c65587a2a6
> Author: Changli Gao <xiaosuo@...il.com>
> Date: Tue Aug 10 18:01:35 2010 -0700
>
> vfs: use kmalloc() to allocate fdmem if possible
>
> Use kmalloc() to allocate fdmem if possible.
>
> vmalloc() is used as a fallback solution for fdmem allocation. A new
> helper function __free_fdtable() is introduced to reduce the lines of
> code.
>
> A potential bug, vfree() a memory allocated by kmalloc(), is fixed.
>
> [akpm@...ux-foundation.org: use __GFP_NOWARN, uninline alloc_fdmem() and free_fdmem()]
> Signed-off-by: Changli Gao <xiaosuo@...il.com>
> Cc: Alexander Viro <viro@...iv.linux.org.uk>
> Cc: Jiri Slaby <jslaby@...e.cz>
> Cc: "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
> Cc: Alexey Dobriyan <adobriyan@...il.com>
> Cc: Ingo Molnar <mingo@...e.hu>
> Cc: Peter Zijlstra <peterz@...radead.org>
> Cc: Avi Kivity <avi@...hat.com>
> Cc: Tetsuo Handa <penguin-kernel@...ove.sakura.ne.jp>
> Signed-off-by: Andrew Morton <akpm@...ux-foundation.org>
> Signed-off-by: Linus Torvalds <torvalds@...ux-foundation.org>
>
> :040000 040000 a7b3997bc754f573b4a309cda1a0774ea95c235e 4241a4f2115c60e5c1dc1879c85c9911fa077807 M fs
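
For readers following along, the pattern the commit message describes (try kmalloc() first, fall back to vmalloc(), and free with the matching routine) can be modelled in userspace roughly as below. This is only a sketch under stated assumptions, not the code from the commit: model_kmalloc(), model_vmalloc(), model_alloc_fdmem(), model_free_fdmem(), MODEL_PAGE_SIZE and MODEL_CONTIG_LIMIT are all hypothetical names, and malloc() stands in for both kernel allocators.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Assumed values for this model only; they are not kernel constants. */
#define MODEL_PAGE_SIZE    4096UL
#define MODEL_CONTIG_LIMIT (8 * MODEL_PAGE_SIZE)

/* Records which allocator produced a block so it is freed the same way. */
enum model_backing { MODEL_NONE, MODEL_CONTIG, MODEL_VIRT };

struct model_fdmem {
	void *ptr;
	enum model_backing how;
};

/* Hypothetical stand-in for kmalloc(): refuses large requests, which
 * models a physically contiguous allocation that cannot be satisfied. */
static void *model_kmalloc(size_t size)
{
	if (size > MODEL_CONTIG_LIMIT)
		return NULL;
	return malloc(size);
}

/* Hypothetical stand-in for vmalloc(): accepts any size malloc() accepts. */
static void *model_vmalloc(size_t size)
{
	return malloc(size);
}

/* Try the contiguous allocator first and fall back to the virtual one. */
static struct model_fdmem model_alloc_fdmem(size_t size)
{
	struct model_fdmem m = { NULL, MODEL_NONE };

	m.ptr = model_kmalloc(size);
	if (m.ptr) {
		m.how = MODEL_CONTIG;
		return m;
	}

	m.ptr = model_vmalloc(size);
	if (m.ptr)
		m.how = MODEL_VIRT;
	return m;
}

/* Release a block and reset its descriptor. In the kernel the two cases
 * would need different free routines; here both map onto free(). */
static void model_free_fdmem(struct model_fdmem *m)
{
	switch (m->how) {
	case MODEL_CONTIG: /* kernel analogue: kfree() */
	case MODEL_VIRT:   /* kernel analogue: vfree() */
		free(m->ptr);
		break;
	case MODEL_NONE:
		break;
	}
	m->ptr = NULL;
	m->how = MODEL_NONE;
}
```

In this model a 1 KiB request is served by the contiguous path and a 64-page request falls through to the fallback. The point is that the first attempt is made for every request, whatever its size.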
>
>
>
>
>
>
> ______________________________________________________________
> > From: "Greg KH" <greg@...ah.com>
> > To: azurIt <azurit@...ox.sk>
> > Date: 17.03.2011 01:15
> > Subject: Re: Regression from 2.6.36
> >
> > CC: linux-kernel@...r.kernel.org
> >
> > On Tue, Mar 15, 2011 at 02:25:27PM +0100, azurIt wrote:
> >
> > Hi,
> >
> > we are successfully running several very busy web servers on 2.6.32.* and
> > a few days ago I decided to upgrade to 2.6.37 (mainly because of blkio cgroup).
> > I installed 2.6.37.2 on one of the servers and very strange things started to
> > happen with the Apache web server.
> >
> > We are using Apache with MPM-ITK ( http://mpm-itk.sesse.net/ ) so it is doing
> > lots of 'fork' and lots of 'setuid'. I have also noticed that the problem
> > happens only on very busy servers.
> >
> > Everything is OK when Apache is started, but as time passes, its 'root'
> > processes (Apache processes running as root) consume more and more CPU.
> > Finally, the whole server becomes very unstable and Apache must be restarted.
> > This repeats until the load on the web sites drops (usually around 22:00).
> > Sometimes a restart is needed after 3 hours, sometimes after only 1 hour (again,
> > depending on the load on the web sites). Here is a graph of CPU utilization
> > showing the problem (red color); Apache was restarted at 8:11 and 9:35:
> > http://watchdog.sk/lkml/cpu-problem.png
> >
> > Here is how it looks on htop:
> > http://watchdog.sk/lkml/htop.jpg
> >
> > And finally, here is how it looks with older kernels (yes, when I install an
> > older kernel, the problem is gone); notice also that I/O wait is much lower
> > and nicer (blue color):
> > http://watchdog.sk/lkml/cpu-ok.png
> >
> > I also ran strace on the Apache processes that were causing problems; here it is:
> > http://watchdog.sk/lkml/strace.txt
> >
> > I'm not 100% sure but I think that CPU was consumed on 'futex' lines.
> >
> > I tried several kernel versions and found that everything BEFORE 2.6.36 is
> > NOT affected and everything FROM 2.6.36 onward (inclusive) is affected.
> >
> > Versions which I tried and were NOT affected by this problem:
> > 2.6.32.*
> > 2.6.35.11
> >
> > Versions which I tried and were affected by this problem:
> > 2.6.36
> > 2.6.36.4
> > 2.6.37.2
> > 2.6.37.3
> > 2.6.38-rc8 (final version was not released yet)
> >
> > All tests were made on vanilla kernels on Debian Lenny with this config:
> > http://watchdog.sk/lkml/config
> >
> > Do you need any other information from me? I'm able to try other versions or
> > patches but, please, take into account that I have to do this on a _production_
> > server (I failed to reproduce it in a testing environment). Also, I'm able to try
> > only one kernel per day.
>
> Ick, one kernel per day might make this a bit difficult, but if there
> was any way you could use 'git bisect' to try to narrow this down to the
> patch that caused this problem, it would be great.
>
> You can mark 2.6.35 as working and 2.6.36 as bad and git will go from
> there and try to offer you different chances to find the problem.
>
> thanks,
>
> greg k-h
thanks,
--
js
suse labs