Message-ID: <20110721164238.GA3326@barrios-desktop>
Date:	Fri, 22 Jul 2011 01:42:38 +0900
From:	Minchan Kim <minchan.kim@...il.com>
To:	Andrew Lutomirski <luto@....edu>
Cc:	Mel Gorman <mgorman@...e.de>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Pádraig Brady <P@...igbrady.com>,
	James Bottomley <James.Bottomley@...senpartnership.com>,
	Colin King <colin.king@...onical.com>,
	Rik van Riel <riel@...hat.com>,
	Johannes Weiner <hannes@...xchg.org>,
	linux-mm <linux-mm@...ck.org>,
	linux-kernel <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 0/4] Stop kswapd consuming 100% CPU when highest zone is
 small

On Thu, Jul 21, 2011 at 12:36:11PM -0400, Andrew Lutomirski wrote:
> On Thu, Jul 21, 2011 at 12:24 PM, Minchan Kim <minchan.kim@...il.com> wrote:
> > On Thu, Jul 21, 2011 at 05:09:59PM +0100, Mel Gorman wrote:
> >> On Fri, Jul 22, 2011 at 12:37:22AM +0900, Minchan Kim wrote:
> >> > On Fri, Jun 24, 2011 at 03:44:53PM +0100, Mel Gorman wrote:
> >> > > (Built this time and passed a basic sniff-test.)
> >> > >
> >> > > During allocator-intensive workloads, kswapd will be woken frequently
> >> > > causing free memory to oscillate between the high and min watermark.
> >> > > This is expected behaviour.  Unfortunately, if the highest zone is
> >> > > small, a problem occurs.
> >> > >
> >> > > This seems to happen most with recent Sandy Bridge laptops but it's
> >> > > probably a coincidence as some of these laptops just happen to have
> >> > > a small Normal zone. The reproduction case is almost always copying
> >> > > large files, during which kswapd pegs at 100% CPU until the file is
> >> > > deleted or cache is dropped.
> >> > >
> >> > > The problem is mostly down to sleeping_prematurely() keeping kswapd
> >> > > awake when the highest zone is small and unreclaimable and compounded
> >> > > by the fact we shrink slabs even when not shrinking zones causing a lot
> >> > > of time to be spent in shrinkers and a lot of memory to be reclaimed.
> >> > >
> >> > > Patch 1 corrects sleeping_prematurely to check the zones matching
> >> > >   the classzone_idx instead of all zones.
> >> > >
> >> > > Patch 2 avoids shrinking slab when we are not shrinking a zone.
> >> > >
> >> > > Patch 3 notes that sleeping_prematurely is checking lower zones against
> >> > >   a high classzone, which is not what allocators or balance_pgdat()
> >> > >   are doing, leading to an artificial belief that kswapd should
> >> > >   still be awake.
> >> > >
> >> > > Patch 4 notes that when balance_pgdat() gives up on a high zone that the
> >> > >   decision is not communicated to sleeping_prematurely()
> >> > >
> >> > > This problem affects 2.6.38.8 for certain and is expected to affect
> >> > > 2.6.39 and 3.0-rc4 as well. If accepted, they need to go to -stable
> >> > > to be picked up by distros and this series is against 3.0-rc4. I've
> >> > > cc'd people that reported similar problems recently to see if they
> >> > > still suffer from the problem and if this fixes it.
> >> > >
> >> >
> >> > Good!
> >> > This patch solved the problem.
> >> > But there is still a mystery.
> >> >
> >> > In the log, we could see excessive shrink_slab() calls.
> >>
> >> Yes, because shrink_slab() was called on each loop through
> >> balance_pgdat() even if the zone was balanced.
> >>
> >>
> >> > And as you know, we merged a patch that adds cond_resched() at the
> >> > end of shrink_slab(). So other tasks should get the CPU and we should
> >> > not see 100% CPU usage by kswapd, I think.
> >> >
> >>
> >> cond_resched() is not a substitute for going to sleep.
> >
> > Of course, it's not the same as sleeping, but other tasks should get
> > the CPU and consume their time slices.
> > So we should never see 100% CPU consumption by kswapd.
> > No?
> 
> If the rest of the system is idle, then kswapd will happily use 100%
> CPU.  (Or on a multi-core system, kswapd will use close to 100% of one

Of course. But we are at least running a test program, so I don't think the system is idle.

> CPU even if another task is using the other one.  This is bad enough
> on a desktop, but on a laptop you start to notice when your battery

Of course it's bad. :)
What I want to know is the exact cause of the 100% CPU usage.
It might not really be 100%; we may just be using the figure loosely.
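
To make the yield-versus-sleep distinction concrete, here is a rough
userspace toy model. It is not kernel code. The names idle_policy and
looper_cpu_percent() are made up for illustration, and the scheduler is
assumed to be a plain round-robin over the runnable tasks on one CPU,
which the real scheduler is not. It only shows that a task which yields
still gets every tick when nothing else is runnable on its CPU, while a
sleeping task gets none.

```c
#include <assert.h>

/* Hypothetical policies for a task that has no useful work left. */
enum idle_policy {
	POLICY_YIELD,	/* keep looping, offer the CPU on each pass */
	POLICY_SLEEP	/* leave the run queue until woken */
};

/*
 * Toy model of one CPU over a number of scheduler ticks.
 *
 * other_runnable is the number of other runnable tasks on the same CPU.
 * Assumption: ticks are handed out in strict round-robin order.
 * Returns the percentage of ticks consumed by the looping task.
 */
int looper_cpu_percent(enum idle_policy policy, int other_runnable, int ticks)
{
	int looper_ticks = 0;
	int turn = 0;	/* round-robin slot; slot 0 is the looping task */
	int t;

	assert(other_runnable >= 0 && ticks > 0);

	for (t = 0; t < ticks; t++) {
		if (policy == POLICY_SLEEP)
			continue;	/* asleep: tick goes to others or idle */
		if (turn == 0)
			looper_ticks++;
		turn = (turn + 1) % (other_runnable + 1);
	}
	return looper_ticks * 100 / ticks;
}
```

In this model, looper_cpu_percent(POLICY_YIELD, 0, 1000) gives 100,
one competing task on the same CPU brings it down to 50, and
POLICY_SLEEP gives 0 regardless of what else is runnable. A task
running on another CPU does not count as a competitor here, which
matches the multi-core case described in the quoted reply.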

> dies.)
> 
> --Andy
> 
> >
> >>
> >> --
> >> Mel Gorman
> >> SUSE Labs
> >
> > --
> > Kind regards,
> > Minchan Kim
> >

-- 
Kind regards,
Minchan Kim