linux-kernel - Re: 2.6.21-rc3-mm1

Date:	Thu, 15 Mar 2007 10:16:25 +0000
From:	mel@...net.ie (Mel Gorman)
To:	Andrew Morton <akpm@...ux-foundation.org>
Cc:	Mariusz Kozlowski <m.kozlowski@...land.pl>,
	linux-kernel@...r.kernel.org
Subject: Re: 2.6.21-rc3-mm1

On (14/03/07 17:07), Andrew Morton didst pronounce:
> > On Wed, 14 Mar 2007 20:06:02 +0100 Mariusz Kozlowski <m.kozlowski@...land.pl> wrote:
> > Hello,
> > 
> > 	Today after +- 24h of uptime I found some more page allocation
> > failures ('eth1: Can't allocate skb for Rx'). You'll find more here:
> > 
> > http://tuxland.pl/misc/2.6.21-rc3-mm1-page-allocation-failure.txt
> > 
> > System wasn't doing anything unusual, as usual ;-) X, some p2p 
> > software, firefox+flash playing music.
> > 
> 
> Do other kernels do this, or is 2.6.21-rc3-mm1 worse?
> 
> It is of course a non-fatal problem and will inevitably happen sometimes,
> but we would like the VM to be able to minimise the occurrence of this
> problem.
> 

I'm looking at this now. In the vanilla allocator, min_free_kbytes has the
effect of keeping largest blocks free for as long as possible. This means
that if high-order allocations are rare, they'll tend to succeed up to a point.

In the case of the earlier report on HID failing the order-5 allocation, we
didn't reclaim for long enough because the order was too high. It
would need to reclaim for quite a long time before it would get an order-5
block free. That is what Andy's lumpy-reclaim patches address - reclaiming
intelligently when we know it has a chance of succeeding instead of giving up.

> I think we were rather hoping that Mel's anti-fragmentation work would
> improve things.

I'm still optimistic it will. This is the widest it's been tested so far
so there were going to be cases like this.

In this case, the allocation is high-order but GFP_ATOMIC. These allocations
get grouped together, but with a low min_free_kbytes, as is the case on this
machine, the high-order atomic blocks eventually get used by others. I had
assumed that if high-order atomic allocations were important (e.g. e1000),
the machine would be configured with min_free_kbytes of at least 16384, i.e.
1 MAX_ORDER_NR_PAGES per MIGRATE_TYPE.
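
As a rough check of where 16384 comes from, the arithmetic can be sketched
as below. The page size (4 KiB), MAX_ORDER (11) and the number of migrate
types (4) are assumptions about this configuration; none of them are stated
in this mail.

```python
# Assumed values; the mail itself gives only the 16384 result.
PAGE_SIZE_KB = 4      # assumed 4 KiB pages
MAX_ORDER = 11        # assumed; largest buddy block is order MAX_ORDER - 1
MIGRATE_TYPES = 4     # assumed number of mobility groups

# Pages in one largest-order block.
MAX_ORDER_NR_PAGES = 1 << (MAX_ORDER - 1)

# Size of one such block in KiB, then one block per migrate type.
block_kb = MAX_ORDER_NR_PAGES * PAGE_SIZE_KB
min_free_kbytes = block_kb * MIGRATE_TYPES

print(MAX_ORDER_NR_PAGES, block_kb, min_free_kbytes)  # 1024 4096 16384
```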

Mariusz, I would be interested in finding out if this problem still occurs when
you set min_free_kbytes to 16384 via /proc/sys/vm/min_free_kbytes. I understand
that the problem is not easily reproduced and requiring configuration changes
is far from ideal but it'd allow me to find out if options 2 or 3 below make
sense in advance.
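
A minimal sketch of checking and raising the value from a script follows.
The helper names are hypothetical, the path argument exists only so the
functions can be tried against an ordinary file, and writing to the real
/proc file needs root.

```python
from pathlib import Path

MIN_FREE_PATH = Path("/proc/sys/vm/min_free_kbytes")
SUGGESTED_KB = 16384  # value suggested in this mail


def read_min_free_kbytes(path=MIN_FREE_PATH):
    """Return the current min_free_kbytes setting in KiB."""
    return int(Path(path).read_text().strip())


def raise_min_free_kbytes(target_kb=SUGGESTED_KB, path=MIN_FREE_PATH):
    """Raise the setting to target_kb if it is currently lower.

    Returns True if the value was changed, False if it was already
    at least target_kb. Never lowers an existing higher value.
    """
    if read_min_free_kbytes(path) >= target_kb:
        return False
    Path(path).write_text(f"{target_kb}\n")
    return True
```

Calling read_min_free_kbytes() with no arguments reports the running
system's value. The setting does not persist across reboots unless it is
also added to the system's sysctl configuration.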

I'm looking at three ways of addressing this:

1. Anti-fragmentation currently favours breaking larger blocks to
   group-by-mobility as opposed to the vanilla allocator which always favours
   using the smallest possible block no matter where it is.  I'm looking
   at preserving the vanilla allocator's behaviour of keeping larger
   blocks free as much as possible to see how adverse an effect that
   subsequently has on fragmentation.

2. Set min_free_kbytes higher automatically at boot time when
   CONFIG_PAGE_GROUP_BY_MOBILITY is set.

3. Disable PAGE_GROUP_BY_MOBILITY when min_free_kbytes is set too low
   and print KERN_INFO messages when it's disabled or enabled based on
   min_free_kbytes.
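
The difference between the two policies in option 1 can be illustrated with
a toy model. This is only a sketch, not the kernel's allocator: the free
lists are plain counters, MAX_ORDER is shrunk to 4, and the function names
and the "steal the largest block" fallback are simplifying assumptions.

```python
from collections import defaultdict

MAX_ORDER = 4  # toy value: orders 0..3


def make_free_lists(blocks):
    """Build per-type free-block counters from (mobility_type, order) pairs."""
    free = defaultdict(lambda: [0] * MAX_ORDER)
    for mtype, order in blocks:
        free[mtype][order] += 1
    return free


def _take_and_split(free, src, dst, have, want):
    """Take an order-`have` block from `src` to satisfy an order-`want` request.

    The unused buddies (one each of orders want..have-1) go on `dst`'s lists.
    """
    free[src][have] -= 1
    for order in range(want, have):
        free[dst][order] += 1


def alloc_vanilla(free, mtype, want):
    """Vanilla-style: the smallest sufficient block wins, on whichever list."""
    for order in range(want, MAX_ORDER):
        for src in list(free):
            if free[src][order]:
                _take_and_split(free, src, src, order, want)
                return order
    return None


def alloc_grouped(free, mtype, want):
    """Grouping-style: prefer own type, else break another type's largest block."""
    for order in range(want, MAX_ORDER):
        if free[mtype][order]:
            _take_and_split(free, mtype, mtype, order, want)
            return order
    for order in range(MAX_ORDER - 1, want - 1, -1):
        for src in list(free):
            if src != mtype and free[src][order]:
                _take_and_split(free, src, mtype, order, want)
                return order
    return None


def largest_free_order(free):
    """Return the highest order that still has a free block, or None."""
    orders = [o for lists in free.values() for o, n in enumerate(lists) if n]
    return max(orders, default=None)


# One order-0 and one order-3 block free, both movable; request an
# unmovable order-0 page under each policy.
start = [("movable", 0), ("movable", 3)]
a = make_free_lists(start)
alloc_vanilla(a, "unmovable", 0)
b = make_free_lists(start)
alloc_grouped(b, "unmovable", 0)
print(largest_free_order(a), largest_free_order(b))  # 3 2
```

In the toy run the vanilla policy leaves the order-3 block intact, while the
grouping policy breaks it to keep unmovable pages together, which is the
trade-off option 1 would revisit.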

I'm testing patches for 1 at the moment even though I think it will
lower success rates for superpage allocations because of increased
fragmentation. Benchmarking will tell me for sure.

Thanks Mariusz for the report.

-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab