Date:	Wed, 25 Mar 2015 23:16:54 +0100
From:	Vlastimil Babka <vbabka@...e.cz>
To:	Gioh Kim <gioh.kim@....com>, akpm@...ux-foundation.org,
	mgorman@...e.de, riel@...hat.com, hannes@...xchg.org,
	rientjes@...gle.com, vdavydov@...allels.com, iamjoonsoo.kim@....com
CC:	linux-mm@...ck.org, linux-kernel@...r.kernel.org, gunho.lee@....com
Subject: Re: [RFCv2] mm: page allocation for less fragmentation

On 25.3.2015 3:39, Gioh Kim wrote:
> My driver allocates more than 40MB pages via alloc_page() at a time and
> maps them at virtual address. Totally it uses 300~400MB pages.
> 
> If I run a heavy load test for a few days in 1GB memory system, I cannot allocate even order=3 pages
> because-of the external fragmentation.
> 
> I thought I needed a anti-fragmentation solution for my driver.
> But there is no allocation function that considers fragmentation.
> The compaction is not helpful because it is only for movable pages, not unmovable pages.
> 
> This patch proposes a allocation function allocates only pages in the same pageblock.
> 
> I tested this patch like following:
> 
> 1. When the driver allocates about 400MB and do "cat /proc/pagetypeinfo;cat /proc/buddyinfo"
> 
> Free pages count per migrate type at order       0      1      2      3      4      5      6      7      8      9     10
> Node    0, zone   Normal, type    Unmovable   3864    728    394    216    129     47     18      9      1      0      0
> Node    0, zone   Normal, type  Reclaimable    902     96     68     17      3      0      1      0      0      0      0
> Node    0, zone   Normal, type      Movable   5146    663    178     91     43     16      4      0      0      0      0
> Node    0, zone   Normal, type      Reserve      1      4      6      6      2      1      1      1      0      1      1
> Node    0, zone   Normal, type          CMA      0      0      0      0      0      0      0      0      0      0      0
> Node    0, zone   Normal, type      Isolate      0      0      0      0      0      0      0      0      0      0      0
> 
> Number of blocks type     Unmovable  Reclaimable      Movable      Reserve          CMA      Isolate
> Node 0, zone   Normal          135            3          124            2            0            0
> Node 0, zone   Normal   9880   1489    647    332    177     64     24     10      1      1      1
> 
> 2. The driver frees all pages and allocates pages again with alloc_pages_compact.

This is not a good test setup. You shouldn't switch allocation types within a
single system boot. You should compare results from a boot where the common
allocation is used with results from a boot where your new allocation is used.

> This is a kind of compaction of the driver.
> Following is the result of "cat /proc/pagetypeinfo;cat /proc/buddyinfo"
> 
> Free pages count per migrate type at order       0      1      2      3      4      5      6      7      8      9     10
> Node    0, zone   Normal, type    Unmovable      8      5      1    432    272     91     37     11      1      0      0
> Node    0, zone   Normal, type  Reclaimable    901     96     68     17      3      0      1      0      0      0      0
> Node    0, zone   Normal, type      Movable   4790    776    192     91     43     16      4      0      0      0      0
> Node    0, zone   Normal, type      Reserve      1      4      6      6      2      1      1      1      0      1      1
> Node    0, zone   Normal, type          CMA      0      0      0      0      0      0      0      0      0      0      0
> Node    0, zone   Normal, type      Isolate      0      0      0      0      0      0      0      0      0      0      0
> 
> Number of blocks type     Unmovable  Reclaimable      Movable      Reserve          CMA      Isolate
> Node 0, zone   Normal          135            3          124            2            0            0
> Node 0, zone   Normal   5693    877    266    544    320    108     43     12      1      1      1

The number of unmovable pageblocks didn't change here. The stats for free
unmovable pages do look better for higher orders than in the first listing
above, but even the common allocation logic would give you that result if you
allocated your 400 MB using (many) order-0 allocations (since you apparently
don't care about physically contiguous memory). That would also prefer order-0
free pages before splitting higher orders. So this doesn't demonstrate benefits
of the alloc_pages_compact() approach, I'm afraid. The results suggest that the
system was in a worse state when the first allocation happened, and meanwhile
some pages were freed, creating the large numbers of order-0 unmovable free
pages. Or maybe the system got fragmented in the first allocation because your
driver tries to allocate the memory with high-order allocations before falling
back to lower orders? That would probably defeat the natural anti-fragmentation
of the buddy system.
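
To illustrate the "prefer order-0 free pages before splitting higher orders"
point, here is a toy model of the per-order free lists. It is only a sketch:
alloc_order0() is a made-up name, and the model ignores migrate types,
pageblock fallback, per-cpu lists and freeing, so it is not the kernel's
actual allocator code.

```python
def alloc_order0(free, n):
    """Toy model: take n order-0 pages from per-order free block counts.

    free[k] is the number of free blocks of order k (as in one row of
    /proc/pagetypeinfo). Returns the updated list of counts.
    """
    free = list(free)
    for _ in range(n):
        # Use the smallest order that still has a free block.
        order = next((o for o, c in enumerate(free) if c > 0), None)
        if order is None:
            raise MemoryError("no free pages left")
        free[order] -= 1
        # Splitting an order-k block down to order 0 leaves one free
        # buddy behind at each of the orders k-1 ... 0.
        for o in range(order):
            free[o] += 1
    return free


# Unmovable row from the first listing above.
unmovable = [3864, 728, 394, 216, 129, 47, 18, 9, 1, 0, 0]
after = alloc_order0(unmovable, 3864)
print(after[3:] == unmovable[3:])  # higher orders are untouched
```

In this model the order-3 and larger counts only start to drop once all the
smaller free blocks are used up, which is why many order-0 requests tend to
leave the higher orders alone.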

So a proper test could be based on this:

> If I run a heavy load test for a few days in 1GB memory system, I cannot allocate even order=3 pages
> because-of the external fragmentation.

With this patch, is the situation quantifiably better? Can you post the
pagetypeinfo/buddyinfo for a system boot where all driver allocations use the
common allocator, and for a system boot with the patch? That should be
comparable if the workload is the same for both boots.
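
One possible way to put a number on "quantifiably better" is to sum the free
pages available at order 3 and above from a /proc/buddyinfo line captured in
each boot. The sketch below assumes the line format shown in the listings
above; the function names are made up for illustration, and the sample line
is reused only to show the format, not as a valid measurement.

```python
def parse_buddyinfo_line(line):
    """Split one /proc/buddyinfo line into (node, zone, per-order counts)."""
    fields = line.split()
    # Expected layout: Node <n>, zone <name> <count order 0> <count order 1> ...
    node = int(fields[1].rstrip(","))
    zone = fields[3]
    counts = [int(x) for x in fields[4:]]
    return node, zone, counts


def free_pages_at_or_above(counts, min_order):
    """Number of base pages held in free blocks of order >= min_order."""
    return sum(c << o for o, c in enumerate(counts) if o >= min_order)


sample = "Node 0, zone   Normal   9880   1489    647    332    177     64     24     10      1      1      1"
node, zone, counts = parse_buddyinfo_line(sample)
print(node, zone, free_pages_at_or_above(counts, 3))
```

Running the same calculation on a snapshot from each boot, taken at the same
point in the same workload, would give two directly comparable figures.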