linux-kernel - Re: [PATCH] fix page_alloc for larger I/O segments (improved)

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20071214174236.GA28613@csn.ul.ie>
Date:	Fri, 14 Dec 2007 17:42:37 +0000
From:	Mel Gorman <mel@....ul.ie>
To:	Mark Lord <liml@....ca>
Cc:	Andrew Morton <akpm@...ux-foundation.org>,
	James.Bottomley@...senPartnership.com, jens.axboe@...cle.com,
	lkml@....ca, matthew@....cx, linux-ide@...r.kernel.org,
	linux-kernel@...r.kernel.org, linux-scsi@...r.kernel.org,
	linux-mm@...ck.org
Subject: Re: [PATCH] fix page_alloc for larger I/O segments (improved)

On (13/12/07 19:46), Mark Lord didst pronounce:
> 
> "Improved version", more similar to the 2.6.23 code:
> 
> Fix page allocator to give better chance of larger contiguous segments 
> (again).
> 
> Signed-off-by: Mark Lord <mlord@...ox.com

Regrettably this interferes with anti-fragmentation because the "next" page
on the list on return from rmqueue_bulk is not guaranteed to be of the right
mobility type. I fixed it as an additional patch but it adds additional cost
that should not be necessary and it's visible in microbenchmark results on
at least one machine.

The following patch should fix the page ordering problem without incurring an
additional cost or interfering with anti-fragmentation. However, I haven't
anything in place yet to verify that the physical page ordering is correct
but it makes sense. Can you verify it fixes the problem please?

It'll still be some time before I have a full set of performance results
but initially at least, this fix seems to avoid any impact.

======
Subject: Fix page allocation for larger I/O segments

In some cases the IO subsystem is able to merge requests if the pages are
adjacent in physical memory. This was achieved in the allocator by having
expand() return pages in physically contiguous order in situations were
a large buddy was split. However, list-based anti-fragmentation changed
the order pages were returned in to avoid searching in buffered_rmqueue()
for a page of the appropriate migrate type.

This patch restores behaviour of rmqueue_bulk() preserving the physical order
of pages returned by the allocator without incurring increased search costs for
anti-fragmentation.

Signed-off-by: Mel Gorman <mel@....ul.ie>
--- 
 page_alloc.c |   11 +++++++++++
 1 file changed, 11 insertions(+)

diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.24-rc5-clean/mm/page_alloc.c linux-2.6.24-rc5-giveback-physorder-listmove/mm/page_alloc.c
--- linux-2.6.24-rc5-clean/mm/page_alloc.c	2007-12-14 11:55:13.000000000 +0000
+++ linux-2.6.24-rc5-giveback-physorder-listmove/mm/page_alloc.c	2007-12-14 15:33:12.000000000 +0000
@@ -847,8 +847,19 @@ static int rmqueue_bulk(struct zone *zon
 		struct page *page = __rmqueue(zone, order, migratetype);
 		if (unlikely(page == NULL))
 			break;
+
+		/*
+		 * Split buddy pages returned by expand() are received here
+		 * in physical page order. The page is added to the callers and
+		 * list and the list head then moves forward. From the callers
+		 * perspective, the linked list is ordered by page number in
+		 * some conditions. This is useful for IO devices that can
+		 * merge IO requests if the physical pages are ordered
+		 * properly.
+		 */
 		list_add(&page->lru, list);
 		set_page_private(page, migratetype);
+		list = &page->lru;
 	}
 	spin_unlock(&zone->lock);
 	return i;
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/