lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20130812022535.GA18832@bbox>
Date:	Mon, 12 Aug 2013 11:25:35 +0900
From:	Minchan Kim <minchan@...nel.org>
To:	Krzysztof Kozlowski <k.kozlowski@...sung.com>
Cc:	Seth Jennings <sjenning@...ux.vnet.ibm.com>, linux-mm@...ck.org,
	linux-kernel@...r.kernel.org,
	Andrew Morton <akpm@...ux-foundation.org>,
	Mel Gorman <mgorman@...e.de>,
	Bartlomiej Zolnierkiewicz <b.zolnierkie@...sung.com>,
	Marek Szyprowski <m.szyprowski@...sung.com>,
	Kyungmin Park <kyungmin.park@...sung.com>,
	Dave Hansen <dave.hansen@...el.com>, guz.fnst@...fujitsu.com,
	bcrl@...ck.org
Subject: Re: [RFC PATCH v2 0/4] mm: reclaim zbud pages on migration and
 compaction

Hello,

On Fri, Aug 09, 2013 at 12:22:16PM +0200, Krzysztof Kozlowski wrote:
> Hi,
> 
> Currently zbud pages are not movable and they cannot be allocated from CMA
> region. These patches try to address the problem by:

The zcache, zram and GUP pages for memory-hotplug and/or CMA are
in the same situation.

> 1. Adding a new form of reclaim of zbud pages.
> 2. Reclaiming zbud pages during migration and compaction.
> 3. Allocating zbud pages with __GFP_RECLAIMABLE flag.

So I'd like to solve it with a general approach.

Each subsystem or GUP caller that wants to pin pages for a long time should
create its own migration handler and register the page into the pin-page
control subsystem like this.

driver/foo.c

int foo_migrate(struct page *page, void *private);

static struct pin_page_owner foo_migrate = {
        .migrate = foo_migrate;
};

int foo_allocate()
{
        struct page *newpage = alloc_pages();
        set_pinned_page(newpage, &foo_migrate);
}

And in compaction.c, or wherever we want to move/reclaim the page, the
generic VM can ask the owner when it finds a pinned page.

mm/compaction.c

        if (PagePinned(page)) {
                struct pin_page_info *info = get_page_pin_info(page);
                info->migrate(page);
                
        }

The only hurdle for that is that we should introduce a new page flag, and
I believe if we all agree on this approach, we can find a solution at last.

What do you think?

From 9a4f652006b7d0c750933d738e1bd6f53754bcf6 Mon Sep 17 00:00:00 2001
From: Minchan Kim <minchan@...nel.org>
Date: Sun, 11 Aug 2013 00:31:57 +0900
Subject: [RFC] pin page control subsystem


Signed-off-by: Minchan Kim <minchan@...nel.org>
---
 mm/Makefile   |    2 +-
 mm/pin-page.c |  101 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 102 insertions(+), 1 deletion(-)
 create mode 100644 mm/pin-page.c

diff --git a/mm/Makefile b/mm/Makefile
index f008033..245c2f7 100644
--- a/mm/Makefile
+++ b/mm/Makefile
@@ -5,7 +5,7 @@
 mmu-y			:= nommu.o
 mmu-$(CONFIG_MMU)	:= fremap.o highmem.o madvise.o memory.o mincore.o \
 			   mlock.o mmap.o mprotect.o mremap.o msync.o rmap.o \
-			   vmalloc.o pagewalk.o pgtable-generic.o
+			   vmalloc.o pagewalk.o pgtable-generic.o pin-page.o
 
 ifdef CONFIG_CROSS_MEMORY_ATTACH
 mmu-$(CONFIG_MMU)	+= process_vm_access.o
diff --git a/mm/pin-page.c b/mm/pin-page.c
new file mode 100644
index 0000000..74b07f8
--- /dev/null
+++ b/mm/pin-page.c
@@ -0,0 +1,101 @@
+#include <linux/mm.h>
+#include <linux/slab.h>
+#include <linux/list.h>
+#include <linux/hashtable.h>
+
+#define PPAGE_HASH_BITS 10
+
+static DEFINE_SPINLOCK(hash_lock);
+/*
+ * Should consider what data structure we should use.
+ * It would be better to use a radix tree if we try to pin contiguous
+ * pages a lot, but if we pin scattered pages, it wouldn't be a good idea.
+ */
+static DEFINE_HASHTABLE(pin_page_hash, PPAGE_HASH_BITS);
+
+/*
+ * Each subsystem should provide its own page migration handler
+ */
+struct pin_page_owner {
+	int (*migrate)(struct page *page, void *private);
+};
+
+struct pin_page_info {
+	struct pin_page_owner *owner;
+	struct hlist_node hlist;
+
+	unsigned long pfn;
+	void *private;
+};
+
+/* TODO : Introduce new page flags */
+void SetPinnedPage(struct page *page)
+{
+
+}
+
+int PinnedPage(struct page *page)
+{
+	return 0;
+}
+
+/*
+ * GUP caller or subsystems which pin the page should call this function
+ * to register @page in pin-page control subsystem so that VM can ask us
+ * when it wants to migrate @page.
+ *  
+ * Each pinned page would have some private key to identify itself
+ * like custom-allocator-returned handle.
+ */
+int set_pinned_page(struct pin_page_owner *owner,
+			struct page *page, void *private)
+{
+	struct pin_page_info *pinfo = kmalloc(sizeof(pinfo), GFP_KERNEL);
+
+	INIT_HLIST_NODE(&pinfo->hlist);
+	pinfo->owner = owner;
+
+	pinfo->pfn = page_to_pfn(page);
+	pinfo->private = private;
+	
+	spin_lock(&hash_lock);
+	hash_add(pin_page_hash, &pinfo->hlist, pinfo->pfn);
+	spin_unlock(&hash_lock);
+
+	SetPinnedPage(page);
+	return 0;
+};
+
+struct pin_page_info *get_pin_page_info(struct page *page)
+{
+	struct pin_page_info *tmp;
+	unsigned long pfn = page_to_pfn(page);
+
+	spin_lock(&hash_lock);
+	hash_for_each_possible(pin_page_hash, tmp, hlist, pfn) {
+		if (tmp->pfn == pfn) {
+			spin_unlock(&hash_lock);
+			return tmp;
+		}
+	}
+	spin_unlock(&hash_lock);
+	return NULL;
+}
+
+/* Used in compaction.c */
+int migrate_pinned_page(struct page *page)
+{
+	int ret = 1;
+	struct pin_page_info *pinfo = NULL;
+
+	if (PinnedPage(page)) {
+		while ((pinfo = get_pin_page_info(page))) {
+			/* If one of owners failed, bail out */
+			if (pinfo->owner->migrate(page, pinfo->private))
+				break;
+		}
+
+		ret = 0;
+	}
+	return ret;
+}
-- 
1.7.9.5

-- 
Kind regards,
Minchan Kim
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ