linux-kernel - Re: [PATCHv9-rebased2 01/37] mm, thp: make swapin readahead under down_read of mmap

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20160618190951.GA11151@debian>
Date:	Sat, 18 Jun 2016 22:09:51 +0300
From:	Ebru Akagunduz <ebru.akagunduz@...il.com>
To:	"Kirill A. Shutemov" <kirill@...temov.name>,
	Hillf Danton <hillf.zj@...baba-inc.com>
Cc:	linux-kernel@...r.kernel.org, linux-mm@...ck.org
Subject: Re: [PATCHv9-rebased2 01/37] mm, thp: make swapin readahead under
 down_read of mmap_sem

On Thu, Jun 16, 2016 at 01:08:54PM +0300, Kirill A. Shutemov wrote:
> On Thu, Jun 16, 2016 at 02:52:52PM +0800, Hillf Danton wrote:
> > > 
> > > From: Ebru Akagunduz <ebru.akagunduz@...il.com>
> > > 
> > > Currently khugepaged makes swapin readahead under down_write.  This patch
> > > supplies to make swapin readahead under down_read instead of down_write.
> > > 
> > > The patch was tested with a test program that allocates 800MB of memory,
> > > writes to it, and then sleeps.  The system was forced to swap out all.
> > > Afterwards, the test program touches the area by writing, it skips a page
> > > in each 20 pages of the area.
> > > 
> > > Link: http://lkml.kernel.org/r/1464335964-6510-4-git-send-email-ebru.akagunduz@gmail.com
> > > Signed-off-by: Ebru Akagunduz <ebru.akagunduz@...il.com>
> > > Cc: Hugh Dickins <hughd@...gle.com>
> > > Cc: Rik van Riel <riel@...hat.com>
> > > Cc: "Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>
> > > Cc: Naoya Horiguchi <n-horiguchi@...jp.nec.com>
> > > Cc: Andrea Arcangeli <aarcange@...hat.com>
> > > Cc: Joonsoo Kim <iamjoonsoo.kim@....com>
> > > Cc: Cyrill Gorcunov <gorcunov@...nvz.org>
> > > Cc: Mel Gorman <mgorman@...e.de>
> > > Cc: David Rientjes <rientjes@...gle.com>
> > > Cc: Vlastimil Babka <vbabka@...e.cz>
> > > Cc: Aneesh Kumar K.V <aneesh.kumar@...ux.vnet.ibm.com>
> > > Cc: Johannes Weiner <hannes@...xchg.org>
> > > Cc: Michal Hocko <mhocko@...e.cz>
> > > Cc: Minchan Kim <minchan.kim@...il.com>
> > > Signed-off-by: Andrew Morton <akpm@...ux-foundation.org>
> > > ---
> > >  mm/huge_memory.c | 92 ++++++++++++++++++++++++++++++++++++++------------------
> > >  1 file changed, 63 insertions(+), 29 deletions(-)
> > > 
> > > diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> > > index f2bc57c45d2f..96dfe3f09bf6 100644
> > > --- a/mm/huge_memory.c
> > > +++ b/mm/huge_memory.c
> > > @@ -2378,6 +2378,35 @@ static bool hugepage_vma_check(struct vm_area_struct *vma)
> > >  }
> > > 
> > >  /*
> > > + * If mmap_sem temporarily dropped, revalidate vma
> > > + * before taking mmap_sem.
> > 
> > See below
> 
> > > @@ -2401,11 +2430,18 @@ static void __collapse_huge_page_swapin(struct mm_struct *mm,
> > >  			continue;
> > >  		swapped_in++;
> > >  		ret = do_swap_page(mm, vma, _address, pte, pmd,
> > > -				   FAULT_FLAG_ALLOW_RETRY|FAULT_FLAG_RETRY_NOWAIT,
> > > +				   FAULT_FLAG_ALLOW_RETRY,
> > 
> > Add a description in change log for it please.
> 
> Ebru, would you address it?
> 
This changelog really seems poor.
Is there a way to update only changelog of the commit?
I tried to use git rebase to amend commit, however
I could not rebase. This patch only needs better changelog.

I would like to update it as follows, if you would like to too:

"
Currently khugepaged makes swapin readahead under down_write.  This patch
supplies to make swapin readahead under down_read instead of down_write.

Along swapin, we can need to drop and re-take mmap_sem. Therefore we
have to be sure vma is consistent. This patch adds a helper function
to validate vma and also supplies that async swapin should not be
performed without waiting.

The patch was tested with a test program that allocates 800MB of memory,
writes to it, and then sleeps.  The system was forced to swap out all.
Afterwards, the test program touches the area by writing, it skips a page
in each 20 pages of the area.
"

Could you please suggest me a way to replace above changelog with the old?

> > >  				   pteval);
> > > +		/* do_swap_page returns VM_FAULT_RETRY with released mmap_sem */
> > > +		if (ret & VM_FAULT_RETRY) {
> > > +			down_read(&mm->mmap_sem);
> > > +			/* vma is no longer available, don't continue to swapin */
> > > +			if (hugepage_vma_revalidate(mm, vma, address))
> > > +				return false;
> > 
> > Revalidate vma _after_ acquiring mmap_sem, but the above comment says _before_.
> 
> Ditto.
> 
> > > +	if (!__collapse_huge_page_swapin(mm, vma, address, pmd)) {
> > > +		up_read(&mm->mmap_sem);
> > > +		goto out;
> > 
> > Jump out with mmap_sem released, 
> > 
> > > +	result = hugepage_vma_revalidate(mm, vma, address);
> > > +	if (result)
> > > +		goto out;
> > 
> > but jump out again with mmap_sem held.
> > 
> > They are cleaned up in subsequent darns?
> 
Yes, that is reported and fixed here:
http://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/commit/?id=fc7038a69cee6b817261f7cd805e9663fdc1075c

However, the above comment inconsistency still there.
I've added a fix patch:

>From 404438ff1b0617cbf7434cba0c5a08f79ccb8a5d Mon Sep 17 00:00:00 2001
From: Ebru Akagunduz <ebru.akagunduz@...il.com>
Date: Sat, 18 Jun 2016 21:07:22 +0300
Subject: [PATCH] mm, thp: fix comment inconsistency for swapin readahead
 functions

Signed-off-by: Ebru Akagunduz <ebru.akagunduz@...il.com>
---
 mm/huge_memory.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index acd374e..f0d528e 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2436,9 +2436,10 @@ static bool __collapse_huge_page_swapin(struct mm_struct *mm,
 		/* do_swap_page returns VM_FAULT_RETRY with released mmap_sem */
 		if (ret & VM_FAULT_RETRY) {
 			down_read(&mm->mmap_sem);
-			/* vma is no longer available, don't continue to swapin */
-			if (hugepage_vma_revalidate(mm, address))
+			if (hugepage_vma_revalidate(mm, address)) {
+				/* vma is no longer available, don't continue to swapin */
 				return false;
+			}
 		}
 		if (ret & VM_FAULT_ERROR) {
 			trace_mm_collapse_huge_page_swapin(mm, swapped_in, 0);
@@ -2513,8 +2514,8 @@ static void collapse_huge_page(struct mm_struct *mm,
 	if (allocstall == curr_allocstall && swap != 0) {
 		/*
 		 * __collapse_huge_page_swapin always returns with mmap_sem
-		 * locked.  If it fails, release mmap_sem and jump directly
-		 * out.  Continuing to collapse causes inconsistency.
+		 * locked. If it fails, we release mmap_sem and jump out_nolock.
+		 * Continuing to collapse causes inconsistency.
 		 */
 		if (!__collapse_huge_page_swapin(mm, vma, address, pmd)) {
 			mem_cgroup_cancel_charge(new_page, memcg, true);
-- 
1.9.1