Date:	Thu, 3 Jul 2014 17:37:29 +0900
From:	Minchan Kim <minchan@...nel.org>
To:	Martin Schwidefsky <schwidefsky@...ibm.com>
Cc:	"Kirill A. Shutemov" <kirill@...temov.name>,
	Andrew Morton <akpm@...ux-foundation.org>,
	linux-kernel@...r.kernel.org, linux-mm@...ck.org,
	Michael Kerrisk <mtk.manpages@...il.com>,
	Linux API <linux-api@...r.kernel.org>,
	Hugh Dickins <hughd@...gle.com>,
	Johannes Weiner <hannes@...xchg.org>,
	Rik van Riel <riel@...hat.com>,
	KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
	Mel Gorman <mgorman@...e.de>, Jason Evans <je@...com>,
	Zhang Yanfei <zhangyanfei@...fujitsu.com>,
	Heiko Carstens <heiko.carstens@...ibm.com>,
	linux390@...ibm.com, Gerald Schaefer <gerald.schaefer@...ibm.com>
Subject: Re: [PATCH v9] mm: support madvise(MADV_FREE)

Hello,

On Thu, Jul 03, 2014 at 10:29:01AM +0200, Martin Schwidefsky wrote:
> On Thu, 3 Jul 2014 16:29:54 +0900
> Minchan Kim <minchan@...nel.org> wrote:
> 
> > Hello,
> > 
> > On Thu, Jul 03, 2014 at 10:03:19AM +0900, Minchan Kim wrote:
> > > Hello,
> > > 
> > > On Tue, Jul 01, 2014 at 05:50:58PM +0300, Kirill A. Shutemov wrote:
> > > > On Tue, Jul 01, 2014 at 09:36:15AM +0900, Minchan Kim wrote:
> > > > > +	do {
> > > > > +		/*
> > > > > +		 * XXX: We can optimize with supporting Hugepage free
> > > > > +		 * if the range covers.
> > > > > +		 */
> > > > > +		next = pmd_addr_end(addr, end);
> > > > > +		if (pmd_trans_huge(*pmd))
> > > > > +			split_huge_page_pmd(vma, addr, pmd);
> > > > 
> > > > Could you implement proper THP support before upstreaming the feature?
> > > > It shouldn't be a big deal.
> > > 
> > > Okay, Hope to review.
> > > 
> > > Thanks for the feedback!
> > > 
> > 
> > I tried to implement it but ran into an issue.
> > 
> > I need pmd_mkold and pmd_mkclean for the MADV_FREE operation, and
> > pmd_dirty for page_referenced. I looked at all the arches that support
> > THP and it's not a big deal for most of them, but I'm not sure about
> > s390, since I have no idea how s390 does soft tracking by storage key
> > instead of page table information.
> > Cc'ed the s390 maintainers. I hope you can help.
> 
> Storage key for dirty and referenced tracking is a thing of the past.
> The current code for s390 uses software tracking for dirty and referenced.
> There is one catch though, for ptes the software implementation covers
> dirty and referenced bit but for pmds only referenced bit is available.
> The reason is that there is no free bit left in the pmd entry for the
> software dirty bit.

Thanks for the quick reply.

>  
> > So, if there isn't any help from s390, I would have to introduce
> > HAVE_ARCH_THP_MADVFREE to disable MADV_FREE support for THP on s390,
> > but I don't want to introduce such a new config option.
> 
> Why is the dirty bit for pmds needed for the MADV_FREE implementation?

The MADV_FREE semantics require it.

When the madvise syscall is called, the VM clears the dirty bit of the
ptes in the range. If memory pressure occurs, the VM checks the dirty bit
in the page table; if the page is still "clean", it is a "lazyfree" page,
so the VM can discard it instead of swapping it out.
If there was a store to the page before the VM picks it for reclaim, the
dirty bit is set, so the VM swaps the page out instead of discarding it,
which keeps the up-to-date contents.
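
To make the rule above concrete, here is a toy userspace model of the
decision. It is only a sketch, not kernel code: struct toy_pte,
toy_madv_free(), toy_store() and toy_reclaim() are made-up names for
illustration, and a real implementation works on pte/pmd entries and
page flags rather than a single bool.

```c
#include <stdbool.h>

/* Toy stand-in for a page table entry; only the dirty bit is modeled. */
struct toy_pte {
	bool dirty;
};

enum toy_reclaim_action {
	TOY_RECLAIM_DISCARD,	/* lazyfree page: drop it, no swap I/O */
	TOY_RECLAIM_SWAP_OUT,	/* written after MADV_FREE: keep contents */
};

/* Models madvise(MADV_FREE): clear the dirty bit for the page. */
static void toy_madv_free(struct toy_pte *pte)
{
	pte->dirty = false;
}

/* Models a store to the page, which sets the dirty bit again. */
static void toy_store(struct toy_pte *pte)
{
	pte->dirty = true;
}

/* Models reclaim under memory pressure: decide by the dirty bit alone. */
static enum toy_reclaim_action toy_reclaim(const struct toy_pte *pte)
{
	return pte->dirty ? TOY_RECLAIM_SWAP_OUT : TOY_RECLAIM_DISCARD;
}
```

The whole decision hangs on the dirty bit, which is why the THP case
needs a per-pmd dirty bit as well.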

If that's hard on s390, maybe we could use just the referenced bit
instead of the dirty bit to check for recent access, but that might make
the semantics differ a bit from other OSes. :(

> 
> -- 
> blue skies,
>    Martin.
> 
> "Reality continues to ruin my life." - Calvin.
> 

-- 
Kind regards,
Minchan Kim