Message-ID: <20130205074519.GB11197@blaptop>
Date:	Tue, 5 Feb 2013 16:45:19 +0900
From:	Minchan Kim <minchan@...nel.org>
To:	Lin Feng <linfeng@...fujitsu.com>
Cc:	akpm@...ux-foundation.org, mgorman@...e.de, bcrl@...ck.org,
	viro@...iv.linux.org.uk, khlebnikov@...nvz.org, walken@...gle.com,
	kamezawa.hiroyu@...fujitsu.com, riel@...hat.com,
	rientjes@...gle.com, isimatu.yasuaki@...fujitsu.com,
	wency@...fujitsu.com, laijs@...fujitsu.com, jiang.liu@...wei.com,
	linux-mm@...ck.org, linux-aio@...ck.org,
	linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 0/2] mm: hotplug: implement non-movable version of
 get_user_pages() to kill long-time pin pages

On Tue, Feb 05, 2013 at 02:18:42PM +0800, Lin Feng wrote:
> 
> 
> On 02/05/2013 01:25 PM, Minchan Kim wrote:
> > Hi Lin,
> > 
> > On Tue, Feb 05, 2013 at 12:42:48PM +0800, Lin Feng wrote:
> >> Hi Minchan,
> >>
> >> On 02/05/2013 08:58 AM, Minchan Kim wrote:
> >>> Hello,
> >>>
> >>> On Mon, Feb 04, 2013 at 06:04:06PM +0800, Lin Feng wrote:
> >>>> Currently get_user_pages() always tries to allocate pages from movable zone,
> >>>> as discussed in thread https://lkml.org/lkml/2012/11/29/69, in some case users
> >>>> of get_user_pages() is easy to pin user pages for a long time(for now we found
> >>>> that pages pinned as aio ring pages is such case), which is fatal for memory
> >>>> hotplug/remove framework.
> >>>>
> >>>> So the 1st patch introduces a new library function called
> >>>> get_user_pages_non_movable() to pin pages only from zone non-movable in memory.
> >>>> It's a wrapper of get_user_pages() but it makes sure that all pages come from
> >>>> non-movable zone via additional page migration.
> >>>>
> >>>> The 2nd patch gets around the aio ring pages can't be migrated bug caused by
> >>>> get_user_pages() via using the new function. It only works when configed with
> >>>> CONFIG_MEMORY_HOTREMOVE, otherwise it uses the old version of get_user_pages().
> >>>
> >>> CMA has same issue but the problem is the driver developers or any subsystem
> >>> using GUP can't know their pages is in CMA area or not in advance.
> >>> So all of client of GUP should use GUP_NM to work them with CMA/MEMORY_HOTPLUG well?
> >>> Even some driver module in embedded side doesn't open their source code.
> >> Yes, it somehow depends on the users of GUP. In MEMORY_HOTPLUG case, as for most users
> >> of GUP, they will release the pinned pages immediately and to such users they should get
> >> a good performance, using the old style interface is a smart way. And we had better just
> >> deal with the cases we have to by using the new interface.
> > 
> > Hmm, I think you can't make sure most of user for MEMORY_HOTPLUG will release pinned pages
> > immediately. Because MEMORY_HOTPLUG could be used for embedded system for reducing power
> > by PASR and some drivers in embedded could use GUP anytime and anywhere. They can't know
> > in advance they will use pinned pages long time or release in short time because it depends
> > on some event like a user's response, which is not predetermined at all.
> > So for solving it, we can add some WARN_ON in CMA/MEMORY_HOTPLUG part just in case of
> > failing migration by page count and then, investigate they are really using GUP and it's
> > REALLY a culprit. If so, yell to them "Please use GUP_NM instead"?
> > 
> > Yes, it could be done, but it would be a rather troublesome job.
> Yes, WARN_ON may be easy, while troubleshooting to find the non-migratable pages is
> a big job.
> If we want to kill all the potential non-migratable pages caused by GUP, we'd better use the
> *non_movable* version of GUP.
> But in some server environments we want to keep the performance and also want to be able to use
> the hotremove feature. Maybe patching only the places we need is a trade-off for such support.
> 
> Mel also said in the last discussion:
> 
> On 11/30/2012 07:00 PM, Mel Gorman wrote:
> > On Thu, Nov 29, 2012 at 11:55:02PM -0800, Andrew Morton wrote:
> >> Well, that's a fairly low-level implementation detail.  A more typical
> >> approach would be to add a new get_user_pages_non_movable() or such. 
> >> That would probably have the same signature as get_user_pages(), with
> >> one additional argument.  Then get_user_pages() becomes a one-line
> >> wrapper which passes in a particular value of that argument.
> >>
> > 
> > That is going in the direction that all pinned pages become MIGRATE_UNMOVABLE
> > allocations.  That will impact THP availability by increasing the number
> > of MIGRATE_UNMOVABLE blocks that exist and it would hit every user --
> > not just those that care about ZONE_MOVABLE.
> > 
> > I'm likely to NAK such a patch if it's only about node hot-remove because
> > it's much more of a corner case than wanting to use THP.
> > 
> > I would prefer if get_user_pages() checked if the page it was about to
> > pin was in ZONE_MOVABLE and if so, migrate it at that point before it's
> > pinned. It'll be expensive but will guarantee ZONE_MOVABLE availability
> > if that's what they want. The CMA people might also want to take
> > advantage of this if the page happened to be in the MIGRATE_CMA
> > pageblock.
> > 
> 
> So it may not be a good idea for all of us to fall back to calling the *non_movable* version of
> GUP when CONFIG_MIGRATE_ISOLATE is on. What do you think?

Frankly speaking, I can't understand Mel's comment.
As far as I understand, he said GUP should check the page before get_page and, if the page is
in the movable zone, migrate it out of the movable zone and then call get_page again.
That's exactly what I want. It doesn't introduce GUP_NM.
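
To make that reading concrete, here is a minimal userspace sketch of the
check-then-migrate-then-pin order. It is only a model under stated assumptions:
page_model, zone_model, migrate_out_of_movable() and pin_page_model() are
hypothetical names invented for illustration, not kernel APIs, and real
migration moves the data to a newly allocated page rather than flipping a field.

```c
#include <stdbool.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
enum zone_model { ZONE_MODEL_NORMAL, ZONE_MODEL_MOVABLE };

struct page_model {
	enum zone_model zone;	/* zone the page currently lives in */
	int refcount;		/* models references taken by get_page() */
};

/*
 * Models migration out of the movable zone. Assumption: a page that is
 * already pinned (refcount > 0) cannot be migrated, so this fails.
 * Real migration allocates a new page; here we only change the zone.
 */
static bool migrate_out_of_movable(struct page_model *page)
{
	if (page->refcount > 0)
		return false;
	page->zone = ZONE_MODEL_NORMAL;
	return true;
}

/*
 * Models the pin path: check the zone BEFORE taking the reference,
 * migrate if needed, and only then pin. No separate "non-movable"
 * entry point is needed in this model.
 */
static bool pin_page_model(struct page_model *page)
{
	if (page->zone == ZONE_MODEL_MOVABLE &&
	    !migrate_out_of_movable(page))
		return false;	/* could not move it, so do not pin */
	page->refcount++;
	return true;
}
```

In this model every caller goes through the same pin function, so nobody has to
know in advance whether the pin will be short or long. The cost is the extra
migration on the pin path, which Mel already noted would be expensive.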
-- 
Kind regards,
Minchan Kim
