Message-ID: <20121011200109.GN3317@csn.ul.ie>
Date:	Thu, 11 Oct 2012 21:01:09 +0100
From:	Mel Gorman <mel@....ul.ie>
To:	Andrea Arcangeli <aarcange@...hat.com>
Cc:	linux-kernel@...r.kernel.org, linux-mm@...ck.org,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Peter Zijlstra <pzijlstr@...hat.com>,
	Ingo Molnar <mingo@...e.hu>, Hugh Dickins <hughd@...gle.com>,
	Rik van Riel <riel@...hat.com>,
	Johannes Weiner <hannes@...xchg.org>,
	Hillf Danton <dhillf@...il.com>,
	Andrew Jones <drjones@...hat.com>,
	Dan Smith <danms@...ibm.com>,
	Thomas Gleixner <tglx@...utronix.de>,
	Paul Turner <pjt@...gle.com>, Christoph Lameter <cl@...ux.com>,
	Suresh Siddha <suresh.b.siddha@...el.com>,
	Mike Galbraith <efault@....de>,
	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
Subject: Re: [PATCH 06/33] autonuma: teach gup_fast about pmd_numa

On Thu, Oct 11, 2012 at 07:05:33PM +0200, Andrea Arcangeli wrote:
> On Thu, Oct 11, 2012 at 01:22:55PM +0100, Mel Gorman wrote:
> > On Thu, Oct 04, 2012 at 01:50:48AM +0200, Andrea Arcangeli wrote:
> > > In the special "pmd" mode of knuma_scand
> > > (/sys/kernel/mm/autonuma/knuma_scand/pmd == 1), the pmd may be of numa
> > > type (_PAGE_PRESENT not set), however the pte might be
> > > present. Therefore, gup_pmd_range() must return 0 in this case to
> > > avoid losing a NUMA hinting page fault during gup_fast.
> > > 
> > 
> > So if gup_fast fails, presumably we fall back to taking the mmap_sem and
> > calling get_user_pages(). This is a heavier operation and I wonder if the
> > cost is justified. i.e. Is the performance loss from using get_user_pages()
> > offset by improved NUMA placement? I ask because we always incur the cost of
> > taking mmap_sem but only sometimes get it back from improved NUMA placement.
> > How bad would it be if gup_fast lost some of the NUMA hinting information?
> 
> Good question indeed. Now, I agree it wouldn't be bad to skip NUMA
> hinting page faults in gup_fast for no-virt usage like
> O_DIRECT/ptrace, but the only problem is that we'd lose AutoNUMA on
> the memory touched by the KVM vcpus.
> 

OK, I see. That could go in the changelog because it's not immediately
obvious. At least, it's not as obvious as the potential downside (more GUP
fallbacks). In this context there is no way to guess what type of access
it is: AFAIK, there is no way from here to tell whether it's KVM calling
gup or whether it's due to O_DIRECT.
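
To make the trade-off concrete, here is a rough user-space model of the
behaviour under discussion. It is only a sketch, not the patch: every name
in it (model_pmd, model_stats, model_gup_fast, model_gup_slow,
model_get_user_pages) is hypothetical and none of it is kernel API. It
assumes the fast path refuses a pmd of numa type, so the caller falls back
to the slow path, which is where the NUMA hinting fault gets accounted.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
struct model_pmd {
	bool present;	/* models _PAGE_PRESENT being set on the pmd */
	bool numa;	/* models a pmd of numa type set up by knuma_scand */
};

struct model_stats {
	unsigned long fast_hits;		/* requests served locklessly */
	unsigned long slow_fallbacks;		/* requests that fell back */
	unsigned long numa_hinting_faults;	/* hinting faults observed */
};

/*
 * Lockless fast path. Returns the number of pages pinned; 0 tells the
 * caller to fall back. A numa-type pmd is refused so that the hinting
 * fault is not silently skipped.
 */
static int model_gup_fast(const struct model_pmd *pmd, int nr_pages)
{
	if (pmd->numa || !pmd->present)
		return 0;
	return nr_pages;
}

/*
 * Slow path. In the kernel this is where mmap_sem would be taken; the
 * model only records the hinting fault and restores the pmd. A plain
 * not-present pmd is simply treated as resolved here.
 */
static int model_gup_slow(struct model_pmd *pmd, int nr_pages,
			  struct model_stats *st)
{
	if (pmd->numa) {
		st->numa_hinting_faults++;
		pmd->numa = false;
	}
	pmd->present = true;
	return nr_pages;
}

/* Try the fast path first, fall back to the slow path if it pins nothing. */
static int model_get_user_pages(struct model_pmd *pmd, int nr_pages,
				struct model_stats *st)
{
	int pinned;

	assert(nr_pages > 0);

	pinned = model_gup_fast(pmd, nr_pages);
	if (pinned) {
		st->fast_hits++;
		return pinned;
	}

	st->slow_fallbacks++;
	return model_gup_slow(pmd, nr_pages, st);
}
```

In this model the first access to a numa-type pmd costs one fallback and
records one hinting fault, and later accesses take the fast path again.
Note that model_gup_fast() has no argument describing the caller, which
mirrors the point above: it cannot distinguish KVM from O_DIRECT.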

> I've been also asked if the vhost-net kernel thread (KVM in kernel
> virtio backend) will be controlled by autonuma in between
> use_mm/unuse_mm and answer is yes, but to do that, it also needs
> this. (see also the flush to task_autonuma_nid and mm/task statistics in
> unuse_mm to reset it back to regular kernel thread status,
> uncontrolled by autonuma)

I can understand why it needs this now. The clearing of the statistics is
still not clear to me but I asked that question in the thread that adjusts
unuse_mm already.

> 
> $ git grep get_user_pages
> tcm_vhost.c:            ret = get_user_pages_fast((unsigned long)ptr, 1, write, &page);
> vhost.c:        r = get_user_pages_fast(log, 1, 1, &page);
> 

-- 
Mel Gorman
SUSE Labs