linux-kernel - Re: [PATCH RESEND v2 2/2] xen: enable vnuma for PV guest


Message-ID: <20131119151924.GC5790@phenom.dumpdata.com>
Date:	Tue, 19 Nov 2013 10:19:24 -0500
From:	Konrad Rzeszutek Wilk <konrad.wilk@...cle.com>
To:	David Vrabel <david.vrabel@...rix.com>
Cc:	Elena Ufimtseva <ufimtseva@...il.com>,
	xen-devel@...ts.xenproject.org, boris.ostrovsky@...cle.com,
	tglx@...utronix.de, mingo@...hat.com, hpa@...or.com,
	x86@...nel.org, akpm@...ux-foundation.org, tangchen@...fujitsu.com,
	wency@...fujitsu.com, ian.campbell@...rix.com,
	stefano.stabellini@...citrix.com, mukesh.rathor@...cle.com,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH RESEND v2 2/2] xen: enable vnuma for PV guest

On Tue, Nov 19, 2013 at 02:56:41PM +0000, David Vrabel wrote:
> On 19/11/13 14:46, Konrad Rzeszutek Wilk wrote:
> > On Tue, Nov 19, 2013 at 02:35:59PM +0000, David Vrabel wrote:
> >> On 19/11/13 14:16, Konrad Rzeszutek Wilk wrote:
> >>> On Tue, Nov 19, 2013 at 11:54:08AM +0000, David Vrabel wrote:
> >>>> On 18/11/13 21:58, Elena Ufimtseva wrote:
> >>>>> Enables numa if vnuma topology hypercall is supported and it is domU.
> >>>> [...]
> >>>>> --- a/arch/x86/xen/setup.c
> >>>>> +++ b/arch/x86/xen/setup.c
> >>>>> @@ -20,6 +20,7 @@
> >>>>>  #include <asm/numa.h>
> >>>>>  #include <asm/xen/hypervisor.h>
> >>>>>  #include <asm/xen/hypercall.h>
> >>>>> +#include <asm/xen/vnuma.h>
> >>>>>  
> >>>>>  #include <xen/xen.h>
> >>>>>  #include <xen/page.h>
> >>>>> @@ -598,6 +599,9 @@ void __init xen_arch_setup(void)
> >>>>>  	WARN_ON(xen_set_default_idle());
> >>>>>  	fiddle_vdso();
> >>>>>  #ifdef CONFIG_NUMA
> >>>>> -	numa_off = 1;
> >>>>> +	if (!xen_initial_domain() && xen_vnuma_supported())
> >>>>> +		numa_off = 0;
> >>>>> +	else
> >>>>> +		numa_off = 1;
> >>>>>  #endif
> >>>>>  }
> >>>>
> >>>> I think this whole #ifdef CONFIG_NUMA can be removed and hence
> >>>> xen_vnuma_supported() can be removed as well.
> >>>>
> >>>> For any PV guest we can call the xen_numa_init() and it will do the
> >>>> right thing.
> >>>>
> >>>> For dom0, the hypercall will either: return something sensible (if in
> >>>> the future Xen sets something up), or it will error.
> >>>>
> >>>> If Xen does not have vnuma support, the hypercall will error.
> >>>>
> >>>> In both error cases, the dummy numa node is setup as required.
> >>>
> >>> Incorrect. It will end up calling:
> >>>
> >>>                  if (!numa_init(amd_numa_init))                                  
> >>>
> >>> which will crash dom0 (see 8d54db795 "xen/boot: Disable NUMA for PV guests.")
> >>> as that amd_numa_init is called before the dummy node init.
> >>
> >> No it won't.  Any error path after the check for a PV guest will add the
> >> dummy node and return success, skipping any of the hardware-specific setup.
> > 
> > Duh! I totally missed 'return' at the end of the check!
> > 
> > However, even with that (so the return), that means
> > this part won't be called:
> > 
> > 649         numa_init(dummy_numa_init);                                             
> > 
> > Which means there won't be any dummy numa setup?
> 
> The relevant bits in dummy_numa_init are in the error path of
> xen_numa_init().

That seems like the wrong place to do it. The top layer calls
into each of the NUMA implementations and then falls back to
the dummy.

Having the implementation itself do something that the upper
level already does eventually is not right.
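
To illustrate the layering I mean, here is a rough user-space model
of it. This is only a sketch, not the real arch/x86/mm/numa.c code:
every model_* name and the nr_nodes counter are made up for
illustration, and the Xen method is simply hard-wired to fail as if
the vnuma hypercall were unsupported.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model: an init method returns 0 on success, <0 on error. */
typedef int (*numa_init_fn)(void);

/* Number of nodes registered by the method that ran last (model state). */
static int nr_nodes;

/* Pretend the vnuma hypercall is not supported, so this method fails. */
static int model_xen_numa_init(void)
{
	return -1;
}

/* The dummy method always succeeds and registers a single node. */
static int model_dummy_numa_init(void)
{
	nr_nodes = 1;
	return 0;
}

/* Reset the model state, then run one method. */
static int model_numa_init(numa_init_fn init_func)
{
	nr_nodes = 0;
	return init_func();
}

/*
 * Top layer: try each implementation in order and stop at the first
 * one that succeeds. Only if all of them fail does it fall back to the
 * dummy. Returns the index of the method that succeeded, or -1 when
 * the dummy fallback was used.
 */
static int model_x86_numa_init(const numa_init_fn *methods, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (!model_numa_init(methods[i]))
			return (int)i;

	model_numa_init(model_dummy_numa_init);
	return -1;
}
```

In this model a failing Xen method does not need to set up the dummy
node by itself: passing { model_xen_numa_init } to
model_x86_numa_init() ends in the fallback and leaves one node
registered.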
> 
> I do think this approach (using the provided API to setup the single
> (dummy) node), is preferable to calling dummy_numa_init().

Doesn't it do the same thing? And also, what happens if the user
provides fakenuma?

> 
> If I thought the hypervisor ABI was finalized, I'd be happy with this
> series as-is -- the remaining issues are superficial.

That reads to me as an Ack, but I know you like to have it stated
explicitly - so could you state the proper tag please?

> 
> David