linux-kernel - Re: [PATCH] locking/osq: Drop the overload of osq lock

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20160626141128.GC6512@insomnia>
Date:	Sun, 26 Jun 2016 22:11:28 +0800
From:	Boqun Feng <boqun.feng@...il.com>
To:	panxinhui <xinhui@...ux.vnet.ibm.com>
Cc:	Peter Zijlstra <peterz@...radead.org>,
	Pan Xinhui <xinhui.pan@...ux.vnet.ibm.com>,
	linux-kernel@...r.kernel.org, mingo@...hat.com, dave@...olabs.net,
	will.deacon@....com, Waiman.Long@....com, benh@...nel.crashing.org
Subject: Re: [PATCH] locking/osq: Drop the overload of osq lock

On Sun, Jun 26, 2016 at 02:58:00PM +0800, panxinhui wrote:
[snip]
> >> 
> > 
> > Unfortunately, on PPC, we compile pseries code along with powernv code
> > in a kernel binary, therefore we need to wire the proper primitives at
> > runtime.
> > 
> no, we can use same vcpu preempted check ops actually.
> just like how arch_spin_lock() does.
> 
> if (SHARED_PROCESSOR)
>                                 __spin_yield(lock); 
> 
> so we can
> u32 arch_get_vcpu_yield_count(int cpu)
> {
> 	if (SHARED_PROCESSOR)

SHARED_PROCESSOR is true only if a PPC_PSERIES kernel runs in shared
processor mode. Using this here will rule out the situation where a
PPC_PSERIES kernel(ppc guest) runs in dedicated processor mode. I don't
think we can rule out the dedicated processor mode, because even in the
dedicated mode, the lock holder may still be preempted. So why? Why do
we only want the yield_count in the shared processor mode?

>                                 return be32_to_cpu(lppaca_of(cpu).yield_count;
> 	return 0;
> }
> 
> and all these things can be don inline then.
> 
> > I've cooked the following patch, please note I haven't test this, this
> > is just a POC.
> > 
> > Regards,
> > Boqun
> > 
> > ---------------->8
> > virt: Introduce vcpu preemption detection functions
> > 
> > Lock holder preemption is an serious problem in virtualized environment
> > for locking. Different archs and hypervisors employ different ways to
> > try to solve the problem in different locking mechanisms. And recently
> > Pan Xinhui found a significant performance drop in osq_lock(), which
> > could be solved by providing a mechanism to detect the preemption of
> > vcpu.
> > 
> > Therefore this patch introduces several vcpu preemption detection
> > primitives, which allows locking primtives to save the time of
> > unnecesarry spinning. These primitives act as an abstract layer between
> > locking functions and arch or hypervisor related code.
> > 
> > There are two sets of primitives:
> > 
> > 1.	vcpu_preempt_count() and vcpu_has_preempted(), they must be used
> > 	pairwisely in a same preempt disable critical section. And they
> > 	could detect whether a vcpu preemption happens between them.
> > 
> > 2.	vcpu_is_preempted(), used to check whether other cpu's vcpu is
> > 	preempted.
> > 
> > This patch also implements those primitives on pseries and wire them up.
> > 
> > Signed-off-by: Boqun Feng <boqun.feng@...il.com>
> > ---
> > arch/powerpc/platforms/pseries/Kconfig |  1 +
> > arch/powerpc/platforms/pseries/setup.c | 28 +++++++++++
> > include/linux/vcpu_preempt.h           | 90 ++++++++++++++++++++++++++++++++++
> > 3 files changed, 119 insertions(+)
> > create mode 100644 include/linux/vcpu_preempt.h
> > 
> > diff --git a/arch/powerpc/platforms/pseries/Kconfig b/arch/powerpc/platforms/pseries/Kconfig
> > index bec90fb30425..7d24c3e48878 100644
> > --- a/arch/powerpc/platforms/pseries/Kconfig
> > +++ b/arch/powerpc/platforms/pseries/Kconfig
> > @@ -21,6 +21,7 @@ config PPC_PSERIES
> > 	select HOTPLUG_CPU if SMP
> > 	select ARCH_RANDOM
> > 	select PPC_DOORBELL
> > +	select HAS_VCPU_PREEMPTION_DETECTION
> > 	default y
> > 
> it is already too many config there…

Why is this a problem? It's a Kconfig file, there are not too many
readability issues here. It simply serve as a database for the
relationships between different config options. menuconfig and its
friends can do their job well. BTW, have you read config X86? ;-)

> as I comment above, we could make these things inline, so looks like we needn't to add such config.
>  

Another reason, I didn't make those primitives arch-specific inline
functions, is that they are actually not only arch-specific but also
(virtual-)environment-specific, or hypervisor-specific, that is their
implementations differ when the kernel runs on different hypervisors(KVM
and Xen, for example).

If you make the implementation part inline functions, you end up
handling different virtual environment every time the functions are
called, which is a waste of time. We are lucky because phyp and kvm
share the same interface of yield count, but this might not be true for
other archs and other hypervisors.

> > config PPC_SPLPAR
> > diff --git a/arch/powerpc/platforms/pseries/setup.c b/arch/powerpc/platforms/pseries/setup.c
> > index 9883bc7ea007..5d4aed54e039 100644
> > --- a/arch/powerpc/platforms/pseries/setup.c
> > +++ b/arch/powerpc/platforms/pseries/setup.c
> > @@ -42,6 +42,7 @@
> > #include <linux/of.h>
> > #include <linux/of_pci.h>
> > #include <linux/kexec.h>
> > +#include <linux/vcpu_preempt.h>
> > 
> > #include <asm/mmu.h>
> > #include <asm/processor.h>
> > @@ -501,6 +502,31 @@ static void __init find_and_init_phbs(void)
> > 	of_pci_check_probe_only();
> > }
> > 
> > +struct vcpu_preempt_ops vcpu_preempt_ops = DEFAULT_VCPU_PREEMPT_OPS;
> > +EXPORT_SYMBOL(vcpu_preempt_ops);
> > +
> > +static long pseries_vcpu_preempt_count(void)
> > +{
> > +	return be32_to_cpu(get_lppaca()->yield_count);
> > +}
> > +
> > +static bool pseries_vcpu_is_preempted(int cpu)
> > +{
> > +	return !!(be32_to_cpu(lppaca_of(cpu).yield_count) & 1);
> > +}
> > +
> > +static bool pseries_vcpu_has_preempted(long vpc)
> > +{
> > +	return pseries_vcpu_preempt_count() != vpc;
> > +}
> > +
> > +static void __init pseries_setup_vcpu_preempt_ops(void)
> > +{
> > +	vcpu_preempt_ops.preempt_count = pseries_vcpu_preempt_count;
> > +	vcpu_preempt_ops.is_preempted = pseries_vcpu_is_preempted;
> > +	vcpu_preempt_ops.has_preempted = pseries_vcpu_has_preempted;
> > +}
> > +
> > static void __init pSeries_setup_arch(void)
> > {
> > 	set_arch_panic_timeout(10, ARCH_PANIC_TIMEOUT);
> > @@ -549,6 +575,8 @@ static void __init pSeries_setup_arch(void)
> > 				"%ld\n", rc);
> > 		}
> > 	}
> > +
> > +	pseries_setup_vcpu_preempt_ops();
> > }
> > 
> > static int __init pSeries_init_panel(void)
> > diff --git a/include/linux/vcpu_preempt.h b/include/linux/vcpu_preempt.h
> > new file mode 100644
> > index 000000000000..4a88414bb83f
> > --- /dev/null
> > +++ b/include/linux/vcpu_preempt.h
> 
> How about add them in sched.h

I don't have a strong opinion on which header file to use. TBH, I was
trying to use some header files for parvirt stuff, but failed to find
one, so I add a new one because I don't want people confused between
preemption of vcpus and preemption of tasks. But once again, I don't
have a strong opinion ;-)

> I just think  they are similar. correct me if  I got a mistake :)
> 
> thanks
> xinhui
> 
> > @@ -0,0 +1,90 @@
> > +/*
> > + * Primitives for checking the vcpu preemption from the guest.
> > + */
> > +
> > +static long __vcpu_preempt_count(void)
> > +{
> > +	return 0;
> > +}
> > +
> > +static bool __vcpu_has_preempted(long vpc)
> > +{
> > +	return false;
> > +}
> > +
> > +static bool __vcpu_is_preempted(int cpu)
> > +{
> > +	return false;
> > +}
> > +
> > +struct vcpu_preempt_ops {
> > +	/*
> > +	 * Get the current vcpu's "preempt count", which is going to use for
> > +	 * checking whether the current vcpu has ever been preempted
> > +	 */
> > +	long (*preempt_count)(void);
> > +
> > +	/*
> > +	 * Return whether a vcpu is preempted
> > +	 */
> > +	bool (*is_preempted)(int cpu);
> > +
> > +	/*
> > +	 * Given a "vcpu preempt count", Return whether a vcpu preemption ever
> > +	 * happened after the .preempt_count() was called.
> > +	 */
> > +	bool (*has_preempted)(long vpc);
> > +};
> > +
> > +struct vcpu_preempt_ops vcpu_preempt_ops;
> > +
> > +/* Default boilerplate */
> > +#define DEFAULT_VCPU_PREEMPT_OPS			\
> > +	{						\
> > +		.preempt_count = __vcpu_preempt_count,	\
> > +		.is_preempted = __vcpu_is_preempted,	\
> > +		.has_preempted = __vcpu_has_preempted	\
> > +	}
> > +
> > +#ifdef CONFIG_HAS_VCPU_PREEMPTION_DETECTION
> > +/*
> > + * vcpu_preempt_count: Get the current cpu's "vcpu preempt count"(vpc).
> > + *
> > + * The vpc is used for checking whether the current vcpu has ever been
> > + * preempted via vcpu_has_preempted().
> > + *
> > + * This function and vcpu_has_preepmted() should be called in the same
> > + * preemption disabled critical section.
> > + */
> > +static long vcpu_preempt_count(void)
> > +{
> > +	return vcpu_preempt_ops.preempt_count();
> > +}
> > +
> > +/*
> > + * vcpu_is_preempted: Check whether @cpu's vcpu is preempted.
> > + */
> > +static bool vcpu_is_preempted(int cpu)
> > +{
> > +	return vcpu_preempt_ops.is_preempted(cpu);
> > +}
> > +
> > +/*
> > + * vcpu_has_preepmted: Check whether the current cpu's vcpu has ever been
> > + * preempted.
> > + *
> > + * The checked duration is between the vcpu_preempt_count() which returns @vpc
> > + * is called and this function called.
> > + *
> > + * This function and corresponding vcpu_preempt_count() should be in the same
> > + * preemption disabled cirtial section.
> > + */
> > +static bool vcpu_has_preempted(long vpc)
> > +{
> > +	return vcpu_preempt_ops.has_preempt(vpc);
> > +}
> > +#else /* CONFIG_HAS_VCPU_PREEMPTION_DETECTION */
> > +#define vcpu_preempt_count() __vcpu_preempt_count()
> > +#define vcpu_is_preempted(cpu) __vcpu_is_preempted(cpu)
> > +#define vcpu_has_preempted(vpc) __vcpu_has_preempted(vpc)
> > +#endif /* CONFIG_HAS_VCPU_PREEPMTION_DETECTION */
> > -- 
> > 2.9.0
> > 
> 

Download attachment "signature.asc" of type "application/pgp-signature" (474 bytes)