Message-ID: <51F7ED20.80202@linux.vnet.ibm.com>
Date:	Tue, 30 Jul 2013 22:13:12 +0530
From:	Raghavendra K T <raghavendra.kt@...ux.vnet.ibm.com>
To:	Gleb Natapov <gleb@...hat.com>, mingo@...hat.com
CC:	jeremy@...p.org, x86@...nel.org, konrad.wilk@...cle.com,
	hpa@...or.com, pbonzini@...hat.com, linux-doc@...r.kernel.org,
	habanero@...ux.vnet.ibm.com, xen-devel@...ts.xensource.com,
	peterz@...radead.org, mtosatti@...hat.com,
	stefano.stabellini@...citrix.com, andi@...stfloor.org,
	attilio.rao@...rix.com, ouyang@...pitt.edu, gregkh@...e.de,
	agraf@...e.de, chegu_vinod@...com, torvalds@...ux-foundation.org,
	avi.kivity@...il.com, tglx@...utronix.de, kvm@...r.kernel.org,
	linux-kernel@...r.kernel.org, riel@...hat.com, drjones@...hat.com,
	virtualization@...ts.linux-foundation.org,
	srivatsa.vaddagiri@...il.com
Subject: Re: [PATCH RFC V11 15/18] kvm : Paravirtual ticketlocks support for
 linux guests running on KVM hypervisor

On 07/25/2013 03:08 PM, Raghavendra K T wrote:
> On 07/25/2013 02:45 PM, Gleb Natapov wrote:
>> On Thu, Jul 25, 2013 at 02:47:37PM +0530, Raghavendra K T wrote:
>>> On 07/24/2013 06:06 PM, Raghavendra K T wrote:
>>>> On 07/24/2013 05:36 PM, Gleb Natapov wrote:
>>>>> On Wed, Jul 24, 2013 at 05:30:20PM +0530, Raghavendra K T wrote:
>>>>>> On 07/24/2013 04:09 PM, Gleb Natapov wrote:
>>>>>>> On Wed, Jul 24, 2013 at 03:15:50PM +0530, Raghavendra K T wrote:
>>>>>>>> On 07/23/2013 08:37 PM, Gleb Natapov wrote:
>>>>>>>>> On Mon, Jul 22, 2013 at 11:50:16AM +0530, Raghavendra K T wrote:
>>>>>>>>>> +static void kvm_lock_spinning(struct arch_spinlock *lock, __ticket_t want)
>>>>>>>> [...]
>>>>>>>>>> +
>>>>>>>>>> +    /*
>>>>>>>>>> +     * halt until it's our turn and kicked. Note that we do safe halt
>>>>>>>>>> +     * for irq enabled case to avoid hang when lock info is overwritten
>>>>>>>>>> +     * in irq spinlock slowpath and no spurious interrupt occur to save us.
>>>>>>>>>> +     */
>>>>>>>>>> +    if (arch_irqs_disabled_flags(flags))
>>>>>>>>>> +        halt();
>>>>>>>>>> +    else
>>>>>>>>>> +        safe_halt();
>>>>>>>>>> +
>>>>>>>>>> +out:
>>>>>>>>> So here now interrupts can be either disabled or enabled. Previous
>>>>>>>>> version disabled interrupts here, so are we sure it is safe to
>>>>>>>>> have them
>>>>>>>>> enabled at this point? I do not see any problem yet, will keep
>>>>>>>>> thinking.
>>>>>>>>
>>>>>>>> If we enable interrupts here, then
>>>>>>>>
>>>>>>>>
>>>>>>>>>> +    cpumask_clear_cpu(cpu, &waiting_cpus);
>>>>>>>>
>>>>>>>> and if we start serving a lock for an interrupt that came here,
>>>>>>>> the cpumask clear and w->lock = NULL may not happen atomically.
>>>>>>>> If the irq spinlock does not take the slow path, we would have a
>>>>>>>> non-NULL value for lock, but with no information in waiting_cpus.
>>>>>>>>
>>>>>>>> I am still thinking what the problem with that would be.
>>>>>>>>
>>>>>>> Exactly; for the kicker, the waiting_cpus and w->lock updates are
>>>>>>> non-atomic anyway.
>>>>>>>
>>>>>>>>>> +    w->lock = NULL;
>>>>>>>>>> +    local_irq_restore(flags);
>>>>>>>>>> +    spin_time_accum_blocked(start);
>>>>>>>>>> +}
>>>>>>>>>> +PV_CALLEE_SAVE_REGS_THUNK(kvm_lock_spinning);
>>>>>>>>>> +
>>>>>>>>>> +/* Kick vcpu waiting on @lock->head to reach value @ticket */
>>>>>>>>>> +static void kvm_unlock_kick(struct arch_spinlock *lock, __ticket_t ticket)
>>>>>>>>>> +{
>>>>>>>>>> +    int cpu;
>>>>>>>>>> +
>>>>>>>>>> +    add_stats(RELEASED_SLOW, 1);
>>>>>>>>>> +    for_each_cpu(cpu, &waiting_cpus) {
>>>>>>>>>> +        const struct kvm_lock_waiting *w = &per_cpu(lock_waiting, cpu);
>>>>>>>>>> +        if (ACCESS_ONCE(w->lock) == lock &&
>>>>>>>>>> +            ACCESS_ONCE(w->want) == ticket) {
>>>>>>>>>> +            add_stats(RELEASED_SLOW_KICKED, 1);
>>>>>>>>>> +            kvm_kick_cpu(cpu);
>>>>>>>>> What about using NMI to wake sleepers? I think it was
>>>>>>>>> discussed, but I forgot why it was dismissed.
>>>>>>>>
>>>>>>>> I think I have missed that discussion. I'll go back and check. So
>>>>>>>> what is the idea here? That we can easily wake up halted vcpus
>>>>>>>> that have interrupts disabled?
>>>>>>> We can of course. IIRC the objection was that the NMI handling
>>>>>>> path is very fragile and handling an NMI on each wakeup will be
>>>>>>> more expensive than waking up a guest without injecting an event,
>>>>>>> but it is still interesting to see the numbers.
>>>>>>>
>>>>>>
>>>>>> Hmm, now I remember: we had tried a request-based mechanism (a new
>>>>>> request like REQ_UNHALT) and processed that. It worked, but needed
>>>>>> some complex hacks in vcpu_enter_guest to avoid a guest hang in
>>>>>> case the request was cleared, so we had left it there..
>>>>>>
>>>>>> https://lkml.org/lkml/2012/4/30/67
>>>>>>
>>>>>> But I do not remember the performance impact, though.
>>>>> No, this is something different. Wakeup with NMI does not need KVM
>>>>> changes at all. Instead of kvm_kick_cpu(cpu) in kvm_unlock_kick you
>>>>> send an NMI IPI.
>>>>>
>>>>
>>>> True, it was not NMI.
>>>> Just to confirm, are you suggesting something like this be tried?
>>>>
>>>> apic->send_IPI_mask(cpumask_of(cpu), APIC_DM_NMI);
>>>
>>> When I started the benchmark, I started seeing
>>> "Dazed and confused, but trying to continue" from the unknown NMI
>>> error handling.
>>> Did I miss anything (because we did not register any NMI handler)? Or
>>> is it that spurious NMIs are trouble, because we could get spurious
>>> NMIs if the next waiter already acquired the lock?
>> There is a default NMI handler that tries to detect the reason why the
>> NMI happened (which is not so easy on x86) and prints this message if
>> it fails. You need to add logic there to detect the spinlock slow
>> path. Check the bit in waiting_cpus, for instance.
>
> aha.. Okay. will check that.

Yes, thanks.. that did the trick.

I did something like the below in unknown_nmi_error():
if (cpumask_test_cpu(smp_processor_id(), &waiting_cpus))
    return;
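
For completeness, that check sits at the top of unknown_nmi_error();
roughly like this (just a sketch of the hack, assuming waiting_cpus from
kvm.c is made visible to the NMI code, which it is not upstream):

/*
 * Experimental hack, not a proper patch: an unknown NMI that arrives
 * while this CPU is parked in the pv-ticketlock slow path is (most
 * likely) a wakeup kick, so swallow it instead of reporting it.
 * waiting_cpus is the wait bitmap from arch/x86/kernel/kvm.c and has
 * to be made visible here for this to build.
 */
static void unknown_nmi_error(unsigned char reason, struct pt_regs *regs)
{
    if (cpumask_test_cpu(smp_processor_id(), &waiting_cpus))
        return;        /* pv-spinlock kick, nothing to report */

    /* ... the existing "Dazed and confused ..." reporting follows ... */
}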

But I believe you asked for the NMI method only for experimental
purposes, to check the upper bound, because, as I suspected above, for a
spurious NMI (i.e. when the unlocker kicks after the waiter has already
got the lock) we would still hit the unknown NMI error.
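
For reference, the kick side of the experiment is just the quoted
kvm_unlock_kick() above with the hypercall kick swapped for the NMI IPI;
a sketch:

/*
 * NMI-kick variant of kvm_unlock_kick() from the quoted patch:
 * identical except that the hypercall-based kvm_kick_cpu(cpu) is
 * replaced with an NMI IPI to the waiting vcpu.  Spurious NMIs are
 * expected whenever the waiter has already taken the lock by the time
 * the kick arrives.
 */
static void kvm_unlock_kick(struct arch_spinlock *lock, __ticket_t ticket)
{
    int cpu;

    add_stats(RELEASED_SLOW, 1);
    for_each_cpu(cpu, &waiting_cpus) {
        const struct kvm_lock_waiting *w = &per_cpu(lock_waiting, cpu);

        if (ACCESS_ONCE(w->lock) == lock &&
            ACCESS_ONCE(w->want) == ticket) {
            add_stats(RELEASED_SLOW_KICKED, 1);
            /* was: kvm_kick_cpu(cpu); */
            apic->send_IPI_mask(cpumask_of(cpu), APIC_DM_NMI);
        }
    }
}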

I hit spurious NMIs over 1656 times over the entire benchmark run,
along with
INFO: NMI handler (arch_trigger_all_cpu_backtrace_handler) took too long
to run: 24.886 msecs
etc...

(And we cannot get away with that either, because it means we bypass the
unknown NMI error even in genuine cases.)

Here are the results of my dbench test (32-core machine with a 32-vcpu
guest, HT off):

                ------------ % improvement ------------
                pvspinlock      pvspin_ipi      pvspin_nmi
dbench_1x         0.9016          0.7442          0.7522
dbench_2x        14.7513         18.0164         15.9421
dbench_3x        14.7571         17.0793         13.3572
dbench_4x         6.3625          8.7897          5.3800

So I am seeing an additional 2-4% improvement with the IPI method.

Gleb,
  does the current series look good to you? [One patch I have resent
with an in_nmi() check.] Or do you think I have to respin the series
with the IPI method etc., or are there any other concerns I have to
address? Please let me know..


PS: [Sorry for the late reply; I was quickly checking whether an unfair
lock with a lock owner is better. It did not prove to be, though, and so
far all the results favor this series.]

