[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <547CA064.5080106@suse.com>
Date: Mon, 01 Dec 2014 18:07:48 +0100
From: Juergen Gross <jgross@...e.com>
To: "Luis R. Rodriguez" <mcgrof@...e.com>,
David Vrabel <david.vrabel@...rix.com>
CC: Peter Zijlstra <peterz@...radead.org>,
Ingo Molnar <mingo@...nel.org>,
Oleg Nesterov <oleg@...hat.com>,
Konrad Rzeszutek Wilk <konrad.wilk@...cle.com>,
boris.ostrovsky@...cle.com, xen-devel@...ts.xenproject.org,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
x86@...nel.org, kvm@...r.kernel.org,
Davidlohr Bueso <dbueso@...e.de>,
Joerg Roedel <jroedel@...e.de>, Borislav Petkov <bp@...e.de>,
Jan Beulich <JBeulich@...e.com>, Olaf Hering <ohering@...e.de>
Subject: Re: [PATCH] xen: privcmd: schedule() after private hypercall when
non CONFIG_PREEMPT
On 12/01/2014 05:19 PM, Luis R. Rodriguez wrote:
> On Mon, Dec 01, 2014 at 03:54:24PM +0000, David Vrabel wrote:
>> On 01/12/14 15:44, Luis R. Rodriguez wrote:
>>> On Mon, Dec 1, 2014 at 10:18 AM, David Vrabel <david.vrabel@...rix.com> wrote:
>>>> On 01/12/14 15:05, Luis R. Rodriguez wrote:
>>>>> On Mon, Dec 01, 2014 at 11:11:43AM +0000, David Vrabel wrote:
>>>>>> On 27/11/14 18:36, Luis R. Rodriguez wrote:
>>>>>>> On Thu, Nov 27, 2014 at 07:36:31AM +0100, Juergen Gross wrote:
>>>>>>>> On 11/26/2014 11:26 PM, Luis R. Rodriguez wrote:
>>>>>>>>> From: "Luis R. Rodriguez" <mcgrof@...e.com>
>>>>>>>>>
>>>>>>>>> Some folks had reported that some xen hypercalls take a long time
>>>>>>>>> to complete when issued from the userspace private ioctl mechanism,
>>>>>>>>> this can happen for instance with some hypercalls that have many
>>>>>>>>> sub-operations, this can happen for instance on hypercalls that use
>>>>>> [...]
>>>>>>>>> --- a/drivers/xen/privcmd.c
>>>>>>>>> +++ b/drivers/xen/privcmd.c
>>>>>>>>> @@ -60,6 +60,9 @@ static long privcmd_ioctl_hypercall(void __user *udata)
>>>>>>>>> hypercall.arg[0], hypercall.arg[1],
>>>>>>>>> hypercall.arg[2], hypercall.arg[3],
>>>>>>>>> hypercall.arg[4]);
>>>>>>>>> +#ifndef CONFIG_PREEMPT
>>>>>>>>> + schedule();
>>>>>>>>> +#endif
>>>>>>
>>>>>> As Juergen points out, this does nothing. You need to schedule while in
>>>>>> the middle of the hypercall.
>>>>>>
>>>>>> Remember that Xen's hypercall preemption only preempts the hypercall to
>>>>>> run interrupts in the guest.
>>>>>
>>>>> How is it ensured that when the kernel preempts on this code path on
>>>>> CONFIG_PREEMPT=n kernel that only interrupts in the guest are run?
>>>>
>>>> Sorry, I really didn't describe this very well.
>>>>
>>>> If a hypercall needs a continuation, Xen returns to the guest with the
>>>> IP set to the hypercall instruction, and on the way back to the guest
>>>> Xen may schedule a different VCPU or it will do any upcalls (as per normal).
>>>>
>>>> The guest is free to return from the upcall to the original task
>>>> (continuing the hypercall) or to a different one.
>>>
>>> OK so that addresses what Xen will do when using continuation and
>>> hypercall preemption, my concern here was that using
>>> preempt_schedule_irq() on CONFIG_PREEMPT=n kernels in the middle of a
>>> hypercall on the return from an interrupt (e.g., the timer interrupt)
>>> would still let the kernel preempt to tasks other than those related
>>> to Xen.
>>
>> Um. Why would that be a problem? We do want to switch to any task the
>> Linux scheduler thinks is best.
>
> Its safe but -- it technically is doing kernel preemption, unless we want
> to adjust the definition of CONFIG_PREEMPT=n to exclude hypercalls. This
> was my original concern with the use of preempt_schedule_irq() to do this.
> I am afraid of setting precedents without being clear or wider review and
> acceptance.
I wonder whether it would be more acceptable to add (or completely
switch to) another preemption model: PREEMPT_SWITCHABLE. This would be
similar to CONFIG_PREEMPT, but the "normal" value of __preempt_count
would be settable via kernel parameter (default 2):
0: preempt
1: preempt_voluntary
2: preempt_none
The kernel would run with preemption enabled. cond_sched() would
reschedule if __preempt_count <= 1. And in case of long running kernel
activities (like the hypercall case or other stuff requiring schedule()
calls to avoid hangups) we would just set __preempt_count to 0 during
these periods and restore the old value afterwards.
This would be a rather intrusive but clean change IMO.
Any thoughts?
Juergen
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists