[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <453B2761.5020103@qumranet.com>
Date: Sun, 22 Oct 2006 10:10:09 +0200
From: Avi Kivity <avi@...ranet.com>
To: Anthony Liguori <aliguori@...ibm.com>
CC: Alan Cox <alan@...rguk.ukuu.org.uk>,
John Stoffel <john@...ffel.org>,
linux-kernel <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 1/7] KVM: userspace interface
Anthony Liguori wrote:
>>>
>>> You miss my point I think. Using ioctls *requires* a thread
>>> per-vcpu in userspace. This is unnecessary since you could simply
>>> provide a char-device based read/write interface. You could then
>>> multiplex events and poll.
>>>
>>
>> Yes, ioctl()s require userspace threads, but that's okay, because
>> they're free for us, since we need a kernel thread for each vcpu.
>>
>> On the other hand, a single device model thread polling the vcpus is
>> guaranteed to be on the wrong physical cpu for half of the time
>> (assuming 2 cpus and 2 vcpus), requiring IPIs and suspending a vcpu
>> in order to run.
>
> And your previously proposed solution of having one big lock would do
> the same thing except require additional round trips to the kernel :-)
No, with no contention locks stay in userspace. And if there is
contention, we fine-grain the locks.
>
> Moreover, you could get clever and use mmap() to expose a ring queue
> if you're really concerned about SMP.
>
> Really though, it comes down to one simple thing: blocking ioctl()s
> are a real ugly interface.
>
I don't think they can be termed "blocking".
Most (all?) blocking calls offload work to some other device, like a
disk or a network card, and sleep if that device has to do any
processing. They follow the same basic procedure:
- if data (or bufferspace) is available, read (or write) it
- otherwise, sleep
But in this case the "other device" is the processor, so the that model
doesn't fit very well, as it *forces* a context switch.
Moreover, we need to both read and write, which ioctls() allow, but
read()/write() require two system calls.
>>> If for nothing else, you have to be able to run timers in userspace
>>> and interrupt the kernel execution (to signal DMA completion for
>>> instance). Even in the UP case, this gets ugly quickly.
>>>
>>
>> The timers aren't pretty (we use signals), yes. But avoiding the
>> extra thread is critical for performance IMO.
>
> We've had a lot of problems in QEMU with timers and kqemu. Forcing
> the guest to return to userspace to allow periodic timers to run
> (which may simply be the VGA refresh which the guest doesn't care
> about) is at best a hack.
You can also have an additional thread to the periodic stuff.
> Being able to poll an FD would make this so much nicer...
>
> I've posted some patches on qemu-devel attempting to deal with these
> issues (look for threads on optimizing char device performance). None
> of them are very pretty.
>
Xen is different since you already have a context switch by going to
domain 0.
--
error compiling committee.c: too many arguments to function
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists