linux-kernel - Re: [PATCH 1/7] KVM: userspace interface


Message-ID: <453B2761.5020103@qumranet.com>
Date:	Sun, 22 Oct 2006 10:10:09 +0200
From:	Avi Kivity <avi@...ranet.com>
To:	Anthony Liguori <aliguori@...ibm.com>
CC:	Alan Cox <alan@...rguk.ukuu.org.uk>,
	John Stoffel <john@...ffel.org>,
	linux-kernel <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 1/7] KVM: userspace interface

Anthony Liguori wrote:
>>>
>>> You miss my point I think.  Using ioctls *requires* a thread 
>>> per-vcpu in userspace.  This is unnecessary since you could simply 
>>> provide a char-device based read/write interface.  You could then 
>>> multiplex events and poll.
>>>
>>
>> Yes, ioctl()s require userspace threads, but that's okay, because 
>> they're free for us, since we need a kernel thread for each vcpu.
>>
>> On the other hand, a single device model thread polling the vcpus is 
>> guaranteed to be on the wrong physical cpu for half of the time 
>> (assuming 2 cpus and 2 vcpus), requiring IPIs and suspending a vcpu 
>> in order to run.
>
> And your previously proposed solution of having one big lock would do 
> the same thing except require additional round trips to the kernel :-)

No, with no contention locks stay in userspace.  And if there is 
contention, we fine-grain the locks.
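
As an illustration of "locks stay in userspace": the message does not say
which locking primitive is meant, so the following is only a minimal sketch
assuming a Linux futex-based lock. The names sketch_lock, futex_calls and
the helper functions are hypothetical. The uncontended acquire and release
are a single atomic operation each; the kernel is entered only when a
second thread is actually waiting.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/futex.h>
#include <stdatomic.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical lock: 0 = free, 1 = held, 2 = held and someone may be waiting. */
typedef struct { atomic_int state; } sketch_lock;

/* Counts kernel round trips; exists only to make the fast path observable. */
static atomic_long futex_calls;

static long futex(atomic_int *addr, int op, int val)
{
    atomic_fetch_add(&futex_calls, 1);
    return syscall(SYS_futex, addr, op, val, NULL, NULL, 0);
}

static void sketch_lock_acquire(sketch_lock *l)
{
    int c = 0;

    /* Fast path: no contention, one compare-and-swap, no system call. */
    if (atomic_compare_exchange_strong(&l->state, &c, 1))
        return;

    /* Slow path: mark the lock contended and sleep until it is released. */
    if (c != 2)
        c = atomic_exchange(&l->state, 2);
    while (c != 0) {
        futex(&l->state, FUTEX_WAIT, 2);
        c = atomic_exchange(&l->state, 2);
    }
}

static void sketch_lock_release(sketch_lock *l)
{
    int old = atomic_exchange(&l->state, 0);

    assert(old != 0);              /* releasing a free lock is a caller bug */
    if (old == 2)                  /* only wake through the kernel if needed */
        futex(&l->state, FUTEX_WAKE, 1);
}
```

With a single thread taking and dropping the lock, futex_calls stays at
zero, which is the "no contention" case described above.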

>
> Moreover, you could get clever and use mmap() to expose a ring queue 
> if you're really concerned about SMP.
>
> Really though, it comes down to one simple thing: blocking ioctl()s 
> are a real ugly interface.
>

I don't think they can be termed "blocking".

Most (all?) blocking calls offload work to some other device, like a 
disk or a network card, and sleep if that device has to do any 
processing.  They follow the same basic procedure:

- if data (or bufferspace) is available, read (or write) it
- otherwise, sleep
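
The two steps above can be sketched in userspace terms. This is only an
illustration using poll() on an ordinary file descriptor, not kernel code;
blocking_read_sketch and its slept flag are hypothetical names.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <poll.h>
#include <unistd.h>

/*
 * Hypothetical helper: read up to len bytes from fd.
 * *slept is set to 1 if the caller had to wait for data, 0 otherwise.
 */
static ssize_t blocking_read_sketch(int fd, void *buf, size_t len, int *slept)
{
    struct pollfd p = { .fd = fd, .events = POLLIN };
    int ready;

    assert(buf && slept);
    *slept = 0;

    /* Step 1: is data already buffered?  A zero timeout never sleeps. */
    ready = poll(&p, 1, 0);
    if (ready < 0)
        return -1;

    if (ready == 0) {
        /* Step 2: otherwise sleep until the other side produces data. */
        *slept = 1;
        if (poll(&p, 1, -1) < 0)
            return -1;
    }

    return read(fd, buf, len);
}
```

When data is already queued the call returns without sleeping, so no
context switch is required.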

But in this case the "other device" is the processor, so that model 
doesn't fit very well, as it *forces* a context switch.

Moreover, we need to both read and write, which ioctl()s allow in one 
call, while read()/write() require two system calls.

>>> If for nothing else, you have to be able to run timers in userspace 
>>> and interrupt the kernel execution (to signal DMA completion for 
>>> instance).  Even in the UP case, this gets ugly quickly.
>>>
>>
>> The timers aren't pretty (we use signals), yes.  But avoiding the 
>> extra thread is critical for performance IMO.
>
> We've had a lot of problems in QEMU with timers and kqemu.  Forcing 
> the guest to return to userspace to allow periodic timers to run 
> (which may simply be the VGA refresh which the guest doesn't care 
> about) is at best a hack.

You can also have an additional thread to do the periodic stuff.
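
A minimal sketch of such a helper thread, assuming POSIX threads; the
sketch_ticker structure and its fields are hypothetical, and the counter
stands in for whatever periodic work (for example a display refresh) is
needed. The vcpu thread is never interrupted by it.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <time.h>

/* Hypothetical state shared between the ticker thread and its owner. */
struct sketch_ticker {
    atomic_int  ticks;      /* number of periods elapsed so far */
    atomic_bool stop;       /* set by the owner to end the thread */
    long        period_ns;  /* tick period, must be below one second */
};

/* Thread body: sleep one period, do the periodic work, repeat. */
static void *sketch_ticker_main(void *arg)
{
    struct sketch_ticker *t = arg;
    struct timespec period = { .tv_sec = 0, .tv_nsec = t->period_ns };

    assert(t->period_ns > 0 && t->period_ns < 1000000000L);

    while (!atomic_load(&t->stop)) {
        nanosleep(&period, NULL);
        atomic_fetch_add(&t->ticks, 1);  /* stand-in for periodic work */
    }
    return NULL;
}
```

The owner starts it with pthread_create(), sets stop when done, and calls
pthread_join().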

>   Being able to poll an FD would make this so much nicer...
>
> I've posted some patches on qemu-devel attempting to deal with these 
> issues (look for threads on optimizing char device performance).  None 
> of them are very pretty.
>

Xen is different since you already have a context switch by going to 
domain 0.

-- 
error compiling committee.c: too many arguments to function
