lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <4A39415C.9060803@novell.com>
Date:	Wed, 17 Jun 2009 15:17:48 -0400
From:	Gregory Haskins <ghaskins@...ell.com>
To:	Davide Libenzi <davidel@...ilserver.org>
CC:	Gregory Haskins <ghaskins@...ell.com>,
	"Michael S. Tsirkin" <mst@...hat.com>, kvm@...r.kernel.org,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	avi@...hat.com, paulmck@...ux.vnet.ibm.com,
	Ingo Molnar <mingo@...e.hu>
Subject: Re: [KVM-RFC PATCH 1/2] eventfd: add an explicit srcu based notifier
 interface

Davide Libenzi wrote:
> On Wed, 17 Jun 2009, Gregory Haskins wrote:
>
>   
>> Can you elaborate?  I currently do not see how I could do the proposed
>> concept inside of irqfd while still using eventfd.  Of course, that
>> would be possible if we fork irqfd from eventfd,  and perhaps this is
>> what you are proposing.  As previously stated I don't want to give up on
>> the prospect of re-using it quite yet, so bear with me. :)
>>
>> The issue with eventfd, as I see it, is that eventfd uses a
>> spin_lock_irqsave (by virtue of the wait-queue stuff) across the
>> "signal" callback (which today is implemented as a wake-up).  This
>> spin_lock implicitly creates a non-preemptible critical section that
>> occurs independently of whether eventfd_signal() itself is invoked from
>> a sleepable context or not.
>>
>> What I strive to achieve is to remove the creation of this internal
>> critical section.  If eventfd_signal() is called from atomic context, so
>> be it.  We will detect this in the callback and be forced to take the
>> slow-path, and I am ok with that.  *But*, if eventfd_signal() (or
>> f_ops->write(), for that matter) are called from a sleepable context
>> *and* eventfd doesn't introduce its own critical section (such as with
>> my srcu patch), we can potentially optimize within the callback by
>> executing serially instead of deferring (e.g. via a workqueue).
>>     
>
> Since when the scheduling (assuming it's not permanently running on 
> another core due to high frequency work post) of a kernel thread is such 
> a big impact that interfaces need to be redesigned for that?
>   

Low-latency applications, for instance, care about this.  Even one
context switch can add a substantial overhead.

> How much the (possible, but not certain) kernel thread context switch time 
> weighs in the overall KVM IRQ service time?
>   

Generally each one is costing me about 7us on average.  For something
like high-speed networking, we have a path that has about 30us of
base-line overhead.  So one additional ctx-switch puts me at base+7 ( =
~37us), two puts me in base+2*7 (= ~44us).  So in that context (no pun
intended ;), it hurts quite a bit.  I'll be the first to admit that not
everyone (most?) will care about latency, though.  But FWIW, I do.

>
>
>   
>> It can!  :) This is not changing from whats in mainline today (covered
>> above).
>>     
>
> It can/could, if the signal() function takes very accurate care of doing 
> the magic atomic check.
>   

True, but thats the notifiee's burden, not eventfd's.  And its always
going to be opt-in.  Even today, someone is free to either try to sleep
(which will oops on the might_sleep()), or try to check if they can
sleep (it will always say they can't due to the eventfd wqh spinlock). 
The only thing that changes with my patch is that someone that opts in
to check if they can sleep may find that they sometimes (mostly?) can.

In any case, I suspect that there are no other clients of eventfd other
than standard wait-queue sleepers, and irqfd.  The wake_up() code is
never going to care anyway, so this really comes down to future users of
the notification interfaces (irqfd today, iosignalfd in the near-term). 
Therefore, there really aren't any legacy issues to deal with that I am
aware of, even if they mattered.

Thanks Davide,
-Greg


Download attachment "signature.asc" of type "application/pgp-signature" (267 bytes)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ