linux-kernel - Re: [PATCH -next 0/2] fs/epoll: loosen irq safety when possible

Message-ID: <20180720200559.27nc7j2rrxpy5p3n@linux-r8p5>
Date:   Fri, 20 Jul 2018 13:05:59 -0700
From:   Davidlohr Bueso <dave@...olabs.net>
To:     Andrew Morton <akpm@...ux-foundation.org>
Cc:     jbaron@...mai.com, viro@...iv.linux.org.uk,
        linux-kernel@...r.kernel.org,
        Peter Zijlstra <a.p.zijlstra@...llo.nl>
Subject: Re: [PATCH -next 0/2] fs/epoll: loosen irq safety when possible

On Fri, 20 Jul 2018, Andrew Morton wrote:

>On Fri, 20 Jul 2018 10:29:54 -0700 Davidlohr Bueso <dave@...olabs.net> wrote:
>
>> Hi,
>>
>> Both patches replace saving+restoring interrupts when taking the
>> ep->lock (now the waitqueue lock), with just disabling local irqs.
>> This shows immediate performance benefits in patch 1 for an epoll
>> workload running on Xen.
>
>I'm surprised.  Is spin_lock_irqsave() significantly more expensive
>than spin_lock_irq()?  Relative to all the other stuff those functions
>are doing?  If so, how come?  Some architectural thing makes
>local_irq_save() much more costly than local_irq_disable()?

For example, on x86, if you compare native_restore_fl() to xen_restore_fl(),
the Xen version is much more costly.

And at least in ep_scan_ready_list(), the lock is taken and released
twice, to deal with the ovflist while ep->wq.lock is not held. That is
enough to yield measurable results (see patch 1) across increasing
thread counts.
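
To make the tradeoff concrete, here is a rough userspace toy model (not
kernel code) of the two locking flavors. A single bool stands in for the
CPU interrupt-enable flag, and all of the model_*() and section_*() names
are made up for illustration. The irqsave flavor saves and restores the
caller's state; the irq flavor skips that work but unconditionally
re-enables on exit, so it is only correct when the caller knows that
interrupts were enabled on entry.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model: one flag stands in for the CPU's interrupt-enable bit. */
static bool irqs_enabled = true;

/* Models local_irq_save(): remember the old state, then disable. */
static bool model_irq_save(void)
{
	bool flags = irqs_enabled;

	irqs_enabled = false;
	return flags;
}

/* Models local_irq_restore(): put back whatever state was saved. */
static void model_irq_restore(bool flags)
{
	irqs_enabled = flags;
}

/* Model local_irq_disable()/local_irq_enable(): both are unconditional. */
static void model_irq_disable(void)
{
	irqs_enabled = false;
}

static void model_irq_enable(void)
{
	irqs_enabled = true;
}

/*
 * Critical section in the style of spin_lock_irqsave() and
 * spin_unlock_irqrestore(). Returns the interrupt state on exit.
 */
bool section_irqsave(bool enabled_on_entry)
{
	bool flags;

	irqs_enabled = enabled_on_entry;
	flags = model_irq_save();
	/* ... critical section would run here with irqs off ... */
	model_irq_restore(flags);
	return irqs_enabled;
}

/*
 * Critical section in the style of spin_lock_irq() and
 * spin_unlock_irq(). Returns the interrupt state on exit.
 */
bool section_irq(bool enabled_on_entry)
{
	irqs_enabled = enabled_on_entry;
	model_irq_disable();
	/* ... critical section would run here with irqs off ... */
	model_irq_enable();
	return irqs_enabled;
}
```

In this model section_irqsave(false) leaves interrupts disabled, as the
caller had them, while section_irq(false) leaves them enabled. That is
the case to rule out before converting a call site, and it is why
ep_poll_callback(), which often runs in irq context, is left alone.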

>
>> The main concern we need to have with this
>> sort of changes in epoll is the ep_poll_callback() which is passed
>> to the wait queue wakeup and is done very often under irq context,
>> this patch does not touch this call.
>
>Yeah, these changes are scary.  For the code as it stands now, and for
>the code as it evolves.

Yes, which is why I've been throwing lots of epoll workloads at it.

>
>I'd have more confidence if we had some warning mechanism if we run
>spin_lock_irq() when IRQs are disabled, which is probably-a-bug.  But
>afaict we don't have that.  Probably for good reasons - I wonder what
>they are?
>
>> Patches have been tested pretty heavily with the customer workload,
>> microbenchmarks, ltp testcases and two high level workloads that
>> use epoll under the hood: nginx and libevent benchmarks.
>>
>> Details are in the individual patches.
>>
>> Applies on top of mmotd.
>
>Please convince me about the performance benefits?

As for numbers, I only have them for patch 1.

Thanks,
Davidlohr