linux-kernel - Re: [PATCH -next 0/2] fs/epoll: loosen irq safety when possible

Message-Id: <20180720134429.1ba61018934b084bb2e17bdb@linux-foundation.org>
Date:   Fri, 20 Jul 2018 13:44:29 -0700
From:   Andrew Morton <akpm@...ux-foundation.org>
To:     Davidlohr Bueso <dave@...olabs.net>
Cc:     jbaron@...mai.com, viro@...iv.linux.org.uk,
        linux-kernel@...r.kernel.org,
        Peter Zijlstra <a.p.zijlstra@...llo.nl>
Subject: Re: [PATCH -next 0/2] fs/epoll: loosen irq safety when possible

On Fri, 20 Jul 2018 13:05:59 -0700 Davidlohr Bueso <dave@...olabs.net> wrote:

> On Fri, 20 Jul 2018, Andrew Morton wrote:
> 
> >On Fri, 20 Jul 2018 10:29:54 -0700 Davidlohr Bueso <dave@...olabs.net> wrote:
> >
> >> Hi,
> >>
> >> Both patches replace saving+restoring interrupts when taking the
> >> ep->lock (now the waitqueue lock), with just disabling local irqs.
> >> This shows immediate performance benefits in patch 1 for an epoll
> >> workload running on Xen.
> >
> >I'm surprised.  Is spin_lock_irqsave() significantly more expensive
> >than spin_lock_irq()?  Relative to all the other stuff those functions
> >are doing?  If so, how come?  Some architectural thing makes
> >local_irq_save() much more costly than local_irq_disable()?
> 
> For example, if you compare x86 native_restore_fl() to xen_restore_fl(),
> the cost on Xen is much higher.
> 
> And at least considering ep_scan_ready_list(), the lock is taken/released
> twice, to deal with the ovflist when the ep->wq.lock is not held. To the
> point that it yields measurable results (see patch 1) across incremental
> thread counts.

Did you try measuring it on bare hardware?

> >
> >> The main concern we need to have with this sort of change in
> >> epoll is ep_poll_callback(), which is passed to the wait queue
> >> wakeup and very often runs in irq context; this patch does not
> >> touch that call.
> >
> >Yeah, these changes are scary.  For the code as it stands now, and for
> >the code as it evolves.
> 
> Yes, which is why I've been throwing lots of epoll workloads at it.

I'm sure.  It's the "as it evolves" that is worrisome, and has caught
us in the past.
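
To make the "as it evolves" hazard concrete, here is a minimal userspace
sketch. It is a model only, not kernel code: every model_* name is
hypothetical, and a single bool stands in for the CPU interrupt-enable
flag. It assumes the usual semantics, where the irqsave/irqrestore pair
puts back whatever state it found and the irq/unlock_irq pair enables
interrupts unconditionally on unlock.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only: one simulated CPU interrupt-enable flag. */
static bool model_irqs_enabled = true;

/* Models spin_lock_irqsave(): remember the prior state, then disable. */
static void model_lock_irqsave(bool *flags)
{
	*flags = model_irqs_enabled;
	model_irqs_enabled = false;
}

/* Models spin_unlock_irqrestore(): put back whatever was saved. */
static void model_unlock_irqrestore(bool flags)
{
	model_irqs_enabled = flags;
}

/* Models spin_lock_irq(): disable unconditionally, save nothing. */
static void model_lock_irq(void)
{
	model_irqs_enabled = false;
}

/* Models spin_unlock_irq(): enable unconditionally. */
static void model_unlock_irq(void)
{
	model_irqs_enabled = true;
}

/*
 * An outer caller holds IRQs off and calls an inner section that uses
 * the irqsave variant. Returns the IRQ state the outer caller sees
 * right after the inner section finishes.
 */
static bool irqs_on_after_inner_irqsave(void)
{
	bool outer, inner, seen;

	model_irqs_enabled = true;
	model_lock_irqsave(&outer);	/* outer caller turns IRQs off */
	model_lock_irqsave(&inner);	/* inner section saves "off" */
	model_unlock_irqrestore(inner);	/* and restores "off" */
	seen = model_irqs_enabled;
	model_unlock_irqrestore(outer);
	assert(model_irqs_enabled);	/* back to the starting state */
	return seen;
}

/* Same outer caller, but the inner section uses the plain irq variant. */
static bool irqs_on_after_inner_irq(void)
{
	bool outer, seen;

	model_irqs_enabled = true;
	model_lock_irqsave(&outer);	/* outer caller turns IRQs off */
	model_lock_irq();
	model_unlock_irq();		/* IRQs come back on too early */
	seen = model_irqs_enabled;
	model_unlock_irqrestore(outer);
	return seen;
}
```

In this model the irqsave variant leaves the outer caller with
interrupts still off, while the irq variant hands control back with
interrupts on. The conversion is only safe while no caller reaches the
converted path with interrupts already disabled, and nothing in the
converted code enforces that for future callers.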

> >
> >I'd have more confidence if we had some warning mechanism if we run
> >spin_lock_irq() when IRQs are disabled, which is probably-a-bug.  But
> >afaict we don't have that.  Probably for good reasons - I wonder what
> >they are?

Well ignored ;)

We could open-code it locally.  Add a couple of
WARN_ON_ONCE(irqs_disabled())?  That might need re-benchmarking with
Xen but surely just reading the thing isn't too expensive?
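
A rough sketch of that open-coded check, again as a userspace model
rather than a patch. In the kernel it would presumably be a
WARN_ON_ONCE(irqs_disabled()) placed just before the spin_lock_irq() on
ep->wq.lock. Here MODEL_WARN_ON_ONCE, model_irqs_disabled() and
model_scan_ready_list() are hypothetical stand-ins, and the warning
counter exists only so the behaviour can be observed.

```c
#include <stdbool.h>
#include <stdio.h>

/* Userspace model only: simulated interrupt-enable flag. */
static bool model_irqs_enabled = true;
/* Number of warnings emitted so far (for observation only). */
static int model_warnings;

static bool model_irqs_disabled(void)
{
	return !model_irqs_enabled;
}

/*
 * Stand-in for WARN_ON_ONCE(): reports the first hit at this call
 * site and stays silent afterwards.
 */
#define MODEL_WARN_ON_ONCE(cond) do {				\
	static bool warned;					\
	if ((cond) && !warned) {				\
		warned = true;					\
		model_warnings++;				\
		fprintf(stderr, "WARNING: %s at %s:%d\n",	\
			#cond, __FILE__, __LINE__);		\
	}							\
} while (0)

/* Hypothetical converted path that now uses the plain irq variant. */
static void model_scan_ready_list(void)
{
	/* Entering with IRQs already off is probably a bug. */
	MODEL_WARN_ON_ONCE(model_irqs_disabled());

	model_irqs_enabled = false;	/* models spin_lock_irq() */
	/* ... critical section ... */
	model_irqs_enabled = true;	/* models spin_unlock_irq() */
}

/*
 * Call the converted path several times from a given entry state and
 * return the total number of warnings seen so far.
 */
static int warnings_after_calls(bool irqs_on_at_entry, int calls)
{
	for (int i = 0; i < calls; i++) {
		model_irqs_enabled = irqs_on_at_entry;
		model_scan_ready_list();
	}
	return model_warnings;
}
```

Calling the path with interrupts enabled stays silent. The first call
with interrupts disabled emits one warning and later ones are
suppressed, so the steady-state cost is a flag read and a branch.
Whether that read is cheap under Xen is the part that would still need
measuring.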