Message-ID: <gspf7guqczppgfrus5lfhinyl62xezc4h7nqcnd4m7243v4mna@hxmu2wousrh7>
Date: Wed, 25 Dec 2024 11:32:24 -0500
From: Kent Overstreet <kent.overstreet@...ux.dev>
To: Mateusz Guzik <mjguzik@...il.com>
Cc: Andy Shevchenko <andriy.shevchenko@...ux.intel.com>,
WangYuli <wangyuli@...ontech.com>, viro@...iv.linux.org.uk, brauner@...nel.org, jack@...e.cz,
linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org, yushengjin@...ontech.com,
zhangdandan@...ontech.com, guanwentao@...ontech.com, zhanjun@...ontech.com,
oliver.sang@...el.com, ebiederm@...ssion.com, colin.king@...onical.com,
josh@...htriplett.org, penberg@...helsinki.fi, manfred@...orfullife.com, mingo@...e.hu,
jes@....com, hch@....de, aia21@...tab.net, arjan@...radead.org,
jgarzik@...ox.com, neukum@...hschaft.cup.uni-muenchen.de, oliver@...kum.name,
dada1@...mosbay.com, axboe@...nel.dk, axboe@...e.de, nickpiggin@...oo.com.au,
dhowells@...hat.com, nathans@....com, rolandd@...co.com, tytso@....edu,
bunk@...sta.de, pbadari@...ibm.com, ak@...ux.intel.com, ak@...e.de,
davem@...emloft.net, jsipek@...sunysb.edu
Subject: Re: [RESEND PATCH] fs/pipe: Introduce a check to skip sleeping
processes during pipe read/write
On Wed, Dec 25, 2024 at 05:04:46PM +0100, Mateusz Guzik wrote:
> On Wed, Dec 25, 2024 at 08:53:05AM -0500, Kent Overstreet wrote:
> > On Wed, Dec 25, 2024 at 03:30:05PM +0200, Andy Shevchenko wrote:
> > > Don't you think the Cc list is a bit overloaded?
> >
> > Indeed, my mail server doesn't let me reply-all.
> >
> > > On Wed, Dec 25, 2024 at 05:42:02PM +0800, WangYuli wrote:
> > > > +config PIPE_SKIP_SLEEPER
> > > > +	bool "Skip sleeping processes during pipe read/write"
> > > > +	default n
> > >
> > > 'n' is the default 'default', no need to have this line.
> >
> > Actually, I'd say to skip the kconfig option for this. Kconfig options
> > that affect the behaviour of core code increase our testing burden, and
> > are another variable to account for when chasing down bugs, and the
> > potential overhead looks negligible.
> >
>
> I agree the behavior should not be guarded by an option. However,
> because of how wq_has_sleeper is implemented (see below) I would argue
> this needs to show how often locking can be avoided in real workloads.
>
> The commit message does state this comes with a slowdown for cases which
> can't avoid wakeups, but as written I thought the submitter merely meant
> an extra branch.
>
> > Also, did you look at adding this optimization to wake_up()? No-op
> > wakeups are very common, I think this has wider applicability.
>
> I was going to suggest it myself, but then:
>
> static inline bool wq_has_sleeper(struct wait_queue_head *wq_head)
> {
> 	/*
> 	 * We need to be sure we are in sync with the
> 	 * add_wait_queue modifications to the wait queue.
> 	 *
> 	 * This memory barrier should be paired with one on the
> 	 * waiting side.
> 	 */
> 	smp_mb();
> 	return waitqueue_active(wq_head);
> }
>
> Which means this is in fact quite expensive.
>
> Since a real wakeup is a lock plus an interrupt trip, it would still be
> cheaper single-threaded to "merely" suffer a full fence, and for cases
> where the queue is empty often enough this is definitely the right thing
> to do.
We're comparing against a no-op wakeup. A real wakeup does an IPI, which
completely dwarfs the cost of a barrier.
And note that wake_up() takes spin_lock_irqsave(), not spin_lock(). I
assume it's gotten better, but back when I was looking at waitqueues,
nested pushf/popf was horrifically expensive.
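
For reference, every wake_up() funnels into roughly this shape (a loose
paraphrase of kernel/sched/wait.c, not the exact signature or source):

	/* the shape of __wake_up(), loosely paraphrased */
	void __wake_up(struct wait_queue_head *wq_head, unsigned int mode,
		       int nr_exclusive, void *key)
	{
		unsigned long flags;

		/* local_irq_save(): pushf + cli on x86 */
		spin_lock_irqsave(&wq_head->lock, flags);
		/* walk the wait list; a real wakeup ends in an IPI */
		__wake_up_common(wq_head, mode, nr_exclusive, 0, key);
		/* popf: restore the saved interrupt state */
		spin_unlock_irqrestore(&wq_head->lock, flags);
	}

So even a no-op wake_up() pays for the interrupt-state round trip on
top of the lock.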
But perhaps we can do this with just a release barrier? Similar to how
list_empty_careful() works.
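
Something like this, purely as a hypothetical sketch (the name
wq_has_sleeper_careful() is made up, and whether the weaker ordering is
actually safe here is exactly what would need proving):

	/*
	 * Check for waiters without the full smp_mb(), in the spirit of
	 * list_empty_careful(): that helper reads both head->next and
	 * head->prev, so it tolerates a concurrent add/del without
	 * taking the waitqueue lock.
	 */
	static inline bool wq_has_sleeper_careful(struct wait_queue_head *wq_head)
	{
		return !list_empty_careful(&wq_head->head);
	}

The open question is ordering: list_empty_careful() alone doesn't order
the writer's earlier stores (the pipe buffer update) against the
waiter's add_wait_queue(), which is the race the smp_mb() in
wq_has_sleeper() exists to close.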
> On the other hand, if this executes mostly when the queue is *not*
> empty, that would defeat the point.
>
> So unfortunately embedding this in wake_up is a no-go.
You definitely can't say that without knowing how often no-op
wake_up()s occur. It wouldn't be hard to gather that (write a patch to
add a pair of percpu counters, throw it on a few machines running random
workloads), and I think the results might surprise you.
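
Something along these lines would do (a hypothetical sketch, not a
tested patch; the names are made up):

	/* one pair of counters per CPU, summed when read out */
	DEFINE_PER_CPU(unsigned long, wakeup_total);
	DEFINE_PER_CPU(unsigned long, wakeup_noop);

	/* call at the top of __wake_up(), before taking the lock */
	static inline void account_wakeup(struct wait_queue_head *wq_head)
	{
		this_cpu_inc(wakeup_total);
		if (!waitqueue_active(wq_head))
			this_cpu_inc(wakeup_noop);
	}

Dump the per-cpu sums through debugfs (or just a periodic printk) and
compare wakeup_noop against wakeup_total across workloads.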