Message-ID: <CAGudoHG66-gnSkyAQ4O-ez09ef-0WuvrXah-nu_j2DEXKFsDag@mail.gmail.com>
Date: Tue, 4 Mar 2025 16:25:37 +0100
From: Mateusz Guzik <mjguzik@...il.com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: torvalds@...ux-foundation.org, oleg@...hat.com, brauner@...nel.org,
mingo@...hat.com, rostedt@...dmis.org, linux-kernel@...r.kernel.org,
linux-fsdevel@...r.kernel.org
Subject: Re: [PATCH 3/3] wait: avoid spurious calls to prepare_to_wait_event()
in ___wait_event()
On Tue, Mar 4, 2025 at 3:19 PM Peter Zijlstra <peterz@...radead.org> wrote:
>
> On Tue, Mar 04, 2025 at 12:04:09AM +0100, Mateusz Guzik wrote:
> > In the vast majority of cases the condition determining whether the thread
> > can proceed is true after the first wakeup.
> >
> > However, even in that case the thread ends up calling into
> > prepare_to_wait_event() again, suffering a spurious irq + lock trip.
> >
> > Then it calls into finish_wait() to unlink itself.
> >
> > Note that in case of a pending signal the work done by
> > prepare_to_wait_event() gets ignored even without the change.
> >
> > Pre-check the condition after waking up instead.
> >
> > Stats gathered during a kernel build:
> > bpftrace -e 'kprobe:prepare_to_wait_event,kprobe:finish_wait \
> > { @[probe] = count(); }'
> >
> > @[kprobe:finish_wait]: 392483
> > @[kprobe:prepare_to_wait_event]: 778690
> >
> > That is, calls to prepare_to_wait_event() almost double the calls to
> > finish_wait(). This evens out with the patch.
> >
> > Signed-off-by: Mateusz Guzik <mjguzik@...il.com>
> > ---
> >
> > One may worry about using "condition" twice. However, macros leading up
> > to this one already do it, so it should be fine.
> >
> > Also one may wonder about fences -- to my understanding going off and on
> > CPU guarantees a full fence, so the now avoided lock trip changes
> > nothing.
>
> So it always helps if you provide actual numbers. Supposedly this makes
> it go faster?
>
> Also, how many bytes does it add to the image?
>
> Anyway, no real objection, but it would be good to have better
> substantiation etc.
>
I did not bother benchmarking this per se, but the bpftrace stats above
do demonstrate that the patch avoids the irq and lock usage, which is
normally the desired outcome.
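
For reference, here is roughly where the pre-check sits in the
___wait_event() loop. This is only a simplified sketch, not the exact
macro body from include/linux/wait.h (the wait entry init and the
__out label plumbing are omitted):

/*
 * Simplified sketch of the ___wait_event() loop; the setup around it
 * is omitted, condition/cmd/state are the macro arguments.
 */
for (;;) {
	/* takes wq_head->lock with interrupts disabled */
	long __int = prepare_to_wait_event(&wq_head, &__wq_entry, state);

	if (condition)
		break;

	if (___wait_is_interruptible(state) && __int) {
		__ret = __int;
		goto __out;
	}

	cmd;	/* typically schedule(); execution resumes here on wakeup */

	/*
	 * Pre-check after waking up (what the patch adds): if the
	 * condition already holds, which is the common case per the
	 * bpftrace counts, break out and go straight to finish_wait()
	 * instead of paying another irq + lock trip in
	 * prepare_to_wait_event().
	 */
	if (condition)
		break;
}
finish_wait(&wq_head, &__wq_entry);

With the common case breaking out at the added check, a wait ends up
with one prepare_to_wait_event() per finish_wait(), which is what
evens out the counts above.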
It is a fair point that in this case this is a tradeoff with i-cache footprint.
bloat-o-meter claims:
add/remove: 2/2 grow/shrink: 159/5 up/down: 3865/-250 (3615)
[..]
Total: Before=29946099, After=29949714, chg +0.01%
The biggest growth, by a large margin (old size, new size, delta):
will_read_block.part       -    183   +183
fuse_get_req             800    924   +124
load_module             9137   9255   +118
fuse_do_readfolio        415    528   +113
fuse_page_mkwrite        157    259   +102
On the other hand, the shrinkage:
port_fops_read                        504    503     -1
__ps2_command                        1402   1401     -1
fcntl_setlk                           961    947    -14
__pfx_fuse_wait_on_folio_writeback     16      -    -16
load_compressed_image                4007   3974    -33
uprobe_notify_resume                 3384   3323    -61
fuse_wait_on_folio_writeback          124      -   -124
The entire thing is attached.
--
Mateusz Guzik <mjguzik gmail.com>