Message-ID: <CAGudoHG66-gnSkyAQ4O-ez09ef-0WuvrXah-nu_j2DEXKFsDag@mail.gmail.com>
Date: Tue, 4 Mar 2025 16:25:37 +0100
From: Mateusz Guzik <mjguzik@...il.com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: torvalds@...ux-foundation.org, oleg@...hat.com, brauner@...nel.org,
mingo@...hat.com, rostedt@...dmis.org, linux-kernel@...r.kernel.org,
linux-fsdevel@...r.kernel.org
Subject: Re: [PATCH 3/3] wait: avoid spurious calls to prepare_to_wait_event()
in ___wait_event()
On Tue, Mar 4, 2025 at 3:19 PM Peter Zijlstra <peterz@...radead.org> wrote:
>
> On Tue, Mar 04, 2025 at 12:04:09AM +0100, Mateusz Guzik wrote:
> > In the vast majority of cases the condition determining whether the thread
> > can proceed is true after the first wakeup.
> >
> > However, even in that case the thread ends up calling into
> > prepare_to_wait_event() again, suffering a spurious irq + lock trip.
> >
> > Then it calls into finish_wait() to unlink itself.
> >
> > Note that in case of a pending signal the work done by
> > prepare_to_wait_event() gets ignored even without the change.
> >
> > Pre-check the condition after waking up instead.
> >
> > Stats gathered during a kernel build:
> > bpftrace -e 'kprobe:prepare_to_wait_event,kprobe:finish_wait \
> > { @[probe] = count(); }'
> >
> > @[kprobe:finish_wait]: 392483
> > @[kprobe:prepare_to_wait_event]: 778690
> >
> > That is, calls to prepare_to_wait_event() almost double the calls to
> > finish_wait(). This evens out with the patch.
> >
> > Signed-off-by: Mateusz Guzik <mjguzik@...il.com>
> > ---
> >
> > One may worry about using "condition" twice. However, macros leading up
> > to this one already do it, so it should be fine.
> >
> > Also one may wonder about fences -- to my understanding going off and on
> > CPU guarantees a full fence, so the now avoided lock trip changes
> > nothing.
>
> So it always helps if you provide actual numbers. Supposedly this makes
> it go faster?
>
> Also, how many bytes does it add to the image?
>
> Anyway, no real objection, but it would be good to have better
> substantiation etc.
>
I did not bother benchmarking this per se, but the bpftrace stats above
do demonstrate that the patch avoids the irq and lock usage, which is
normally the desired outcome.
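
For reference, here is roughly where the pre-check sits in the
___wait_event() loop. This is only a simplified sketch, not the exact
macro body from include/linux/wait.h (the wait entry init and the
__out label plumbing are omitted):

/*
 * Simplified sketch of the ___wait_event() loop; the setup around it
 * is omitted, condition/cmd/state are the macro arguments.
 */
for (;;) {
	/* takes wq_head->lock with interrupts disabled */
	long __int = prepare_to_wait_event(&wq_head, &__wq_entry, state);

	if (condition)
		break;

	if (___wait_is_interruptible(state) && __int) {
		__ret = __int;
		goto __out;
	}

	cmd;	/* typically schedule(); execution resumes here on wakeup */

	/*
	 * Pre-check after waking up (what the patch adds): if the
	 * condition already holds, which is the common case per the
	 * bpftrace counts, break out and go straight to finish_wait()
	 * instead of paying another irq + lock trip in
	 * prepare_to_wait_event().
	 */
	if (condition)
		break;
}
finish_wait(&wq_head, &__wq_entry);

With the common case breaking out at the added check, a wait ends up
with one prepare_to_wait_event() per finish_wait(), which is what
evens out the counts above.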
It is a fair point that in this case this is a tradeoff with i-cache footprint.
bloat-o-meter claims:
add/remove: 2/2 grow/shrink: 159/5 up/down: 3865/-250 (3615)
[..]
Total: Before=29946099, After=29949714, chg +0.01%
The biggest growth, by a large margin (old size, new size, delta):
will_read_block.part       -    183   +183
fuse_get_req             800    924   +124
load_module             9137   9255   +118
fuse_do_readfolio        415    528   +113
fuse_page_mkwrite        157    259   +102
On the other hand, the shrinkage:
port_fops_read                        504    503     -1
__ps2_command                        1402   1401     -1
fcntl_setlk                           961    947    -14
__pfx_fuse_wait_on_folio_writeback     16      -    -16
load_compressed_image                4007   3974    -33
uprobe_notify_resume                 3384   3323    -61
fuse_wait_on_folio_writeback          124      -   -124
The entire thing is attached.
--
Mateusz Guzik <mjguzik gmail.com>