Message-ID: <856ed55d-b07b-499c-b340-2efa70c73f7a@gmail.com>
Date: Wed, 29 Jan 2025 18:57:00 +0000
From: Pavel Begunkov <asml.silence@...il.com>
To: Max Kellermann <max.kellermann@...os.com>, axboe@...nel.dk,
io-uring@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 4/8] io_uring/io-wq: cache work->flags in variable

On 1/28/25 13:39, Max Kellermann wrote:
> This eliminates several redundant atomic reads and therefore reduces
> the duration the surrounding spinlocks are held.

What architecture are you running? I don't get why the reads are
expensive while they're relaxed and there shouldn't even be any
contention. It doesn't even need to be atomics; we should still be
able to convert it back to plain ints.
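
For reference, a minimal userspace sketch of the pattern the patch is
about: one relaxed load of the flags word cached in a local, instead of
re-reading the atomic for every bit test. The struct, flag and function
names below are illustrative only, not the actual io-wq code.

/* cache a relaxed atomic load of a flags word in a local int and test
 * the cached copy, rather than issuing a fresh atomic read per test */
#include <stdatomic.h>
#include <stdio.h>

#define WORK_F_HASHED	(1u << 0)
#define WORK_F_CANCEL	(1u << 1)

struct work {
	atomic_uint flags;	/* stands in for io_wq_work->flags */
};

static void handle_work(struct work *w)
{
	/* single relaxed load; subsequent tests use the local copy */
	unsigned int flags = atomic_load_explicit(&w->flags,
						  memory_order_relaxed);

	if (flags & WORK_F_HASHED)
		printf("hashed\n");
	if (flags & WORK_F_CANCEL)
		printf("cancelled\n");
}

int main(void)
{
	struct work w;

	atomic_init(&w.flags, WORK_F_HASHED);
	handle_work(&w);
	return 0;
}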
> In several io_uring benchmarks, this reduced the CPU time spent in
> queued_spin_lock_slowpath() considerably:
>
> io_uring benchmark with a flood of `IORING_OP_NOP` and `IOSQE_ASYNC`:
>
> 38.86% -1.49% [kernel.kallsyms] [k] queued_spin_lock_slowpath
> 6.75% +0.36% [kernel.kallsyms] [k] io_worker_handle_work
> 2.60% +0.19% [kernel.kallsyms] [k] io_nop
> 3.92% +0.18% [kernel.kallsyms] [k] io_req_task_complete
> 6.34% -0.18% [kernel.kallsyms] [k] io_wq_submit_work
>
> HTTP server, static file:
>
> 42.79% -2.77% [kernel.kallsyms] [k] queued_spin_lock_slowpath
> 2.08% +0.23% [kernel.kallsyms] [k] io_wq_submit_work
> 1.19% +0.20% [kernel.kallsyms] [k] amd_iommu_iotlb_sync_map
> 1.46% +0.15% [kernel.kallsyms] [k] ep_poll_callback
> 1.80% +0.15% [kernel.kallsyms] [k] io_worker_handle_work
>
> HTTP server, PHP:
>
> 35.03% -1.80% [kernel.kallsyms] [k] queued_spin_lock_slowpath
> 0.84% +0.21% [kernel.kallsyms] [k] amd_iommu_iotlb_sync_map
> 1.39% +0.12% [kernel.kallsyms] [k] _copy_to_iter
> 0.21% +0.10% [kernel.kallsyms] [k] update_sd_lb_stats
>
> Signed-off-by: Max Kellermann <max.kellermann@...os.com>
--
Pavel Begunkov