Message-ID: <f60116a5-8c35-4389-bbb6-7bf6deaf71c6@gmail.com>
Date: Thu, 24 Oct 2024 15:26:46 +0100
From: Pavel Begunkov <asml.silence@...il.com>
To: Jens Axboe <axboe@...nel.dk>, hexue <xue01.he@...sung.com>
Cc: io-uring@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v8] io_uring: releasing CPU resources when polling
On 10/24/24 15:18, Jens Axboe wrote:
> On 10/23/24 8:38 PM, hexue wrote:
>> On 9/25/2024 12:12, Pavel Begunkov wrote:
...
>> When the number of threads exceeds the number of CPU cores, the
>> database throughput does not increase significantly. However,
>> hybrid polling can release some CPU resources during the polling
>> process, so that part of the CPU time can be used for frequent
>> data processing and other operations, which speeds up the reading
>> process, thereby improving throughput and optimizing database
>> performance. I tried different compression strategies and got
>> results similar to the above table (~30% throughput improvement).
>>
>> As more database applications adapt to the io_uring engine, I think
>> the application of hybrid poll may have potential in some scenarios.
>
> Thanks for posting some numbers on that part, that's useful. I do
> think the feature is useful as well, but I still have some issues
> with the implementation. Below is an incremental patch on top of
> yours to resolve some of those, potentially. Issues:
>
> 1) The patch still reads a bit like a hack, in that it doesn't seem to
> care about following the current style. This reads a bit lazy/sloppy
> or unfinished. I've fixed that up.
>
> 2) Appropriate member and function naming.
>
> 3) Same as above, it doesn't care about proper placement of additions to
> structs. Again this is a bit lazy and wasteful; attention should be
> paid to where additions are placed so as not to needlessly bloat
> structures, or place members in cache-unfortunate locations. For
> example, available_time is just placed at the end of io_ring_ctx,
> why? It's a submission side member, and there's room with other
> related members. Not only is the placement now where you'd want it to
> be, memory wise, it also doesn't add 8 bytes to io_ring_ctx.
>
> 4) Like the above, the io_kiocb bloat is, by far, the worst. Seems to me
> that we can share space with the polling hash_node. This obviously
> needs careful vetting, haven't done that yet. IOPOLL setups should
> not be using poll at all. This needs extra checking. The poll_state
> can go with cancel_seq_set, as there's a hole there anyway. And just
> like that, rather than add 24b to io_kiocb, it doesn't take any extra
> space at all.
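To make the hole-sharing point concrete, here is a minimal userspace sketch (made-up struct and member names, not the real io_kiocb layout): a bool placed next to an existing bool lands in alignment padding for free, while the same bool appended after an 8-byte member grows the struct.

```c
#include <stdbool.h>
#include <stdint.h>

/* bool next to an existing bool: it lands in the padding hole
 * before the 8-byte-aligned member, so the struct doesn't grow. */
struct in_hole {
	void *ptr;
	bool cancel_seq_set;
	bool poll_state;	/* fills one byte of the existing hole */
	uint64_t big;
};

/* same bool appended at the end: the struct's 8-byte alignment
 * forces a whole new 8-byte chunk for that single byte. */
struct at_end {
	void *ptr;
	bool cancel_seq_set;
	uint64_t big;
	bool poll_state;
};
```

On a typical LP64 ABI that's 24 bytes for the first layout vs 32 for the second; `pahole` shows exactly these holes on real kernel structs.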
>
> 5) HY_POLL is a terrible name. It's associated with IOPOLL, and so let's
> please use a name related to that. And require IOPOLL being set with
> HYBRID_IOPOLL, as it's really a variant of that. Makes it clear that
> HYBRID_IOPOLL is really just a mode of operation for IOPOLL, and it
> can't exist without that.
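The "require IOPOLL being set" part would presumably be a setup-time check; a hedged userspace sketch of that gist (the HYBRID_IOPOLL flag name and bit value here are illustrative for this thread, not taken from a merged uapi):

```c
#include <errno.h>
#include <stdint.h>

/* real uapi flag */
#define IORING_SETUP_IOPOLL		(1U << 0)
/* hypothetical bit for the hybrid mode discussed in this thread */
#define IORING_SETUP_HYBRID_IOPOLL	(1U << 17)

/* Reject HYBRID_IOPOLL unless IOPOLL is also set: hybrid poll is
 * a mode of operation for IOPOLL and can't exist without it. */
static int validate_setup_flags(uint32_t flags)
{
	if ((flags & IORING_SETUP_HYBRID_IOPOLL) &&
	    !(flags & IORING_SETUP_IOPOLL))
		return -EINVAL;
	return 0;
}
```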
>
> Please take a look at this incremental and test it, and then post a v9
> that looks a lot more finished. Caveat - I haven't tested this one at
> all. Thanks!
>
> diff --git a/include/linux/io_uring_types.h b/include/linux/io_uring_types.h
> index c79ee9fe86d4..6cf6a45835e5 100644
> --- a/include/linux/io_uring_types.h
> +++ b/include/linux/io_uring_types.h
> @@ -238,6 +238,8 @@ struct io_ring_ctx {
> struct io_rings *rings;
> struct percpu_ref refs;
>
> + u64 poll_available_time;
> +
> clockid_t clockid;
> enum tk_offsets clock_offset;
>
> @@ -433,9 +435,6 @@ struct io_ring_ctx {
> struct page **sqe_pages;
>
> struct page **cq_wait_page;
> -
> - /* for io_uring hybrid poll*/
> - u64 available_time;
> };
>
> struct io_tw_state {
> @@ -647,9 +646,22 @@ struct io_kiocb {
>
> atomic_t refs;
> bool cancel_seq_set;
> + bool poll_state;
As mentioned briefly before, that can be just a req->flags flag.
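i.e. rather than a dedicated bool, track the state as one bit in the existing flags word. A userspace sketch of the idea (the REQ_F_* name and bit position are made up, in the style of the kernel's request flags):

```c
#include <stdbool.h>
#include <stdint.h>

/* hypothetical flag bit, in the style of the kernel's REQ_F_* flags */
#define REQ_F_HYBRID_IOPOLL_STATE	(1ULL << 45)

struct fake_req {
	uint64_t flags;	/* stands in for io_kiocb->flags */
};

/* One bit in the existing flags word instead of a bool that
 * takes its own byte (plus whatever padding it drags in). */
static void mark_poll_started(struct fake_req *req)
{
	req->flags |= REQ_F_HYBRID_IOPOLL_STATE;
}

static bool poll_started(const struct fake_req *req)
{
	return req->flags & REQ_F_HYBRID_IOPOLL_STATE;
}
```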
> struct io_task_work io_task_work;
> - /* for polled requests, i.e. IORING_OP_POLL_ADD and async armed poll */
> - struct hlist_node hash_node;
> + union {
> + /*
> + * for polled requests, i.e. IORING_OP_POLL_ADD and async armed
> + * poll
> + */
> + struct hlist_node hash_node;
> + /*
> + * For IOPOLL setup queues, with hybrid polling
> + */
> + struct {
> + u64 iopoll_start;
> + u64 iopoll_end;
And IIRC it doesn't need to store the end as it's used immediately
after it's set in the same function.
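i.e. the end timestamp can just be a local at the point of use, so only the start needs to live in the request. A hedged userspace sketch (ktime_get_ns() stood in for by clock_gettime(); the names are illustrative):

```c
#define _POSIX_C_SOURCE 199309L
#include <stdint.h>
#include <time.h>

struct hybrid_req {
	uint64_t iopoll_start;	/* only the start lives in the request */
};

/* userspace stand-in for the kernel's ktime_get_ns() */
static uint64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ULL + ts.tv_nsec;
}

/* The end timestamp is consumed immediately after it is taken,
 * so it can be a local instead of another u64 in the struct. */
static uint64_t iopoll_elapsed_ns(const struct hybrid_req *req)
{
	uint64_t end = now_ns();	/* local, not req->iopoll_end */

	return end - req->iopoll_start;
}
```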
> + };
> + };
> /* internal polling, see IORING_FEAT_FAST_POLL */
> struct async_poll *apoll;
> /* opcode allocated if it needs to store data for async defer */
> @@ -665,10 +677,6 @@ struct io_kiocb {
> u64 extra1;
> u64 extra2;
> } big_cqe;
> - /* for io_uring hybrid iopoll */
> - bool poll_state;
> - u64 iopoll_start;
> - u64 iopoll_end;
> };
>
...
--
Pavel Begunkov