Open Source and information security mailing list archives
Message-ID: <273f154a-4cbd-2412-d056-a31fab5368d3@gmail.com>
Date: Fri, 21 Oct 2022 12:09:09 +0100
From: Pavel Begunkov <asml.silence@...il.com>
To: Stefan Metzmacher <metze@...ba.org>, io-uring <io-uring@...r.kernel.org>,
	Jens Axboe <axboe@...nel.dk>
Cc: Jakub Kicinski <kuba@...nel.org>, netdev <netdev@...r.kernel.org>,
	Dylan Yudaken <dylany@...com>
Subject: Re: IORING_SEND_NOTIF_REPORT_USAGE (was Re: IORING_CQE_F_COPIED)

On 10/21/22 10:36, Stefan Metzmacher wrote:
> Hi Pavel,
[...]
>> Right, I'm just tired of back porting patches by hand :)
>
> ok, I just assumed it would be 6.1 only.

I'm fine with 6.1 only, it'd make things easier. I thought from your
first postings you wanted it in 6.0. Then we don't need to care about
the placement of the copied/used flags.

>>> Otherwise we could have IORING_CQE_F_COPIED by default without opt-in
>>> flag...
>
> Do you still want an opt-in flag to get IORING_CQE_F_COPIED?
> If so what name do you want it to be? Ala a IORING_SEND_* flag?

Yes please. *_REPORT_USAGE was fine but I'd make it
IORING_SEND_ZC_REPORT_USAGE. It can also be extended if more info is
needed in the future. And I don't mind using a bit in cqe->res, it
makes cflags less polluted.

>>>>> +static void io_uring_tx_zerocopy_callback_report_usage(struct sk_buff *skb,
>>>>> +						       struct ubuf_info *uarg,
>>>>> +						       bool success)
>>>>> +{
>>>>> +	struct io_notif_data *nd = container_of(uarg, struct io_notif_data, uarg);
>>>>> +
>>>>> +	if (success && !nd->zc_used && skb)
>>>>> +		nd->zc_used = true;
>>>>> +	else if (unlikely(!success && !nd->zc_copied))
>>>>> +		nd->zc_copied = true;
>>>>
>>>> It's fine but racy, so let's WRITE_ONCE() to indicate it.
>>>
>>> I don't see how this could be a problem, but I can add it.
>>
>> It's not a problem, but better to be a little bit more explicit
>> about parallel writes.
>
> ok.
>>>>> diff --git a/io_uring/notif.h b/io_uring/notif.h
>>>>> index 5b4d710c8ca5..5ac7a2745e52 100644
>>>>> --- a/io_uring/notif.h
>>>>> +++ b/io_uring/notif.h
>>>>> @@ -13,10 +13,12 @@ struct io_notif_data {
>>>>>  	struct file		*file;
>>>>>  	struct ubuf_info	uarg;
>>>>>  	unsigned long		account_pages;
>>>>> +	bool			zc_used;
>>>>> +	bool			zc_copied;
>>>>
>>>> IIRC io_notif_data is fully packed in 6.1, so placing zc_{used,copied}
>>>> there might complicate backporting (if any). We can place them in io_kiocb
>>>> directly and move in 6.2. Alternatively account_pages doesn't have to be
>>>> long.
>>>
>>> As far as I can see kernel-dk-block/io_uring-6.1 already has your
>>> shrink patches included...
>>
>> Sorry, I meant 6.0

> So you want to backport to 6.0?
>
> Find the current version below, sizeof(struct io_kiocb) will grow from
> 3*64 + 24 to 3*64 + 32 (on x86_64) so it stays within 4 cache lines.
>
> I tried this first:
>
> 	union {
> 		u8 iopoll_completed;
> 		struct {
> 			u8 zc_used:1;
> 			u8 zc_copied:1;
> 		};
> 	};
>
> But then WRITE_ONCE() complains about a bitfield write.

Rightfully so, it can't be a bitfield as it would be racy, and not only
in theory this time.

> So let me know about the opt-in flag and I'll prepare real commits
> including a patch that moves from struct io_kiocb to struct io_notif_data
> on top.

Yeah, better to be opt-in, but apart from that and the comments above it
looks good.

> diff --git a/include/linux/io_uring_types.h b/include/linux/io_uring_types.h
> index f5b687a787a3..189152ad78d6 100644
> --- a/include/linux/io_uring_types.h
> +++ b/include/linux/io_uring_types.h
> @@ -515,6 +515,9 @@ struct io_kiocb {
>  	u8				opcode;
>  	/* polled IO has completed */
>  	u8				iopoll_completed;
> +	/* these will be moved to struct io_notif_data in 6.1 */
> +	bool				zc_used;
> +	bool				zc_copied;
>  	/*
>  	 * Can be either a fixed buffer index, or used with provided buffers.
>  	 * For the latter, before issue it points to the buffer group ID,
> diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
> index ab7458033ee3..738d6234d1d9 100644
> --- a/include/uapi/linux/io_uring.h
> +++ b/include/uapi/linux/io_uring.h
> @@ -350,6 +350,7 @@ struct io_uring_cqe {
>  #define IORING_CQE_F_MORE		(1U << 1)
>  #define IORING_CQE_F_SOCK_NONEMPTY	(1U << 2)
>  #define IORING_CQE_F_NOTIF		(1U << 3)
> +#define IORING_CQE_F_COPIED		(1U << 4)
>
>  enum {
>  	IORING_CQE_BUFFER_SHIFT		= 16,
> diff --git a/io_uring/notif.c b/io_uring/notif.c
> index e37c6569d82e..033aca064b10 100644
> --- a/io_uring/notif.c
> +++ b/io_uring/notif.c
> @@ -18,6 +18,10 @@ static void __io_notif_complete_tw(struct io_kiocb *notif, bool *locked)
>  		__io_unaccount_mem(ctx->user, nd->account_pages);
>  		nd->account_pages = 0;
>  	}
> +
> +	if (notif->zc_copied || !notif->zc_used)
> +		notif->cqe.flags |= IORING_CQE_F_COPIED;
> +

As discussed above, this should depend on IORING_SEND_ZC_REPORT_USAGE.

>  	io_req_task_complete(notif, locked);
>  }
>
> @@ -28,6 +32,11 @@ static void io_uring_tx_zerocopy_callback(struct sk_buff *skb,
>  	struct io_notif_data *nd = container_of(uarg, struct io_notif_data, uarg);
>  	struct io_kiocb *notif = cmd_to_io_kiocb(nd);
>
> +	if (success && !notif->zc_used && skb)
> +		WRITE_ONCE(notif->zc_used, true);
> +	else if (!success && !notif->zc_copied)
> +		WRITE_ONCE(notif->zc_copied, true);
> +
>  	if (refcount_dec_and_test(&uarg->refcnt)) {
>  		notif->io_task_work.func = __io_notif_complete_tw;
>  		io_req_task_work_add(notif);
> @@ -55,6 +64,7 @@ struct io_kiocb *io_alloc_notif(struct io_ring_ctx *ctx)
>  	nd->account_pages = 0;
>  	nd->uarg.flags = SKBFL_ZEROCOPY_FRAG | SKBFL_DONT_ORPHAN;
>  	nd->uarg.callback = io_uring_tx_zerocopy_callback;
> +	notif->zc_used = notif->zc_copied = false;
>  	refcount_set(&nd->uarg.refcnt, 1);
>  	return notif;
>  }

-- 
Pavel Begunkov