lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Tue, 28 Jul 2020 00:46:28 +0530
From:   Kanchan Joshi <joshiiitr@...il.com>
To:     Jens Axboe <axboe@...nel.dk>
Cc:     Kanchan Joshi <joshi.k@...sung.com>, viro@...iv.linux.org.uk,
        bcrl@...ck.org, Matthew Wilcox <willy@...radead.org>,
        Christoph Hellwig <hch@...radead.org>,
        Damien Le Moal <Damien.LeMoal@....com>,
        asml.silence@...il.com, linux-fsdevel@...r.kernel.org,
        linux-kernel@...r.kernel.org, linux-aio@...ck.org,
        io-uring@...r.kernel.org, linux-block@...r.kernel.org,
        linux-api@...r.kernel.org, SelvaKumar S <selvakuma.s1@...sung.com>,
        Nitesh Shetty <nj.shetty@...sung.com>,
        Javier Gonzalez <javier.gonz@...sung.com>
Subject: Re: [PATCH v4 6/6] io_uring: add support for zone-append

On Fri, Jul 24, 2020 at 10:00 PM Jens Axboe <axboe@...nel.dk> wrote:
>
> On 7/24/20 9:49 AM, Kanchan Joshi wrote:
> > diff --git a/fs/io_uring.c b/fs/io_uring.c
> > index 7809ab2..6510cf5 100644
> > --- a/fs/io_uring.c
> > +++ b/fs/io_uring.c
> > @@ -1284,8 +1301,15 @@ static void __io_cqring_fill_event(struct io_kiocb *req, long res, long cflags)
> >       cqe = io_get_cqring(ctx);
> >       if (likely(cqe)) {
> >               WRITE_ONCE(cqe->user_data, req->user_data);
> > -             WRITE_ONCE(cqe->res, res);
> > -             WRITE_ONCE(cqe->flags, cflags);
> > +             if (unlikely(req->flags & REQ_F_ZONE_APPEND)) {
> > +                     if (likely(res > 0))
> > +                             WRITE_ONCE(cqe->res64, req->rw.append_offset);
> > +                     else
> > +                             WRITE_ONCE(cqe->res64, res);
> > +             } else {
> > +                     WRITE_ONCE(cqe->res, res);
> > +                     WRITE_ONCE(cqe->flags, cflags);
> > +             }
>
> This would be nice to keep out of the fast path, if possible.

I was thinking of keeping a function-pointer (in io_kiocb) during
submission. That would have avoided this check......but argument count
differs, so it did not add up.

> > diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
> > index 92c2269..2580d93 100644
> > --- a/include/uapi/linux/io_uring.h
> > +++ b/include/uapi/linux/io_uring.h
> > @@ -156,8 +156,13 @@ enum {
> >   */
> >  struct io_uring_cqe {
> >       __u64   user_data;      /* sqe->data submission passed back */
> > -     __s32   res;            /* result code for this event */
> > -     __u32   flags;
> > +     union {
> > +             struct {
> > +                     __s32   res;    /* result code for this event */
> > +                     __u32   flags;
> > +             };
> > +             __s64   res64;  /* appending offset for zone append */
> > +     };
> >  };
>
> Is this a compatible change, both for now but also going forward? You
> could randomly have IORING_CQE_F_BUFFER set, or any other future flags.

Sorry, I didn't quite understand the concern. CQE_F_BUFFER is not
used/set for write currently, so it looked compatible at this point.
Yes, no room for future flags for this operation.
Do you see any other way to enable this support in io-uring?

> Layout would also be different between big and little endian, so not
> even that easy to set aside a flag for this. But even if that was done,
> we'd still have this weird API where liburing or the app would need to
> distinguish this cqe from all others based on... the user_data? Hence
> liburing can't do it, only the app would be able to.
>
> Just seems like a hack to me.

Yes, only user_data to distinguish. Do liburing helpers need to look
at cqe->res (and decide something) before returning the cqe to
application?
I see that happening at once place, but not sure when it would hit
LIBURING_DATA_TIMEOUT condition.
__io_uring_peek_cqe()
{
           do {
                io_uring_for_each_cqe(ring, head, cqe)
                        break;
                if (cqe) {
                        if (cqe->user_data == LIBURING_UDATA_TIMEOUT) {
                                if (cqe->res < 0)
                                        err = cqe->res;
                                io_uring_cq_advance(ring, 1);
                                if (!err)
                                        continue;
                                cqe = NULL;
                        }
                }
                break;
        } while (1);
}



-- 
Joshi

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ