Message-ID: <30b7018d-f60d-0c57-b3d6-92c1cf0762b9@kernel.dk>
Date:   Fri, 29 Oct 2021 06:40:33 -0600
From:   Jens Axboe <axboe@...nel.dk>
To:     Eric Dumazet <edumazet@...gle.com>
Cc:     Eric Dumazet <eric.dumazet@...il.com>,
        Hao Xu <haoxu@...ux.alibaba.com>,
        LKML <linux-kernel@...r.kernel.org>
Subject: Re: [BUG] About "io_uring: add more uring info to fdinfo for debug"

On 10/28/21 6:43 PM, Eric Dumazet wrote:
> On Thu, Oct 28, 2021 at 5:13 PM Jens Axboe <axboe@...nel.dk> wrote:
>>
>> On 10/28/21 3:40 PM, Jens Axboe wrote:
>>> On 10/28/21 3:24 PM, Eric Dumazet wrote:
>>>> Hi
>>>>
>>>> I was looking at commit 83f84356bc8f2d
>>>> ("io_uring: add more uring info to fdinfo for debug") after receiving
>>>> syzbot reports.
>>>>
>>>> I suspect that the following:
>>>>
>>>> +       for (i = cached_sq_head; i < sq_tail; i++) {
>>>> +               unsigned int sq_idx = READ_ONCE(ctx->sq_array[i & sq_mask]);
>>>> +
>>>> +               if (likely(sq_idx <= sq_mask)) {
>>>> +                       struct io_uring_sqe *sqe = &ctx->sq_sqes[sq_idx];
>>>> +
>>>> +                       seq_printf(m, "%5u: opcode:%d, fd:%d, flags:%x, user_data:%llu\n",
>>>> +                                  sq_idx, sqe->opcode, sqe->fd, sqe->flags, sqe->user_data);
>>>> +               }
>>>> +       }
>>>>
>>>>
>>>> Can loop around ~2^32 times if sq_tail is close to ~0U
>>>>
>>>> I see various READ_ONCE(), which are probably not good enough.
>>>>
>>>> At the very minimum I would handle wrapping...
>>>
>>> Thanks for reporting this. I think on top of wrapping, the loop should
>>> just be capped at sq_entries as well. There's no point dumping more than
>>> that, ever.
>>>
>>> I'll take a stab at this.
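
To illustrate the wrap issue with a standalone userspace sketch (the values and
names below are made up for illustration, not the kernel code itself): since
sq_tail lives in memory shared with userspace it can hold any value, so iterating
directly from head to tail is unbounded. Counting pending entries with unsigned
subtraction and clamping to the ring size keeps the work bounded:

#include <stdio.h>

int main(void)
{
	unsigned int head = 16;            /* e.g. a small cached_sq_head */
	unsigned int tail = 0xfffffff0u;   /* sq_tail as read from shared memory */
	unsigned int entries = 128;        /* ring size, power of two */

	/*
	 * "for (i = head; i < tail; i++)" would run almost 2^32 times with
	 * these values.  Unsigned subtraction plus a clamp bounds the loop.
	 */
	unsigned int pending = tail - head;
	unsigned int to_dump = pending < entries ? pending : entries;

	for (unsigned int i = 0; i < to_dump; i++) {
		unsigned int idx = (head + i) & (entries - 1);  /* mask into the ring */
		printf("would dump slot %u\n", idx);
	}
	return 0;
}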
>>
>> I'd probably do something like this - make sure wrap is sane and that we
>> always cap at the max number of entries we expect. This doesn't quite
>> hold true for CQEs, but honestly for debugging purposes, we only really
>> care about the sq ring side in terms of stalls. Or if we have unreaped
>> CQEs, which we'll still show.
>>
>> This also removes the masking, as it's better to expose the ring indexes
>> directly. And just dump the raw ring head/tail for sq/cq. We still
>> include the cached info, but I think dumping the raw contents is saner
>> and more useful.
>>
>>
>> diff --git a/fs/io_uring.c b/fs/io_uring.c
>> index 17cb0e1b88f0..babd9950ae9f 100644
>> --- a/fs/io_uring.c
>> +++ b/fs/io_uring.c
>> @@ -10065,12 +10065,11 @@ static __cold void __io_uring_show_fdinfo(struct io_ring_ctx *ctx,
>>         struct io_overflow_cqe *ocqe;
>>         struct io_rings *r = ctx->rings;
>>         unsigned int sq_mask = ctx->sq_entries - 1, cq_mask = ctx->cq_entries - 1;
>> -       unsigned int cached_sq_head = ctx->cached_sq_head;
>> -       unsigned int cached_cq_tail = ctx->cached_cq_tail;
>>         unsigned int sq_head = READ_ONCE(r->sq.head);
>>         unsigned int sq_tail = READ_ONCE(r->sq.tail);
>>         unsigned int cq_head = READ_ONCE(r->cq.head);
>>         unsigned int cq_tail = READ_ONCE(r->cq.tail);
>> +       unsigned int sq_entries, cq_entries;
>>         bool has_lock;
>>         unsigned int i;
>>
>> @@ -10080,15 +10079,19 @@ static __cold void __io_uring_show_fdinfo(struct io_ring_ctx *ctx,
>>          * and sq_tail and cq_head are changed by userspace. But it's ok since
>>          * we usually use these info when it is stuck.
>>          */
>> -       seq_printf(m, "SqHead:\t%u\n", sq_head & sq_mask);
>> -       seq_printf(m, "SqTail:\t%u\n", sq_tail & sq_mask);
>> -       seq_printf(m, "CachedSqHead:\t%u\n", cached_sq_head & sq_mask);
>> -       seq_printf(m, "CqHead:\t%u\n", cq_head & cq_mask);
>> -       seq_printf(m, "CqTail:\t%u\n", cq_tail & cq_mask);
>> -       seq_printf(m, "CachedCqTail:\t%u\n", cached_cq_tail & cq_mask);
>> -       seq_printf(m, "SQEs:\t%u\n", sq_tail - cached_sq_head);
>> -       for (i = cached_sq_head; i < sq_tail; i++) {
>> -               unsigned int sq_idx = READ_ONCE(ctx->sq_array[i & sq_mask]);
>> +       seq_printf(m, "SqMask:\t\t0x%x\n", sq_mask);
>> +       seq_printf(m, "SqHead:\t%u\n", sq_head);
>> +       seq_printf(m, "SqTail:\t%u\n", sq_tail);
>> +       seq_printf(m, "CachedSqHead:\t%u\n", ctx->cached_sq_head);
>> +       seq_printf(m, "CqMask:\t0x%x\n", cq_mask);
>> +       seq_printf(m, "CqHead:\t%u\n", cq_head);
>> +       seq_printf(m, "CqTail:\t%u\n", cq_tail);
>> +       seq_printf(m, "CachedCqTail:\t%u\n", ctx->cached_cq_tail);
>> +       seq_printf(m, "SQEs:\t%u\n", sq_tail - ctx->cached_sq_head);
>> +       sq_entries = min(sq_tail - sq_head, ctx->sq_entries);
>> +       for (i = 0; i < sq_entries; i++) {
>> +               unsigned int entry = i + sq_head;
>> +               unsigned int sq_idx = READ_ONCE(ctx->sq_array[entry & sq_mask]);
>>
>>                 if (likely(sq_idx <= sq_mask)) {
>>                         struct io_uring_sqe *sqe = &ctx->sq_sqes[sq_idx];
>> @@ -10097,9 +10100,11 @@ static __cold void __io_uring_show_fdinfo(struct io_ring_ctx *ctx,
>>                                    sq_idx, sqe->opcode, sqe->fd, sqe->flags, sqe->user_data);
>>                 }
>>         }
>> -       seq_printf(m, "CQEs:\t%u\n", cached_cq_tail - cq_head);
>> -       for (i = cq_head; i < cached_cq_tail; i++) {
>> -               struct io_uring_cqe *cqe = &r->cqes[i & cq_mask];
>> +       seq_printf(m, "CQEs:\t%u\n", cq_tail - cq_head);
>> +       cq_entries = min(cq_tail - cq_head, ctx->cq_entries);
>> +       for (i = 0; i < cq_entries; i++) {
>> +               unsigned int entry = i + cq_head;
>> +               struct io_uring_cqe *cqe = &r->cqes[entry & cq_mask];
>>
>>                 seq_printf(m, "%5u: user_data:%llu, res:%d, flag:%x\n",
>>                            i & cq_mask, cqe->user_data, cqe->res, cqe->flags);
> 
> Note: you probably want to replace (i & cq_mask) with (entry & cq_mask) here
> 
> Otherwise, patch looks good to me.

Thanks, good catch. I've changed it.
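
For reference, with that one change folded in, the CQE dump loop now reads
roughly like this (context trimmed):

	seq_printf(m, "CQEs:\t%u\n", cq_tail - cq_head);
	cq_entries = min(cq_tail - cq_head, ctx->cq_entries);
	for (i = 0; i < cq_entries; i++) {
		unsigned int entry = i + cq_head;
		struct io_uring_cqe *cqe = &r->cqes[entry & cq_mask];

		/* index by entry & cq_mask, per the note above */
		seq_printf(m, "%5u: user_data:%llu, res:%d, flag:%x\n",
			   entry & cq_mask, cqe->user_data, cqe->res, cqe->flags);
	}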

-- 
Jens Axboe
