Date:   Tue, 01 Mar 2022 15:06:48 -0500
From:   Olivier Langlois <olivier@...llion01.com>
To:     Hao Xu <haoxu@...ux.alibaba.com>, Jens Axboe <axboe@...nel.dk>,
        Pavel Begunkov <asml.silence@...il.com>
Cc:     io-uring <io-uring@...r.kernel.org>,
        linux-kernel <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v4 2/2] io_uring: Add support for napi_busy_poll

On Wed, 2022-03-02 at 02:31 +0800, Hao Xu wrote:
> 
> > +       ne = kmalloc(sizeof(*ne), GFP_NOWAIT);
> > +       if (!ne)
> > +               goto out;
> 
> IMHO, we need to handle -ENOMEM here. I cut off the error handling
> when I did the quick coding. Sorry for misleading.

If you are correct, I would be shocked.

I went back to my 'Linux Device Drivers' book, and nowhere is it
mentioned that kmalloc() can return anything other than a pointer.

There is no mention at all of the return value

in man page:
https://www.kernel.org/doc/htmldocs/kernel-api/API-kmalloc.html
API doc:

https://www.kernel.org/doc/html/latest/core-api/mm-api.html?highlight=kmalloc#c.kmalloc

header file:
https://elixir.bootlin.com/linux/latest/source/include/linux/slab.h#L522

I browsed through the kmalloc code. There are a lot of paths to cover,
but from a preliminary reading, it seems that kmalloc() only
returns a valid pointer or NULL...

/**
 * kmem_cache_alloc - Allocate an object
 * @cachep: The cache to allocate from.
 * @flags: See kmalloc().
 *
 * Allocate an object from this cache.  The flags are only relevant
 * if the cache has no available objects.
 *
 * Return: pointer to the new object or %NULL in case of error
 */
 
/**
 * __do_kmalloc - allocate memory
 * @size: how many bytes of memory are required.
 * @flags: the type of memory to allocate (see kmalloc).
 * @caller: function caller for debug tracking of the caller
 *
 * Return: pointer to the allocated memory or %NULL in case of error
 */

I'll need someone else to confirm the possible kmalloc() return
values, perhaps with an example.

I am a bit skeptical that something special needs to be done here...

Or perhaps you are suggesting that io_add_napi() returns an error code
when allocation fails.

as done here:
https://elixir.bootlin.com/linux/latest/source/arch/alpha/kernel/core_marvel.c#L867

If that is what you suggest, what would this info do for the caller?

IMHO, it wouldn't help in any way...
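
To make the two options concrete, here is a minimal userspace sketch.
It is only an illustration: malloc() stands in for kmalloc(), and
alloc_entry(), add_napi_best_effort(), add_napi_checked() and the
struct layout are hypothetical names, not taken from the patch. The
point is that the allocator yields either a valid pointer or NULL, and
any -ENOMEM has to be produced by the caller:

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for the patch's list entry; fields are assumptions. */
struct napi_entry {
	unsigned int napi_id;
	struct napi_entry *next;
};

/*
 * Userspace stand-in for kmalloc(): returns a valid pointer or NULL,
 * never an errno-encoded pointer. 'fail' simulates memory exhaustion.
 */
static void *alloc_entry(size_t size, int fail)
{
	return fail ? NULL : malloc(size);
}

/* Option A: best effort. On allocation failure, silently skip the entry. */
static void add_napi_best_effort(struct napi_entry **head,
				 unsigned int id, int fail)
{
	struct napi_entry *ne = alloc_entry(sizeof(*ne), fail);

	assert(head != NULL);
	if (!ne)
		return;
	ne->napi_id = id;
	ne->next = *head;
	*head = ne;
}

/* Option B: translate the NULL into -ENOMEM for the caller. */
static int add_napi_checked(struct napi_entry **head,
			    unsigned int id, int fail)
{
	struct napi_entry *ne = alloc_entry(sizeof(*ne), fail);

	assert(head != NULL);
	if (!ne)
		return -ENOMEM;
	ne->napi_id = id;
	ne->next = *head;
	*head = ne;
	return 0;
}
```

With option B, the caller still has to decide what to do with
-ENOMEM, which is the question raised above.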
> 
> > 
> > @@ -7519,7 +7633,11 @@ static int __io_sq_thread(struct io_ring_ctx
> > *ctx, bool cap_entries)
> >                     !(ctx->flags & IORING_SETUP_R_DISABLED))
> >                         ret = io_submit_sqes(ctx, to_submit);
> >                 mutex_unlock(&ctx->uring_lock);
> > -
> > +#ifdef CONFIG_NET_RX_BUSY_POLL
> > +               if (!list_empty(&ctx->napi_list) &&
> > +                   io_napi_busy_loop(&ctx->napi_list))
> 
> I'm afraid we may need lock for sqpoll too, since io_add_napi() could
> be in iowq context.
> 
> I'll take a look at the lock stuff of this patch tomorrow, too late
> now in my timezone.

Ok, please do. I'm not a big user of io workers. I may have overlooked
this possibility.

If that is the case, I think that this would be very easy to fix by
holding the spinlock while __io_sq_thread() is using napi_list.
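
As a rough userspace illustration of that idea (not kernel code): the
sketch below uses a C11 atomic_flag as a toy spinlock, and every name
in it (struct ring_ctx, ctx_add_napi(), ctx_poll_napi(), ...) is
hypothetical. It only shows the locking shape, where the side that
adds entries and the side that walks the list take the same lock.
Whether holding a spinlock across the actual busy poll is acceptable
in the kernel is a separate question that this sketch does not answer.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Toy spinlock built on C11 atomic_flag; a stand-in for spinlock_t. */
struct toy_spinlock {
	atomic_flag locked;
};

static void toy_spin_lock(struct toy_spinlock *l)
{
	/* Spin until we are the one who flips the flag from clear to set. */
	while (atomic_flag_test_and_set_explicit(&l->locked,
						 memory_order_acquire))
		;
}

static void toy_spin_unlock(struct toy_spinlock *l)
{
	atomic_flag_clear_explicit(&l->locked, memory_order_release);
}

/* Hypothetical list entry and context; layouts are assumptions. */
struct napi_entry {
	unsigned int napi_id;
	struct napi_entry *next;
};

struct ring_ctx {
	struct toy_spinlock napi_lock;
	struct napi_entry *napi_list;
};

static void ctx_init(struct ring_ctx *ctx)
{
	assert(ctx != NULL);
	atomic_flag_clear(&ctx->napi_lock.locked);
	ctx->napi_list = NULL;
}

/* Writer side (could run in another context): insert under the lock. */
static void ctx_add_napi(struct ring_ctx *ctx, struct napi_entry *ne)
{
	toy_spin_lock(&ctx->napi_lock);
	ne->next = ctx->napi_list;
	ctx->napi_list = ne;
	toy_spin_unlock(&ctx->napi_lock);
}

/*
 * Reader side (the sqpoll thread in the discussion): walk the list
 * under the same lock. Returns the number of entries visited; the
 * counter marks where the per-entry poll would happen.
 */
static int ctx_poll_napi(struct ring_ctx *ctx)
{
	int polled = 0;

	toy_spin_lock(&ctx->napi_lock);
	for (struct napi_entry *ne = ctx->napi_list; ne; ne = ne->next)
		polled++;
	toy_spin_unlock(&ctx->napi_lock);
	return polled;
}
```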
> 
> How about:
> 
> if (list is singular) {
>      do something;
>      return;
> }
> 
> while (!io_busy_loop_end() && io_napi_busy_loop())
>      ;
> 

Is there a concern with the current code?
What would be the benefit of your suggestion over the current code?

To me, it seems that if io_blocking_napi_busy_loop() is called, a
reasonable expectation would be that some busy looping is done.
Otherwise, the function could return without doing anything, which
would, IMHO, be misleading.

By definition, napi_busy_loop() is not blocking, and if you want the
device to stay in busy poll mode, you need to call it once in a while.
Otherwise, after a certain time, the device will revert to its
interrupt mode.

IOW, io_blocking_napi_busy_loop() follows the same logic used by
napi_busy_loop(), which does not call loop_end() before having
performed one loop iteration.

> Btw, start_time seems not used in singular branch.

I know. This is why it is conditionally initialized.

Greetings,
