linux-kernel - Re: numlist API Re: [RFC PATCH v4 1/9] printk-rb: add a new printk ringbuffer implementation

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <87sgpnmqdo.fsf@linutronix.de>
Date:   Tue, 27 Aug 2019 01:57:39 +0200
From:   John Ogness <john.ogness@...utronix.de>
To:     Petr Mladek <pmladek@...e.com>
Cc:     linux-kernel@...r.kernel.org,
        Andrea Parri <andrea.parri@...rulasolutions.com>,
        Sergey Senozhatsky <sergey.senozhatsky@...il.com>,
        Sergey Senozhatsky <sergey.senozhatsky.work@...il.com>,
        Steven Rostedt <rostedt@...dmis.org>,
        Brendan Higgins <brendanhiggins@...gle.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>
Subject: Re: numlist API Re: [RFC PATCH v4 1/9] printk-rb: add a new printk ringbuffer implementation

On 2019-08-23, Petr Mladek <pmladek@...e.com> wrote:
>> --- /dev/null
>> +++ b/kernel/printk/numlist.c
>> @@ -0,0 +1,375 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +
>> +#include <linux/sched.h>
>> +#include "numlist.h"
>
> struct numlist is really special variant of a list. Let me to
> do a short summary:
>
>    + FIFO queue interface
>
>    + nodes sequentially numbered
>
>    + nodes referenced by ID instead pointers to avoid ABA problems
>      + requires custom node() callback to get pointer for given ID
>
>    + lockless access:
>      + pushed nodes must not longer get modified by push() caller
>      + pop() caller gets exclusive write access, except that they
>        must modify ID first and do smp_wmb() later

Only if the "numlist user" decides to recycle descriptors (which the
printk_ringbuffer does) is ID modification of descriptors necessary. How
that is synchronized with readers is up to the user (for example,
whether a RELEASE or an smp_wmb() is used).

>    + pop() does not work:
>      + tail node is "busy"
> 	+ needs a custom callback that defines when a node is busy

Note that busy() could always return false if the user has no concept of
nodes that should not be popped.

>      + tail is the last node
> 	+ needed for lockless sequential numbering
>
> I will start with one inevitable question ;-) Is it realistic to find
> another user for this API, please?

If someone needs a FIFO queue that supports:

1. multiple concurrent writers and multiple concurrent non-consuming
   readers

2. where readers are allowed to miss nodes but are able to detect how
   many were missed

3. from any context (including NMI)

then I know of no other data structure available. (Otherwise I would
have used it!)

> I am not sure that all the indirections, caused by the generic API,
> are worth the gain.

IMHO the API is sane. The only bizarre rule is that the numlist must
always have at least 1 node. But since the readers are non-consuming,
there is no real tragedy here.

My goal is not to create some fabulous abstract data structure that
everyone should use. But I did try to minimize numlist (and dataring) to
only be concerned with clearly defined and minimal responsibilities
without imposing unnecessary restrictions on the user.

> Well, the separate API makes sense anyway. I have some ideas that
> might make it cleaner.

[snipped the nice refactoring of the ID into the nl_node]

Your idea (along with previous discussions) convinced me of the
importance of moving the ID-related barriers into the same
file. However, rather than pushing the ID parts into the numlist, I will
be moving them all into the "numlist user"
(i.e. printk_ringbuffer). Your use of the ACQUIRE to load the ID made me
realize that I need to be doing that as well! (but in the node()
callback)

The reasons why I do not want the ID in nl_node is:

- The numlist would need to implement the ID-to-node mapping. For the
  printk_ringbuffer that mapping is simply masking to an index within an
  array. But why should a numlist user be forced to do it that way? I
  see no advantage to restricting numlists to being arrays of nodes.

- The dataring structure also uses IDs and requires an ID-to-node
  mapping. I do not want to bind the dataring and numlist data
  structures together at this level because they really have nothing to
  do with each other. Having the dataring and numlist ID-to-node
  mappings (and their barriers) in the same place (in the
  numlist/dataring _user_) simplifies the big picture.

- ID-related barriers are only needed if node recycling is involved. The
  numlist user decides if recycling is used and if yes, then the numlist
  user is responsible for correctly implementing that.

- By moving all the ID-related barriers to the callbacks, the numlist
  code remains clean and (with the exception of the one smp_rmb()) does
  not expect anything from the numlist user.

I believe your main concern was having easily visible symmetric
barriers. We can achieve that if the read-barriers are in the callbacks
(for both numlist and dataring). I think it makes more sense to put them
there. dataring and numlist should not care about the ID-to-node
mapping.

John Ogness