linux-kernel - Re: numlist_push() barriers Re: [RFC PATCH v4 1/9] printk-rb: add a new printk ringbuffer implementation

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20190827074057.2qybpmvfueub3ztl@pathway.suse.cz>
Date:   Tue, 27 Aug 2019 09:40:57 +0200
From:   Petr Mladek <pmladek@...e.com>
To:     John Ogness <john.ogness@...utronix.de>
Cc:     Andrea Parri <andrea.parri@...rulasolutions.com>,
        Sergey Senozhatsky <sergey.senozhatsky@...il.com>,
        Sergey Senozhatsky <sergey.senozhatsky.work@...il.com>,
        Steven Rostedt <rostedt@...dmis.org>,
        Brendan Higgins <brendanhiggins@...gle.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        linux-kernel@...r.kernel.org
Subject: Re: numlist_push() barriers Re: [RFC PATCH v4 1/9] printk-rb: add a
 new printk ringbuffer implementation

On Tue 2019-08-27 00:36:18, John Ogness wrote:
> Hi Petr,
> 
> AndreaP responded with some explanation (and great links!) on the topic
> of READ_ONCE. But I feel like your comments about the WRITE_ONCE were
> not addressed. I address that (and your other comments) below...
> 
> On 2019-08-23, Petr Mladek <pmladek@...e.com> wrote:
> >> --- /dev/null
> >> +++ b/kernel/printk/numlist.c
> >> +/**
> >> + * numlist_push() - Add a node to the list and assign it a sequence number.
> >> + *
> >> + * @nl: The numbered list to push to.
> >> + *
> >> + * @n:  A node to push to the numbered list.
> >> + *      The node must not already be part of a list.
> >> + *
> >> + * @id: The ID of the node.
> >> + *
> >> + * A node is added in two steps: The first step is to make this node the
> >> + * head, which causes a following push to add to this node. The second step is
> >> + * to update @next_id of the former head node to point to this one, which
> >> + * makes this node visible to any task that sees the former head node.
> >> + */
> >> +void numlist_push(struct numlist *nl, struct nl_node *n, unsigned long id)
> >> +{
[...]
> >> +
> >> +	/* bB: #1 */
> >> +	head_id = atomic_long_read(&nl->head_id);
> >> +
> >> +	for (;;) {
> >> +		/* bC: */
> >> +		while (!numlist_read(nl, head_id, &seq, NULL)) {
> >> +			/*
> >> +			 * @head_id is invalid. Try again with an
> >> +			 * updated value.
> >> +			 */
> >> +
> >> +			cpu_relax();
> >
> > I have got very confused by this. cpu_relax() suggests that this
> > cycle is busy waiting until a particular node becomes valid.
> > My first though was that it must cause deadlock in NMI when
> > the interrupted code is supposed to make the node valid.
> >
> > But it is the other way. The head is always valid when it is
> > added to the list. It might become invalid when another CPU
> > moves the head and the old one gets reused.
> >
> > Anyway, I do not see any reason for cpu_relax() here.
> 
> You are correct. The cpu_relax() should not be there. But there is still
> an issue that this could spin hard if the head was recycled and this CPU
> does not yet see the new head value.

I do not understand this. The head could get reused only after
head_id was replaced with the following valid node.
The next cycle is done with a new id that should be valid.

Of course, the new ID might get reused as well. But then we just
repeat the cycle. We have to be able to find a valid head after
few cycles. The last valid ID could not get reused because nodes
can be removed only if was not the last valid node.


> To handle that, and in preparation for my next version, I'm now using a
> read_acquire() to load the ID in the node() callback (matching the
> set_release() in assign_desc()). This ensures that if numlist_read()
> fails, the new head will be visible.

I do not understand this either. The above paragraph seems to
describe a race. I do not see how it could cause an infinite loop.

Best Regards,
Petr