linux-kernel - Re: [RFC PATCH printk v1] printk: ringbuffer: Do not skip non-finalized with prb_next

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Message-ID: <ZUUYiYpCecPoI8mv@alley>
Date:   Fri, 3 Nov 2023 16:58:01 +0100
From:   Petr Mladek <pmladek@...e.com>
To:     John Ogness <john.ogness@...utronix.de>
Cc:     Sergey Senozhatsky <senozhatsky@...omium.org>,
        Steven Rostedt <rostedt@...dmis.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        linux-kernel@...r.kernel.org, Mukesh Ojha <quic_mojha@...cinc.com>,
        Chunlei Wang <chunlei.wang@...iatek.com>
Subject: Re: [RFC PATCH printk v1] printk: ringbuffer: Do not skip
 non-finalized with prb_next_seq()

On Fri 2023-11-03 14:37:23, John Ogness wrote:
> On 2023-11-03, Petr Mladek <pmladek@...e.com> wrote:
> >> Generally we have not concerned ourselves with readers. But I agree we
> >> should make the optimization coherent with what a reader can actually
> >> read. It might save some CPU cycles for polling tasks.
> >
> > I wanted to agree. But then I found this scenario:
> >
> > CPU0				CPU1
> >
> > console_unlock()
> >   console_flush_all()
> >
> > 				printk()
> > 				  vprintk_store()
> >     return;
> > 				    prb_final_commit;
> >
> > 				  console_trylock();  # failed

      __console_unlock();

    while (prb_read_valid() || console_trylock();

I added the __console_unlock() and console_trylock()
to have the full picture.

> >
> > Now, the race:
> >
> >   + console_flush_all() did not flush the message from CPU1 because
> >     it was not finalized in time.
> >
> >   + CPU1 failed to get console_lock() => CPU0 is responsible for
> >     flushing
> >
> >   + prb_read_valid() failed on CPU0 because it did not see
> >     the prb_desc finalized (missing barrier).
> 
> For semaphores, up() and down_trylock() successfully take and release a
> raw spin lock. That provides the necessary barriers so that CPU0 sees
> the record that CPU1 finalized.

I see. Hmm, we should document this. The spinlock is an implementaion
detail.

IMHO, up()/down()/down_trylock() are quite special. I think
that the theory is that lock() and unlock() are one-way barriers.
And trylock() is one-way barrier only when it succeds.

By other words, _trylock() on CPU1 normally would not guarantee
that CPU0 would see the finalized record after unlock(). [*]


Maybe, we could rely on the existing behavior but we should at least
document it.

Note that I thouht many times about using spin_trylock() in
down_trylock(). It would make more sense because trylock()
should not be be blocking operations. But I see now,
that it might break users depending on the full barrier.


Note: It is possible that I get it wrong. It is so easy to get lost
      in barriers.


> >> Writing and reading of @last_finalized_seq will provide the necessary
> >> boundaries to guarantee this:
> >> 
> >> ...finalize record...
> >> atomic_long_try_cmpxchg_release(&desc_ring->last_finalized_seq, ...);
> >> 
> >> and
> >> 
> >> atomic_long_read_acquire(&desc_ring->last_finalized_seq);
> >> ...read record...
> >
> > Yup. something like this.
> >
> > Well, it is suspicious that there is no _release() counter part.
> 
> Take a closer look above. The cmpxchg (on the writer side) does the
> release. I have the litmus tests to verify that is correct and
> sufficient for what we want: to guarantee that for any read
> @last_finalized_seq value, the CPU can also read the associated record.

The barrier is only in _prb_commit() which sets the committed state.

desc_make_final() uses cmpxchg_relaxed() so that it is not guaranteed
that other CPUs would see the "finalized" state.

> I am finalizing a new version of the "fix console flushing on panic"
> series [0] that will also include the prb_next_seq() fix. If needed, we
> can continue this discussion based on the new code.

OK, maybe it would look different in the new version. And it might
be better to continue the discussion on the new code.

Best Regards,
Petr