Date: Mon, 05 Feb 2024 12:39:30 +0106
From: John Ogness <john.ogness@...utronix.de>
To: Petr Mladek <pmladek@...e.com>
Cc: Sergey Senozhatsky <senozhatsky@...omium.org>, Steven Rostedt
 <rostedt@...dmis.org>, Thomas Gleixner <tglx@...utronix.de>,
 linux-kernel@...r.kernel.org, Mukesh Ojha <quic_mojha@...cinc.com>
Subject: Re: [PATCH printk v3 04/14] printk: ringbuffer: Do not skip
 non-finalized records with prb_next_seq()

On 2024-01-15, Petr Mladek <pmladek@...e.com> wrote:
>> The acquire is with @last_finalized_seq. So the release must also be
>> with @last_finalized_seq. The important thing is that the CPU that
>> updates @last_finalized_seq has actually read the corresponding
>> record beforehand. That is exactly what desc_update_last_finalized()
>> does.
>
> I probably did not describe it well. The CPU updating
> @last_finalized_seq does the right thing. I was not sure about the CPU
> which reads @last_finalized_seq via prb_next_seq().
>
> To make it more clear:
>
> u64 prb_next_seq(struct printk_ringbuffer *rb)
> {
> 	u64 seq;
>
> 	seq = desc_last_finalized_seq(rb);
> 	      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> 	      |
> 	      `-> This includes atomic_long_read_acquire(last_finalized_seq)
>
>
> 	if (seq != 0)
> 		seq++;
>
> 	while (_prb_read_valid(rb, &seq, NULL, NULL))
> 		seq++;
>
> 	return seq;
> }
>
> But where is the atomic_long_read_release(last_finalized_seq) in
> this code path?

read_release? The counterpart of this load_acquire is a
store_release. For example:

CPU0                     CPU1
====                     ====
load(varA)
store_release(varB)      load_acquire(varB)
                         load(varA)

If CPU1 reads the value in varB that CPU0 stored, then it is guaranteed
that CPU1 will read the value (or a later value) in varA that CPU0 read.
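
If it helps, the same pattern can be modeled in userspace with C11
atomics. This is only an illustrative sketch (varA/varB and the thread
bodies are made up for the example, it is not printk code):

#include <stdatomic.h>
#include <pthread.h>
#include <stdio.h>

static atomic_int varA;
static atomic_int varB;

/* CPU0: plain load of varA, then store_release of varB. */
static void *cpu0(void *arg)
{
        (void)arg;

        int a = atomic_load_explicit(&varA, memory_order_relaxed);

        atomic_store_explicit(&varB, 1, memory_order_release);
        printf("CPU0 read varA=%d\n", a);
        return NULL;
}

/* CPU1: load_acquire of varB, then plain load of varA. */
static void *cpu1(void *arg)
{
        (void)arg;

        if (atomic_load_explicit(&varB, memory_order_acquire) == 1) {
                /* Guaranteed: the value CPU0 read, or a later one. */
                int a = atomic_load_explicit(&varA, memory_order_relaxed);

                printf("CPU1 read varA=%d\n", a);
        }
        return NULL;
}

int main(void)
{
        pthread_t t0, t1;

        atomic_store(&varA, 42);
        pthread_create(&t0, NULL, cpu0, NULL);
        pthread_create(&t1, NULL, cpu1, NULL);
        pthread_join(t0, NULL);
        pthread_join(t1, NULL);
        return 0;
}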

Translating the above example to this particular patch, we have:

CPU0: desc_update_last_finalized()       CPU1: prb_next_seq()
====                                     ====
_prb_read_valid(seq)
cmpxchg_release(last_finalized_seq,seq)  seq=read_acquire(last_finalized_seq)
                                         _prb_read_valid(seq)
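
For the normal case, where the same CPU both finalizes the record and
updates @last_finalized_seq, the pairing can be modeled in userspace
roughly like this (record[], finalize() and next_seq() are invented
names for illustration, this is not the ringbuffer code):

#include <stdatomic.h>
#include <stdio.h>

#define NO_SEQ (-1L)

static long record[16];                 /* stand-in for record data */
static atomic_long last_finalized_seq = NO_SEQ;

/* Writer: store the finalized record, then publish its sequence
 * number with a release cmpxchg (only ever moving it forward).
 */
static void finalize(long seq)
{
        long old = atomic_load_explicit(&last_finalized_seq,
                                        memory_order_relaxed);

        record[seq] = 100 + seq;        /* "finalize" the record data */

        while (old < seq &&
               !atomic_compare_exchange_weak_explicit(
                        &last_finalized_seq, &old, seq,
                        memory_order_release, memory_order_relaxed))
                ;
}

/* Reader: acquire-load the published sequence number, then read the
 * record. The acquire pairs with the release above, so the record
 * data for that sequence number is guaranteed to be visible.
 */
static void next_seq(void)
{
        long seq = atomic_load_explicit(&last_finalized_seq,
                                        memory_order_acquire);

        if (seq != NO_SEQ)
                printf("seq=%ld data=%ld\n", seq, record[seq]);
}

int main(void)
{
        /* Single-threaded demo of the shape only; the ordering matters
         * when finalize() and next_seq() run on different CPUs.
         */
        finalize(0);
        next_seq();
        return 0;
}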

> IMHO, the barrier provided by the acquire() is _important_ to make
> sure that _prb_read_valid() would see the valid descriptor.

Correct.

> Now, I think that the related read_release(seq) is hidden in:
>
> static int prb_read(struct printk_ringbuffer *rb, u64 seq,
> 		    struct printk_record *r, unsigned int *line_count)
> {
> 	/* Get a local copy of the correct descriptor (if available). */
> 	err = desc_read_finalized_seq(desc_ring, id, seq, &desc);
>
> 	/* If requested, copy meta data. */
> 	if (r->info)
> 		memcpy(r->info, info, sizeof(*(r->info)));
>
> 	/* Copy text data. If it fails, this is a data-less record. */
> 	if (!copy_data(&rb->text_data_ring, &desc.text_blk_lpos, info->text_len,
> 		       r->text_buf, r->text_buf_size, line_count)) {
> 		return -ENOENT;
> 	}
>
> 	/* Ensure the record is still finalized and has the same @seq. */
> 	return desc_read_finalized_seq(desc_ring, id, seq, &desc);
> 	       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> 	       |
> 	       `-> This includes a memory barrier /* LMM(desc_read:A) */
> 		   which makes sure that the data are read before
> 		   the desc/data could be reused.
> }
>
> I consider this /* LMM(desc_read:A) */ as a counterpart for that
> acquire() in prb_next_seq().

desc_read:A is not a memory barrier. It only marks the load of the
descriptor state. This is a significant load because prb_next_seq() must
see at least the descriptor state that desc_update_last_finalized() saw.

The memory barrier comments in desc_update_last_finalized() state:

    * If desc_last_finalized_seq:A reads from
    * desc_update_last_finalized:A, then desc_read:A reads from
    * _prb_commit:B.

This refers to a slightly different situation than the example I used
above because it references where the descriptor state was stored
(_prb_commit:B). But the same general picture applies:

CPU0                              CPU1
====                              ====
_prb_commit:B
desc_update_last_finalized:A      desc_last_finalized_seq:A
                                  desc_read:A

desc_read:A is loading the descriptor state that _prb_commit:B stored.

The extra note in the comment clarifies that _prb_commit:B could also be
denoted as desc_read:A, because desc_update_last_finalized() performs a
read of (i.e. must have seen) _prb_commit:B.

    * Note: _prb_commit:B and desc_update_last_finalized:A can be
    *       different CPUs. However, the desc_update_last_finalized:A
    *       CPU (which performs the release) must have previously seen
    *       _prb_commit:B.

Normally the CPU committing the record will also update
last_finalized_seq. But it is possible that another CPU updates
last_finalized_seq before the committing CPU does, because it already
sees the finalized record. In that case the complete (maximally complex)
picture looks like this:

CPU0            CPU1                           CPU2
====            ====                           ====
_prb_commit:B   desc_read:A
                desc_update_last_finalized:A   desc_last_finalized_seq:A
                                               desc_read:A

Any memory barriers in _prb_commit() or desc_read() are irrelevant for
guaranteeing that a CPU reading a sequence value from
desc_last_finalized_seq() will always be able to read that record.

> Summary:
>
> I saw atomic_long_read_acquire(last_finalized_seq) called from
> prb_next_seq() code path. The barrier looked important to me.
> But I saw neither the counter-part nor any comment. I wanted
> to understand it because it might be important for reviewing
> following patches which depend on prb_next_seq().

desc_update_last_finalized:A is the counterpart to
desc_last_finalized_seq:A. IMHO there are plenty of comments formally
documenting these memory barriers, including the new entry in the
summary of all memory barriers:

 *   desc_update_last_finalized:A / desc_last_finalized_seq:A
 *     store finalized record, then set new highest finalized sequence number

John
