Message-ID: <87y1bznii8.fsf@jogness.linutronix.de>
Date: Mon, 05 Feb 2024 14:39:03 +0106
From: John Ogness <john.ogness@...utronix.de>
To: Petr Mladek <pmladek@...e.com>
Cc: Sergey Senozhatsky <senozhatsky@...omium.org>, Steven Rostedt
 <rostedt@...dmis.org>, Thomas Gleixner <tglx@...utronix.de>,
 linux-kernel@...r.kernel.org
Subject: Re: [PATCH printk v3 09/14] printk: Wait for all reserved records
 with pr_flush()

On 2024-01-31, Petr Mladek <pmladek@...e.com> wrote:
>> +	last_finalized_seq = desc_last_finalized_seq(rb);
>> +
>> +	/*
>> +	 * @head_id is loaded after @last_finalized_seq to ensure that it is
>> +	 * at or beyond @last_finalized_seq.
>> +	 *
>> +	 * Memory barrier involvement:
>> +	 *
>> +	 * If desc_last_finalized_seq:A reads from
>> +	 * desc_update_last_finalized:A, then
>> +	 * prb_next_reserve_seq:A reads from desc_reserve:D.
>> +	 *
>> +	 * Relies on:
>> +	 *
>> +	 * RELEASE from desc_reserve:D to desc_update_last_finalized:A
>> +	 *    matching
>> +	 * ACQUIRE from desc_last_finalized_seq:A to prb_next_reserve_seq:A
>> +	 *
>> +	 * Note: desc_reserve:D and desc_update_last_finalized:A can be
>> +	 *       different CPUs. However, the desc_update_last_finalized:A CPU
>> +	 *       (which performs the release) must have previously seen
>> +	 *       desc_read:C, which implies desc_reserve:D can be seen.
>
> The most tricky part is desc_reserve:D. It is a super complicated
> barrier which guarantees many things. But I think that there are
> many more write barriers before the allocated descriptor reaches
> the finalized state, so it should be easier to prove the correctness
> here using other, simpler barriers.

Yes, desc_reserve:D provides memory barriers for several orderings, but
it does _not_ provide a memory barrier for this ordering. It only marks
where @head_id is stored.

The only memory barriers involved here are desc_update_last_finalized:A
and its counterpart desc_last_finalized_seq:A.

CPU0                                 CPU1
====                                 ====
store(head_id)
store_release(last_finalized_seq)    load_acquire(last_finalized_seq)
                                     load(head_id)
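
For illustration only, the same pairing can be sketched as a standalone
C11 program (using atomic_store/load_explicit in place of the kernel's
smp_store_release()/smp_load_acquire(); the writer/reader threads, the
stored values, and the names are made up, and nothing here is printk
code):

/* Minimal sketch of the release/acquire pairing shown above. */
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

static atomic_ulong head_id;            /* stored before the release    */
static atomic_ulong last_finalized_seq; /* the release/acquire variable */

static void *writer(void *unused)
{
	/* CPU0: plain store of @head_id ... */
	atomic_store_explicit(&head_id, 42, memory_order_relaxed);

	/*
	 * ... then release store of @last_finalized_seq. Every store
	 * before this line is visible to a reader whose acquire load
	 * observes the value stored here.
	 */
	atomic_store_explicit(&last_finalized_seq, 1, memory_order_release);
	return NULL;
}

static void *reader(void *unused)
{
	/* CPU1: acquire load of @last_finalized_seq ... */
	if (atomic_load_explicit(&last_finalized_seq, memory_order_acquire) == 1) {
		/*
		 * ... guarantees that this @head_id load sees a value at
		 * least as new as the one stored before the release.
		 */
		assert(atomic_load_explicit(&head_id, memory_order_relaxed) == 42);
	}
	return NULL;
}

int main(void)
{
	pthread_t w, r;

	pthread_create(&w, NULL, writer, NULL);
	pthread_create(&r, NULL, reader, NULL);
	pthread_join(w, NULL);
	pthread_join(r, NULL);
	return 0;
}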

> To make it clear: I am all for keeping the above precise description
> as is. I am just thinking about adding one more human-friendly note
> which might help future generations understand the correctness more
> easily. Something like:
>
> 	* Note: The above description might be hard to follow because
> 	*	desc_reserve:D is a multi-purpose barrier. But this is
> 	*	just the first barrier which makes sure that @head_id
> 	*	is updated before the reserved descriptor gets finalized
> 	*	and updates @last_finalized_seq.
> 	*
> 	*	There are more release barriers in between, especially
> 	*	desc_reserve:F and desc_update_last_finalized:A. These also
> 	*	make sure that the desc_last_finalized_seq:A acquire barrier
> 	*	allows reading a @head_id related to @last_finalized_seq or newer.

Non-global memory barriers must operate on the same memory. In this
case, the acquire/release memory barriers are operating on
@last_finalized_seq. The other ordered memory load in this situation is
for @head_id, so it makes sense to specify the store of @head_id as the
start of the release block.

> In fact, the desc_update_last_finalized:A release and
> desc_last_finalized_seq:A acquire barriers are enough to prove
> that we read a @head_id related to @last_finalized_seq or newer here.

Yes, which is why they are listed here. ;-)

> Or maybe it is exactly what you described and my "human friendly"
> description is too naive. I am still not completely familiar
> with the "Memory barrier involvement:" and "Relies on:"
> format.

Yes, the format takes some getting used to. I thank Andrea Parri for
helping me understand memory barriers and for contributing ideas to
formalize the documentation.

>> +	if (err == -EINVAL) {
>> +		if (last_finalized_seq == 0) {
>> +			/*
>> +			 * @last_finalized_seq still contains its initial
>> +			 * value. Probably no record has been finalized yet.
>> +			 * This means the ringbuffer is not yet full and the
>> +			 * @head_id value can be used directly (subtracting
>> +			 * off the id value corresponding to seq=0).
>> +			 */
>> +
>> +			/*
>> +			 * Because of hack#2 of the bootstrapping phase, the
>> +			 * @head_id initial value must be handled separately.
>> +			 */
>> +			if (head_id == DESC0_ID(desc_ring->count_bits))
>> +				return 0;
>> +
>> +			/*
>> +			 * The @head_id is initialized such that the first
>> +			 * increment will yield the first record (seq=0).
>> +			 * Therefore use the initial value +1 as the base to
>> +			 * subtract from @head_id.
>> +			 */
>> +			last_finalized_id = DESC0_ID(desc_ring->count_bits) + 1;
>
> It took me a long time to understand the above code and comments. I
> wonder if the following looks easier even for you:
>
> 	if (err == -EINVAL) {
> 		if (last_finalized_seq == 0) {
> 			/*
> 			 * No record has been finalized or even reserved yet.
> 			 *
> 			 * The @head_id is initialized such that the first
> 			 * increment will yield the first record (seq=0).
> 			 * Handle it separately to avoid a negative @diff below.
> 			 */
> 			if (head_id == DESC0_ID(desc_ring->count_bits))
> 				return 0;
>
> 			/*
> 			 * One or more descriptors are already reserved. Use
> 			 * the descriptor ID of the first one (@seq=0) for
> 			 * the @diff below.
> 			 */
> 			last_finalized_id = DESC0_ID(desc_ring->count_bits) + 1;

I will use your comments for the next version.

>> +	/*
>> +	 * @diff is the number of records beyond the last record available
>> +	 * to readers.
>> +	 */
>
> This is kind of obvious from the code. Also, it is not true when the
> first record has not been finalized yet. The following might
> be more useful:
>
> 	/* Diff of known descriptor IDs to compute related sequence numbers. */
>
>> +	diff = head_id - last_finalized_id;

I will use your comments for the next version.
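
(As a purely illustrative aside: the arithmetic being discussed can be
sketched in isolation. The hypothetical helper below assumes only that
descriptor IDs and sequence numbers advance in lockstep; it has a
simplified signature and none of the state reads, retries, or error
handling of the real function.)

/*
 * Hypothetical sketch, not the printk code: each reservation advances
 * both the descriptor ID and the sequence number by one, so the
 * distance between two known IDs equals the distance between their
 * sequence numbers.
 */
static unsigned long next_reserve_seq_sketch(unsigned long head_id,
					     unsigned long last_finalized_id,
					     unsigned long last_finalized_seq)
{
	/* Records reserved beyond the last finalized one. */
	unsigned long diff = head_id - last_finalized_id;

	/*
	 * @head_id is the most recently reserved descriptor, so its
	 * sequence number is @last_finalized_seq + @diff. The next
	 * record to be reserved gets the number after that.
	 */
	return last_finalized_seq + diff + 1;
}

(In the bootstrap case above, @last_finalized_seq is 0 and
@last_finalized_id is DESC0_ID(desc_ring->count_bits) + 1, i.e. the ID
belonging to seq=0, so the same arithmetic applies once the all-initial
case of @head_id == DESC0_ID has been filtered out.)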

> BTW: It occurred to me that the logic might be more
>      straightforward if we stored desc_ring->next_finalized_seq instead
>      of @last_finalized_seq.

I thought about this as well, but I felt the meaning was a bit
confusing. Also, @head_id points to the latest reserved descriptor, so
the code is a bit easier to follow when the diff is based on the
latest finalized descriptor.

>      Even the initial value would be valid. Also, the value is
>      used only in prb_next_reserve_seq() and prb_next_seq(), where
>      we need to start with this value anyway.

Agreed. I just didn't like how the code looked, even though it had
fewer +1 operations. It was harder to follow.

>      Note that prb_next_seq() actually does not need to try _prb_read_valid()
>      for @last_finalized_seq. It should always be valid unless
>      the record has already been overwritten, in which case
>      there should be a newer readable record.

Correct. The current code increments before reading.

> Reviewed-by: Petr Mladek <pmladek@...e.com>
>
> Well, it would be nice to update the comments if you liked the
> proposals.

Thanks. I will use your comments.

John
