lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ZS5vrte2OZXcIc9L@alley>
Date:   Tue, 17 Oct 2023 13:27:42 +0200
From:   Petr Mladek <pmladek@...e.com>
To:     John Ogness <john.ogness@...utronix.de>
Cc:     Sergey Senozhatsky <senozhatsky@...omium.org>,
        Steven Rostedt <rostedt@...dmis.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH printk v2 3/4] printk: Skip unfinalized records in panic

On Fri 2023-10-13 22:49:39, John Ogness wrote:
> Normally a reader will stop once reaching an unfinalized record.
> However, when a panic happens, writers from other CPUs (or an
> interrupted context on the panic CPU) may have been writing a
> record and were unable to finalize it. The panic CPU will
> reserve/commit/finalize its panic records, but these will be
> located after the unfinalized records. This results in panic()
> not flushing the panic messages.
> 
> Add a special case to printk_get_next_message() to allow
> skipping over unfinalized records if on the panic CPU.
> 
> Also refine the documentation of the ringbuffer reading
> functions to clarify that failure may also be due to an
> unfinalized record.

Great catch!

It took me some time to think about various possible scenarios.
As a result, I suggest to update some comments, see below.

> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -2813,8 +2813,19 @@ static bool printk_get_next_message(struct printk_message *pmsg, u64 seq,
>  	else
>  		prb_rec_init_rd(&r, &info, outbuf, outbuf_sz);
>  
> -	if (!prb_read_valid(prb, seq, &r))
> -		return false;
> +	while (!prb_read_valid(prb, seq, &r)) {
> +		if (this_cpu_in_panic() && seq < prb_next_seq(prb)) {
> +			/*
> +			 * The record @seq is not finalized and there may be

"may be" is a bit misleading. If I count it correctly then there
"are" more records when seq < prb_next_seq().

> +			 * more records in the ringbuffer. Since this is the
> +			 * panic CPU, skip over the unfinalized record and
> +			 * try to read a finalized record that may follow.
> +			 */

It took me quite some time to understand whether this is right or not.
It is partly because the comment describes what the code does but it
does not describe the motivation and why it is safe. I wonder
if the following is better:

			/*
			 * Skip non-finalized records when emitting messages
			 * from panic CPU. They might never get finalized.
			 * Note that new messages printed on panic CPU
			 * are finalized when we are here.
			 */

But wait. Are the messages printed in panic context always finalized?
What about messages without the trailing newline printed?

Aha, they actually are finalized because prb_next_seq() would return sequence
number of the record in "desc_committed" state when there is
a message without newline and we skip only seq < prb_next_seq().
So I would update the comment, something like:

			/*
			 * Skip non-finalized records when emitting messages
			 * from panic CPU. They might never get	finalized.
			 *
			 * Note that new messages printed on panic CPU are
			 * finalized when we are here. The only exception
			 * might be the last message without trailing newline.
			 * But it would have the sequence number returned
			 * by prb_next_seq().
			 */

Sigh, it is long. But this is a quite tricky situation.

> +			seq++;
> +		} else {
> +			return false;
> +		}
> +	}
>  
>  	pmsg->seq = r.info->seq;
>  	pmsg->dropped = r.info->seq - seq;
> diff --git a/kernel/printk/printk_ringbuffer.c b/kernel/printk/printk_ringbuffer.c
> index 2dc4d5a1f1ff..1bbc008109ef 100644
> --- a/kernel/printk/printk_ringbuffer.c
> +++ b/kernel/printk/printk_ringbuffer.c
> @@ -1876,8 +1876,9 @@ static u64 prb_first_seq(struct printk_ringbuffer *rb)
>  }
>  
>  /*
> - * Non-blocking read of a record. Updates @seq to the last finalized record
> - * (which may have no data available).
> + * Non-blocking read of a record. Updates @seq to the record that was read
> + * (which may have no data available) or was attempted to be read (in case
> + * it was unfinalized or non-existent).

I am confused. Well, even the original message was confusing.
I think about describing it the following way.

  * On input, @seq defines the record which should be read. It might
  * be updated to a higher value when the requested record has already
  * been overwritten or when the record had empty data.
  *
  * On return, @seq value depends on the situation. It is:
  *
  *	- sequence number of the read record on success.
  *     - sequence number of the first found to-be-finalized record
  *	  when the input seq number was lower or equal to prb_next_seq().
  *	- the original value when @seq was invalid, bigger then prb_next_seq().

Sigh, the comment is hairy. Maybe you would find a more elegant way
to describe the variants.


BTW: The support for data-less records were added by the commit
     ("printk: ringbuffer: support dataless records"). It was
     needed to handle empty lines: printk("\n"). It is strange
     that we skip them instead of printing the empty line.

     It looks like a small bug. I might look a it if you agree
     that it looks like a bug.

>   *
>   * See the description of prb_read_valid() and prb_read_valid_info()
>   * for details.


Also I would update the above prb_next_seq():

--- a/kernel/printk/printk_ringbuffer.c
+++ b/kernel/printk/printk_ringbuffer.c
@@ -2012,8 +2012,8 @@ u64 prb_first_valid_seq(struct printk_ringbuffer *rb)
  * available records should be skipped.
  *
  * Context: Any context.
- * Return: The sequence number of the next newest (not yet available) record
- *         for readers.
+ * Return: The sequence number of the next newest (not yet finalized or
+ *	   non-existing) record for readers.
  */
 u64 prb_next_seq(struct printk_ringbuffer *rb)
 {

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ