lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <Zqush2SkFQpYxJ7q@pathway.suse.cz>
Date: Thu, 1 Aug 2024 17:40:55 +0200
From: Petr Mladek <pmladek@...e.com>
To: John Ogness <john.ogness@...utronix.de>
Cc: Sergey Senozhatsky <senozhatsky@...omium.org>,
	Steven Rostedt <rostedt@...dmis.org>,
	Thomas Gleixner <tglx@...utronix.de>, linux-kernel@...r.kernel.org
Subject: Re: preffer_ofload param: was: Re: [PATCH printk v3 11/19] printk:
 nbcon: Rely on kthreads for normal operation

On Thu 2024-08-01 16:28:28, John Ogness wrote:
> On 2024-07-31, Petr Mladek <pmladek@...e.com> wrote:
> >> @@ -1511,10 +1511,10 @@ static void nbcon_atomic_flush_pending_con(struct console *con, u64 stop_seq,
> >>  
> >>  	/*
> >>  	 * If flushing was successful but more records are available, this
> >> -	 * context must flush those remaining records because there is no
> >> -	 * other context that will do it.
> >> +	 * context must flush those remaining records if the printer thread
> >> +	 * is not available do it.
> >>  	 */
> >> -	printk_get_console_flush_type(&ft, false);
> >> +	printk_get_console_flush_type(&ft, true);
> >
> > Hmm, it is a bit weird that we change the value even though it does
> > not affect the behavior. The parameter @prefer_offload affects only
> > the legacy loop.
> 
> For nbcon consoles, prefer_offload is really only annotating the
> intentions. In this case, nbcon_atomic_flush_pending_con() is interested
> in knowing if kthread printer is available, thus using
> prefer_offload=true.
> 
> If the query yields ft.nbcon_atomic set, it means that the kthread
> printer is _not_ available (nbcon_atomic and nbcon_offload are
> exclusive) and it can (and must) handle its flushing responsibilities
> directly (immediately, using the atomic callbacks).
> 
> You might ask, why does it not check ft.nbcon_offload? Although that
> would tell it that the kthread is not available, it does not imply that
> atomic printing is allowed. In order to see if atomic printing is
> allowed, it would need to check ft.nbcon_atomic. And since the two are
> exclusive, really it is enough just to check ft.nbcon_atomic. If
> ft.nbcon_atomic is not set, either the kthread is available or there is
> nothing the nbcon console can do about it anyway (for example, it must
> rely on the legacy loop to handle it).

Where exactly do you need prefer offload of the legacy consoles?

Do need to "prefer offload" or "force offload" in these situations?

Note: We are talking about "legacy consoles" and "legacy approach"
which is:

   Legacy approach: "Prefer direct flush when possible"

Also note that "force_offload" is usually detected automatically via
the context: is_printk_deferred() check.

IMHO, we need to explicitely "force_offload" only in printk_deferred()
where it is passed to vprintk_emit() via LOGLEVEL_SCHED.

IMHO, we could get rid of this hack and simply do something like:

  int vprintk_deferred(const char *fmt, va_list args)
{
	preemption_disable();
	printk_deferred_enter();

	return vprintk_emit(0, LOGLEVEL_DEFAULT, NULL, fmt, args);

	printk_deferred_exit();
	preemption_enable();
}

> I suppose it is wrong that nbcon_atomic_flush_pending_con(false) will
> set ft.nbcon_offload if the kthreads are available. I would fix that.

This will not work in vprintk_emit(). We need to use there

   nbcon_atomic_flush_pending_con(false)

because legacy consoles should be handled directly when possible
(legacy approach).

But nbcon consoles should be offloaded to the kthread when possible
(new approach).

The new approach is acceptable "only" for nbcon consoles because
they are synchronized by the nbcon context. Namely, they allow
to take over the ownership and flush the messages directly
in emergency or panic context.

In terms of approach:

  nbcon approach: "Prefer offload when possible"

=> for nbcon consoles we would need "prefer_direct" or
   "force_direct" parameter.

> > IMHO, __wake_up_klogd() is the only location where we really need
> > to know if there are any messages for the legacy loop, for example,
> > when called from printk_deferred().
> >
> > It should not be needed in other situations because it there
> > is always __pr_flush() or console_unlock() which would flush
> > the legacy consoles directly anyway.
> >
> > => I suggest to
> >
> > 1. Remove @prefer_offload parameter from printk_get_console_flush_type
> >
> > 2. Update __wake_up_klogd() to check both ft.legacy_offload and
> >    ft.legacy_direct, like:
> >
> > 	printk_get_console_flush_type(&ft);
> > 	if (!ft.legacy_offload && !ft.legacy_direct)
> > 		val &= ~PRINTK_PENDING_OUTPUT;
> >
> >
> > NOTE: I actually suggested to use in vprintk_emit():
> >
> > 	printk_get_console_flush_type(&ft, deffer_legacy);
> >
> >       But it can be done even without this parameter. Like it
> >       is done in this version of the patchset.
> 
> I understand what you are saying. But I don't like it. :-)
> 
> It would mean that functions only interested in offloading must check
> direct. And that direct and offload are no longer exclusive. IMHO this
> is a hack. The whole point of printk_get_console_flush_type() is so that
> the flusher does not need any special code to figure out what to do.

I agree.

> If the flusher is only interested in offloaded flushing capabilities, it
> should be able to query that. We could wrap things into macros to make
> it clearer, but it might get a bit ugly with the efficience (depending
> on how well compilers can optimize the macro usage):
> 
> #define is_nbcon_offload_available() ({			\
> 	struct console_flush_type ft;			\
> 	printk_get_console_flush_type(&ft, true);	\
> 	ft.nbcon_offload;				\
> })
> 
> #define is_nbcon_atomic_available() ({			\
> 	struct console_flush_type ft;			\
> 	printk_get_console_flush_type(&ft, false);	\
> 	ft.nbcon_atomic;				\
> })
> 
> And then this code looks like:
> 
> if (prb_read_valid(prb, nbcon_seq_read(con), NULL)) &&
>     !is_nbcon_offload_available() &&
>     is_nbcon_atomic_available()) {
> 	/* flush directly using atomic callback */
> }

This is crazy. But where exactly do you need this?


> I have backported the printk_get_console_flush_type() macro to the
> atomic series for v7. I would like to keep @prefer_offload and I will
> try to add comments to clarify why the different query types are used.

Please, reconsider this.

I believe that the parameter "prefer_offload" adds more harm than good
because:

   + It is a non-sense for nbcon consoles. They always prefer offload
     except for emergency and panic. But emergency and panic is
     handled transparently by nbcon_get_default_prio().

   + It is confusing even for legacy consoles after introducing the
     kthread. There will be three types of offload:

	+ do console_lock()/unlock() in IRQ work
	+ wake kthread
	+ wake kthread in IRQ work

In fact, the meaning is rather "can_t_call_scheduler_code_a_safe_way".

Best Regards,
Petr

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ