linux-kernel - Re: [PATCH printk v1 06/18] printk: nobkl: Add acquire/release logic

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ZA9KUsaYuOt8qSf4@alley>
Date:   Mon, 13 Mar 2023 17:07:46 +0100
From:   Petr Mladek <pmladek@...e.com>
To:     John Ogness <john.ogness@...utronix.de>
Cc:     Sergey Senozhatsky <senozhatsky@...omium.org>,
        Steven Rostedt <rostedt@...dmis.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        linux-kernel@...r.kernel.org,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>
Subject: Re: [PATCH printk v1 06/18] printk: nobkl: Add acquire/release logic

On Thu 2023-03-02 21:02:06, John Ogness wrote:
> From: Thomas Gleixner <tglx@...utronix.de>
> 
> Add per console acquire/release functionality. The console 'locked'
> state is a combination of several state fields:
> 
>   - The 'locked' bit
> 
>   - The 'cpu' field that denotes on which CPU the console is locked
> 
>   - The 'cur_prio' field that contains the severity of the printk
>     context that owns the console. This field is used for decisions
>     whether to attempt friendly handovers and also prevents takeovers
>     from a less severe context, e.g. to protect the panic CPU.
> 
> The acquire mechanism comes with several flavours:
> 
>   - Straight forward acquire when the console is not contended
> 
>   - Friendly handover mechanism based on a request/grant handshake
> 
>     The requesting context:
> 
>       1) Puts the desired handover state (CPU nr, prio) into a
>          separate handover state
> 
>       2) Sets the 'req_prio' field in the real console state
> 
>       3) Waits (with a timeout) for the owning context to handover
> 
>     The owning context:
> 
>       1) Observes the 'req_prio' field set
> 
>       2) Hands the console over to the requesting context by
>          switching the console state to the handover state that was
>          provided by the requester
> 
>   - Hostile takeover
> 
>       The new owner takes the console over without handshake
> 
>       This is required when friendly handovers are not possible,
>       i.e. the higher priority context interrupted the owning context
>       on the same CPU or the owning context is not able to make
>       progress on a remote CPU.
> 
> The release is the counterpart which either releases the console
> directly or hands it gracefully over to a requester.
> 
> All operations on console::atomic_state[CUR|REQ] are atomic
> cmpxchg based to handle concurrency.
> 
> The acquire/release functions implement only minimal policies:
> 
>   - Preference for higher priority contexts
>   - Protection of the panic CPU
> 
> All other policy decisions have to be made at the call sites.
> 
> The design allows to implement the well known:
> 
>     acquire()
>     output_one_line()
>     release()
> 
> algorithm, but also allows to avoid the per line acquire/release for
> e.g. panic situations by doing the acquire once and then relying on
> the panic CPU protection for the rest.
> 
> --- a/include/linux/console.h
> +++ b/include/linux/console.h
> @@ -189,12 +201,79 @@ struct cons_state {
>  			union {
>  				u32	bits;
>  				struct {
> +					u32 locked	:  1;
> +					u32 unsafe	:  1;
> +					u32 cur_prio	:  2;
> +					u32 req_prio	:  2;
> +					u32 cpu		: 18;
>  				};
>  			};
>  		};
>  	};
>  };
>  
> +/**
> + * cons_prio - console writer priority for NOBKL consoles
> + * @CONS_PRIO_NONE:		Unused
> + * @CONS_PRIO_NORMAL:		Regular printk
> + * @CONS_PRIO_EMERGENCY:	Emergency output (WARN/OOPS...)
> + * @CONS_PRIO_PANIC:		Panic output
> + *
> + * Emergency output can carefully takeover the console even without consent
> + * of the owner, ideally only when @cons_state::unsafe is not set. Panic
> + * output can ignore the unsafe flag as a last resort. If panic output is
> + * active no takeover is possible until the panic output releases the
> + * console.
> + */
> +enum cons_prio {
> +	CONS_PRIO_NONE = 0,
> +	CONS_PRIO_NORMAL,
> +	CONS_PRIO_EMERGENCY,
> +	CONS_PRIO_PANIC,
> +};
> +
> +struct console;
> +
> +/**
> + * struct cons_context - Context for console acquire/release
> + * @console:		The associated console
> + * @state:		The state at acquire time
> + * @old_state:		The old state when try_acquire() failed for analysis
> + *			by the caller
> + * @hov_state:		The handover state for spin and cleanup
> + * @req_state:		The request state for spin and cleanup
> + * @spinwait_max_us:	Limit for spinwait acquire
> + * @prio:		Priority of the context
> + * @hostile:		Hostile takeover requested. Cleared on normal
> + *			acquire or friendly handover
> + * @spinwait:		Spinwait on acquire if possible
> + */
> +struct cons_context {
> +	struct console		*console;
> +	struct cons_state	state;
> +	struct cons_state	old_state;
> +	struct cons_state	hov_state;
> +	struct cons_state	req_state;

This looks quite complicated. I am still trying to understand the logic.

I want to be sure that we are on the same page. Let me try to
summarize my understanding and expectations:

1. The console has two state variables (atomic_state[2]):
       + CUR == state of the current owner
       + REQ == set when anyone else requests to take over the owner ship

   In addition, there are also priority bits in the state variable.
   Each state variable has cur_prio, req_prio.


2. There are 4 priorities. They describe the type of the context that is
   either owning the console or which would like to get the owner
   ship.

   These priorities have the following meaning:

       + NONE: when the console is idle

       + NORMAL: the console is owned by the kthread

       + EMERGENCY: The console is called directly from printk().
	   It is used when printing some emergency messages, like
	   WARN(), watchdog splat.

       + PANIC: when console is called directly but only from
	  the CPU that is handling panic().


3. The number of contexts:

       + The is one NORMAL context used by the kthread.

       + There might be eventually more EMERGENCY contexts running
	 in parallel. Usually there is only one but other CPUs
	 might still add more messages into the log buffer parallel.

	 The EMERGENCY context owner is responsible for flushing
	 all pending messages.

       + The might be only one PANIC context on the panic CPU.


4. The current owner sets a flag "unsafe" when it is not safe
   to take over the lock a hostile way.


Switching context:

We have a context with a well defined priority which tries
to get the ownership. There are few possibilities:

a) The console is idle and the context could get the ownership
   immediately.

   It is a simple cmpxchg of con->atomic_state[CUR].


b) The console is owned by anyone with a lower priority.
   The new caller should try to take over the lock a safe way
   when possible.

   It can be done by setting con->atomic_state[REQ] and waiting
   until the current owner makes him the owner in
   con->atomic_state[CUR].


c) The console is owned by anyone with a lower priority
   on the same CPU. Or the owner on an other CPU did not
   pass the lock withing the timeout.

   In this case, we could steal the lock. It can be done by
   writing to con->atomic_state[CUR].

   We could do this in EMERGENCY or PANIC context when the current
   owner is not in an "unsafe" context.

   We could do this at the end of panic (console_flush_in_panic())
   even when the current owner is in an "unsafe" context.


Common rule: The caller never tries to take over the lock
    from another owner of the same priority (?)


Current owner:

  + Must always do non-atomic operations in the "unsafe" context.

  + Must check if they still own the lock or if there is a request
    to pass the lock before manipulating the console state or reading
    the shared buffers.

  + Should pass the lock to a context with a higher priority.
    It must be done only in a "safe" state. But it might be in
    the middle of the record.


Passing the owner:

   + The current owner sets con->atomic_state[CUR] according
     to the info in con->atomic_state[REQ] and bails out.

   + The notices that it became the owner by finding its
     requested state in con->atomic_state[CUR]

   + The most tricky situation is when the current owner
     is passing the lock and the waiter is giving up
     because of the timeout. The current owner could pass
     the lock only when the waiter is still watching.



Other:

   + Atomic consoles ignore con->seq. Instead they store the lower
     32-bit part of the sequence number in the atomic_state variable
     at least on 64-bit systems. They use get_next_seq() to guess
     the higher 32-bit part of the sequence number.


Questions:

How exactly do we handle the early boot before kthreads are ready,
please? It looks like we just wait for the kthread. This looks
wrong to me.

Does the above summary describe the behavior, please?
Or does the code handle some situation another way?

> +	unsigned int		spinwait_max_us;
> +	enum cons_prio		prio;
> +	unsigned int		hostile		: 1;
> +	unsigned int		spinwait	: 1;
> +};
> +
> --- a/kernel/printk/printk_nobkl.c
> +++ b/kernel/printk/printk_nobkl.c
> +/**
> + * cons_check_panic - Check whether a remote CPU is in panic
> + *
> + * Returns: True if a remote CPU is in panic, false otherwise.
> + */
> +static inline bool cons_check_panic(void)
> +{
> +	unsigned int pcpu = atomic_read(&panic_cpu);
> +
> +	return pcpu != PANIC_CPU_INVALID && pcpu != smp_processor_id();
> +}

This does the same as abandon_console_lock_in_panic(). I would
give it some more meaningful name and use it everywhere.

What about other_cpu_in_panic() or panic_on_other_cpu()?

Best Regards,
Petr