linux-kernel - Re: [RFC][PATCH 2/4] printk: offload printing from wake_up_klogd_work

Message-ID: <20170320160928.GT3977@pathway.suse.cz>
Date:   Mon, 20 Mar 2017 17:09:28 +0100
From:   Petr Mladek <pmladek@...e.com>
To:     Sergey Senozhatsky <sergey.senozhatsky.work@...il.com>
Cc:     Steven Rostedt <rostedt@...dmis.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Peter Zijlstra <peterz@...radead.org>,
        "Rafael J . Wysocki" <rjw@...ysocki.net>,
        linux-kernel@...r.kernel.org,
        Sergey Senozhatsky <sergey.senozhatsky@...il.com>
Subject: Re: [RFC][PATCH 2/4] printk: offload printing from
 wake_up_klogd_work_func()

On Sat 2017-03-18 18:57:39, Sergey Senozhatsky wrote:
> Hello Petr,
> 
> On (03/17/17 13:19), Petr Mladek wrote:
> [..]
> > A solution might be to rename the variable to something like
> > printk_pending_output, always set it in vprintk_emit() and
> > clear it in console_unlock() when there are no pending messages.
> 
> believe it or not, I thought that I set printk_kthread_need_flush_console
> to `true' unconditionally. probably I did so in one of the previous patch
> set iterations. weird. I agree that doing this makes sense. thanks for
> bringing this up.
> 
> ....
> 
> I don't want that printk_kthread_need_flush_console to exist. instead,
> I think, I want to move printk_pending out of per-cpu memory and use a
> global printk_pending. set PRINTK_PENDING_OUTPUT bit to true in
> vprintk_emit(), clear it in console_unlock(). and make both printk_kthread
> scheduling condition and console_unlock() retry path depend on
> `printk_pending == 0' being true.

I like the idea. The things are closely related.
 
> something like below (the code is ugly and lacks a ton of barriers, etc.
> etc.)

Sigh, I wanted to add a few comments and it got me deeper than I wanted.

I am sorry if some of my comments are obvious. I know that the
patch was a draft and you probably were aware of many of the
problems.

Anyway, it might make sense to do the change in more steps.
One thing is removing the per-cpu variable. Another thing is changing
the logic and setting/clearing the state variable in different
situations. Yet another thing is adding the support for
kthread, etc. I am afraid that it might be a patchset on its own.


> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> index 601a9ef6db89..a0b231f49052 100644
> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -439,8 +439,6 @@ static char *log_buf = __log_buf;
>  static u32 log_buf_len = __LOG_BUF_LEN;
>  
>  static struct task_struct *printk_kthread __read_mostly;
> -/* When `true' printing thread has messages to print. */
> -static bool printk_kthread_need_flush_console;
>  /*
>   * We can't call into the scheduler (wake_up() printk kthread), for example,
>   * during suspend/kexec. This temporarily switches printk to old behaviour.
> @@ -451,6 +449,13 @@ static int printk_kthread_disable __read_mostly;
>   * it doesn't go back to 0.
>   */
>  static bool printk_emergency __read_mostly;
> +/*
> + * Delayed printk version, for scheduler-internal messages:
> + */
> +#define PRINTK_PENDING_WAKEUP	0x01
> +#define PRINTK_PENDING_OUTPUT	0x02
> +
> +static int printk_pending = 0;

Something tells me that we need to use atomic_t. Otherwise, we could
not safely manipulate the bits without a lock.

An alternative solution would be to use two separate variables.
This might make the code easier to read. I think that they
were combined only to save space in the per-CPU area.
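
A minimal userspace sketch of the atomic variant, assuming C11
<stdatomic.h> as a stand-in for the kernel's atomic_t. The variable
printk_pending_sketch and the helpers pending_set(), pending_clear()
and pending_test() are hypothetical names, not part of the patch:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define PRINTK_PENDING_WAKEUP	0x01
#define PRINTK_PENDING_OUTPUT	0x02

/* Hypothetical global state word; the kernel would use atomic_t. */
static atomic_int printk_pending_sketch;

/* Set bits without a lock; concurrent setters cannot lose updates. */
static void pending_set(int bits)
{
	atomic_fetch_or(&printk_pending_sketch, bits);
}

/* Clear bits without a lock, leaving all other bits untouched. */
static void pending_clear(int bits)
{
	atomic_fetch_and(&printk_pending_sketch, ~bits);
}

/* Report whether any of the given bits is currently set. */
static bool pending_test(int bits)
{
	return (atomic_load(&printk_pending_sketch) & bits) != 0;
}

/* Usage: one path sets OUTPUT, another sets WAKEUP, then OUTPUT is cleared. */
static void pending_example(void)
{
	pending_set(PRINTK_PENDING_OUTPUT);
	pending_set(PRINTK_PENDING_WAKEUP);
	pending_clear(PRINTK_PENDING_OUTPUT);
	assert(pending_test(PRINTK_PENDING_WAKEUP));
	assert(!pending_test(PRINTK_PENDING_OUTPUT));
}
```

A plain "printk_pending |= bit" is a separate load, modify and store, so
two CPUs updating different bits may overwrite each other. The atomic
read-modify-write operations avoid that without taking a lock.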


>  static inline bool printk_kthread_enabled(void)
>  {
> @@ -1806,6 +1811,7 @@ asmlinkage int vprintk_emit(int facility, int level,
>  
>  	printed_len += log_output(facility, level, lflags, dict, dictlen, text, text_len);
>  
> +	printk_pending |= PRINTK_PENDING_OUTPUT;
>  	logbuf_unlock_irqrestore(flags);
>
>  	/* If called from the scheduler, we can not call up(). */
> @@ -1819,8 +1825,6 @@ asmlinkage int vprintk_emit(int facility, int level,
>  		 * schedulable context.
>  		 */
>  		if (printk_kthread_enabled()) {
> -			printk_kthread_need_flush_console = true;
> -
>  			printk_safe_enter_irqsave(flags);
>  			wake_up_process(printk_kthread);
>  			printk_safe_exit_irqrestore(flags);
> @@ -2220,10 +2224,11 @@ void console_unlock(void)
>  	static char text[LOG_LINE_MAX + PREFIX_MAX];
>  	static u64 seen_seq;
>  	unsigned long flags;
> -	bool wake_klogd = false;
> -	bool do_cond_resched, retry;
> +	bool wake_klogd;
> +	bool do_cond_resched, retry = false;
>  
>  	if (console_suspended) {
> +		printk_pending &= ~PRINTK_PENDING_OUTPUT;

Hmm, this is pretty non-intuitive. I guess that it is needed to
avoid a busy cycle in the printk kthread?

I can't find a better solution. We should at least use a name
that makes more sense, e.g. PRINTK_POKE_CONSOLE.

>  		up_console_sem();
>  		return;
>  	}
> @@ -2242,6 +2247,8 @@ void console_unlock(void)
>  	console_may_schedule = 0;
>  
>  again:
> +	wake_klogd = printk_pending & PRINTK_PENDING_WAKEUP;
> +	printk_pending = 0;

This might be racy. PRINTK_PENDING_WAKEUP is set without
a lock in bust_spinlocks() via wake_up_klogd(). The above
code reads and clears the state non-atomically.
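
One way to close such a window is to read and clear the bit in a single
atomic operation. A minimal userspace sketch, assuming C11 atomics in
place of the kernel's atomic_t; pending_state and take_wakeup() are
hypothetical names used only for illustration:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define PRINTK_PENDING_WAKEUP	0x01
#define PRINTK_PENDING_OUTPUT	0x02

/* Hypothetical global state word. */
static atomic_int pending_state;

/*
 * Atomically clear PRINTK_PENDING_WAKEUP and report whether it was set.
 * A bit set by another CPU is either seen here or survives for the next
 * caller; it cannot get lost between a separate read and write.
 */
static bool take_wakeup(void)
{
	int old = atomic_fetch_and(&pending_state, ~PRINTK_PENDING_WAKEUP);

	return (old & PRINTK_PENDING_WAKEUP) != 0;
}

/* Usage: the first caller consumes the request, the second sees none. */
static void take_wakeup_example(void)
{
	atomic_store(&pending_state,
		     PRINTK_PENDING_WAKEUP | PRINTK_PENDING_OUTPUT);
	assert(take_wakeup());
	assert(!take_wakeup());
}
```

Note that only the WAKEUP bit is cleared in this sketch; the OUTPUT bit
is left alone.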


>  	/*
>  	 * We released the console_sem lock, so we need to recheck if
>  	 * cpu is online and (if not) is there at least one CON_ANYTIME
> @@ -2330,15 +2337,16 @@ void console_unlock(void)
>  	 * flush, no worries.
>  	 */
>  	raw_spin_lock(&logbuf_lock);
> -	retry = console_seq != log_next_seq;
> +	if (printk_pending != 0 || console_seq != log_next_seq)

printk_pending != 0 is also true when PRINTK_PENDING_WAKEUP is set.
I would do it the other way. I would clear PRINTK_PENDING_OUTPUT
when console_seq == log_next_seq and keep the check as is here.
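
A rough userspace sketch of that alternative, assuming C11 atomics and
plain integers standing in for the kernel state. The names
pending_state, console_seq_sketch, log_next_seq_sketch and
console_needs_retry() are hypothetical; in the kernel the comparison
would run under logbuf_lock:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define PRINTK_PENDING_WAKEUP	0x01
#define PRINTK_PENDING_OUTPUT	0x02

static atomic_int pending_state;
static uint64_t console_seq_sketch;	/* next record the console prints */
static uint64_t log_next_seq_sketch;	/* next record to be stored */

/*
 * The retry decision depends only on the sequence numbers, as in the
 * original code. PRINTK_PENDING_OUTPUT is cleared once the console has
 * caught up, so a pending WAKEUP bit never forces a retry.
 */
static bool console_needs_retry(void)
{
	bool retry = console_seq_sketch != log_next_seq_sketch;

	if (!retry)
		atomic_fetch_and(&pending_state, ~PRINTK_PENDING_OUTPUT);

	return retry;
}

/* Usage: console caught up, WAKEUP pending, so no retry is requested. */
static void retry_example(void)
{
	atomic_store(&pending_state,
		     PRINTK_PENDING_WAKEUP | PRINTK_PENDING_OUTPUT);
	console_seq_sketch = 5;
	log_next_seq_sketch = 5;
	assert(!console_needs_retry());
	assert(atomic_load(&pending_state) == PRINTK_PENDING_WAKEUP);
}
```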

> +		retry = true;
>  	raw_spin_unlock(&logbuf_lock);
>  	printk_safe_exit_irqrestore(flags);
>  
> -	if (retry && console_trylock())
> -		goto again;
> -
>  	if (wake_klogd)
>  		wake_up_klogd();
> +
> +	if (retry && console_trylock())
> +		goto again;

Why do you actually modify the logic for klogd()?
It might make sense but it is questionable. For example,
klogd() will need logbuf_lock as well. It might fight over
it with the console when the again target is used.
I would do it in a separate patch and probably not
in this patchset.


>  }
>  EXPORT_SYMBOL(console_unlock);
>  
> @@ -2722,19 +2730,9 @@ static int __init printk_late_init(void)
>  late_initcall(printk_late_init);
>  
>  #if defined CONFIG_PRINTK
> -/*
> - * Delayed printk version, for scheduler-internal messages:
> - */
> -#define PRINTK_PENDING_WAKEUP	0x01
> -#define PRINTK_PENDING_OUTPUT	0x02
> -
> -static DEFINE_PER_CPU(int, printk_pending);

BTW: wake_up_klogd_work does not need to be per-CPU either.
irq_work infrastructure heavily uses per-CPU variables.
But a global struct irq_work is safe, see irq_work_claim().

> 
> 
> [..]
> > If I remember correctly, you were not much happy with this
> > solution because it did spread the logic. I think that you did not
> > believe that it was worth fixing the second problem.
> 
> hm, I think Jan Kara was the first one who said that we
> are overcomplicating the whole thing... or may be it was me.
> don't deny it either.

I do not remember either :-) Anyway, it really looks more
complicated than I thought.

I think that some clean up and optimization of the printk_pending
stuff is needed and worth it. I am just not sure whether to do it
before or after the printk kthread patchset.

I would slightly prefer to clean the printk_pending stuff first.
It might delay printk kthread patchset a bit but it will be cleaner.

Best Regards,
Petr