lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Mon, 10 May 2021 15:41:31 +0800
From:   luojiaxing <luojiaxing@...wei.com>
To:     Petr Mladek <pmladek@...e.com>
CC:     <sergey.senozhatsky@...il.com>, <rostedt@...dmis.org>,
        <john.ogness@...utronix.de>, <linux-kernel@...r.kernel.org>,
        "Sergey Senozhatsky" <senozhatsky@...omium.org>,
        <linuxarm@...wei.com>, <bobo.shaobowang@...wei.com>
Subject: Re: [PATCH] printk: stop spining waiter when console resume to flush
 prb


On 2021/5/6 21:39, Petr Mladek wrote:
> On Thu 2021-05-06 16:00:26, Luo Jiaxing wrote:
>> Some threads still call printk() for printing when resume_console() is
>> being executed. In practice, the printk() is executed for a period of time
>> and then returned. The duration is determined by the number of prints
>> cached in the prb during the suspend/resume process. At the same time,
>> resume_console() returns quickly.
> The last sentence is a bit misleading. resume_console() returns
> quickly only when @console_owner was passed to another process.
>
>
>> Base on owner/waiter machanism, the frist one who fail to lock console will
>> become waiter, and start spining. When current owner finish print one
>> informance, if a waiter is waitting, owner will give up and let waiter
>> become a new owner. New owner need to flush the whole prb unitl prb empty
>> or another new waiter come and take the job from him.
>>
>> So the first waiter after resume_console() will take seconds to help to
> It need not to be the first waiter. The console_lock owner might be passed
> several times.
>
> But you have a point. Many messages might get accumulated when the
> console was suspended and any console_owner might spend a long time
> processing them. resume_console() seems to be always called in
> preemptible context, so it is safe to process all messages here.
>
>
>> flush prb, but driver which call printk() may be bothered by this. New
>> a flag to mark resume flushing prb. When the console resume, before the
>> prb is empty, stop to set a new waiter temporarily.
>> --- a/kernel/printk/printk.c
>> +++ b/kernel/printk/printk.c
>> @@ -287,6 +287,9 @@ EXPORT_SYMBOL(console_set_on_cmdline);
>>   /* Flag: console code may call schedule() */
>>   static int console_may_schedule;
>>   
>> +/* Flags: console flushing prb when resume */
>> +static atomic_t console_resume_flush_prb = ATOMIC_INIT(0);
>> +
>>   enum con_msg_format_flags {
>>   	MSG_FORMAT_DEFAULT	= 0,
>>   	MSG_FORMAT_SYSLOG	= (1 << 0),
>> @@ -1781,7 +1784,8 @@ static int console_trylock_spinning(void)
>>   	raw_spin_lock(&console_owner_lock);
>>   	owner = READ_ONCE(console_owner);
>>   	waiter = READ_ONCE(console_waiter);
>> -	if (!waiter && owner && owner != current) {
>> +	if (!waiter && owner && owner != current &&
>> +	    !atomic_read(&console_resume_flush_prb)) {
> atomic_set()/atomic_read() do not provide any memory barriers.
> IMHO, the atomic operations are not enough to serialize @console_owner
> and @console_resume_flush_prb manipulation.
>
> See below.
>
>>   		WRITE_ONCE(console_waiter, true);
>>   		spin = true;
>>   	}
>> @@ -2355,6 +2359,7 @@ void resume_console(void)
>>   	if (!console_suspend_enabled)
>>   		return;
>>   	down_console_sem();
>> +	atomic_set(&console_resume_flush_prb, 1);
>>   	console_suspended = 0;
>>   	console_unlock();
>>   }
>> @@ -2592,6 +2597,8 @@ void console_unlock(void)
>>   	raw_spin_unlock(&logbuf_lock);
>>   
>>   	up_console_sem();
>> +	if (atomic_read(&console_resume_flush_prb))
>> +		atomic_set(&console_resume_flush_prb, 0);
> This should be done under console_lock. Othwerwise,
> it is not serialized at all.
>
> Also there is one more return from console_unlock():
>
> 	if (!can_use_console()) {
> 		console_locked = 0;
> 		up_console_sem();
> 		return;
> 	}
>
> @console_resume_flush_prb must be cleared here as well.
> Otherwise, the next random console_unlock() caller will not
> be allowed to pass the console lock owner.
>
>
> OK, the above patch tries to tell console_trylock_spinning()
> that it should ignore console_owner even when set.
> @console_resume_flush_prb variable is set/read by different
> processes in parallel which makes it complicated.
>
> Instead, we should simply tell console_unlock() that it should not
> set console_owner in this case. The most strightforward
> way is to pass this via parameter.
>
> Such console_unlock() might be used even on another locations
> with preemptible context.
>
>
> What about the following patch?


Hi, Petr, I test your patch and I think it needs to make some 
modifications to fix the problem.


My test method is as follows:
Kernel thread A causes the console to enter suspend and then resume it 
10 seconds later.
Kernel thread B repeatedly invokes dev_info() for 15 seconds after the 
console suspend.

My expected result is that the call to dev_info() in kernel thread B 
does not block for a long time.

But the actual test result shows that kernel thread B is still blocked 
after console_resume.

I modified some of the code to meet expectations, Please check below.


>
>
> >From 574e844f512c9f450e64832f09cc389bc9915f83 Mon Sep 17 00:00:00 2001
> From: Petr Mladek <pmladek@...e.com>
> Date: Thu, 6 May 2021 12:40:56 +0200
> Subject: [PATCH] printk: Prevent softlockup when resuming console
>
> Many printk messages might get accumulated when consoles were suspended.
> They are proceed when console_unlock() is called in resume_console().
>
> The possibility to pass the console lock owner was added to reduce the risk
> of softlockup when too many messages were handled in an atomic context.
>
> Now, resume_console() is always in a preemptible context that is safe
> to handle all accumulated messages. The possibility to pass the console
> lock owner actually makes things worse. The new owner might be in an atomic
> context and might cause softlockup when processing all messages accumulated
> when the console was suspended.
>
> Create new console_unlock_preemptible() that will not allow to pass
> the console lock owner. As a result, all accumulated messages will
> be proceed in the safe preemptible process.
>
> Use it in resume_console(). But it might be used also on other locations
> where console lock was used in preemptible context and many messages
> might get accumulated when the process was sleeping.
>
> Reported-by: Luo Jiaxing <luojiaxing@...wei.com>
> Suggested-by: Luo Jiaxing <luojiaxing@...wei.com>
> Signed-off-by: Petr Mladek <pmladek@...e.com>
> ---
>   include/linux/console.h |  1 +
>   kernel/printk/printk.c  | 39 +++++++++++++++++++++++++++++++++------
>   2 files changed, 34 insertions(+), 6 deletions(-)
>
> diff --git a/include/linux/console.h b/include/linux/console.h
> index 20874db50bc8..0c444c6448e8 100644
> --- a/include/linux/console.h
> +++ b/include/linux/console.h
> @@ -174,6 +174,7 @@ extern struct console *console_drivers;
>   extern void console_lock(void);
>   extern int console_trylock(void);
>   extern void console_unlock(void);
> +extern void console_unlock_preemptible(void);
>   extern void console_conditional_schedule(void);
>   extern void console_unblank(void);
>   extern void console_flush_on_panic(enum con_flush_mode mode);
> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> index 421c35571797..a7e94c898646 100644
> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -2425,7 +2425,7 @@ void resume_console(void)
>   		return;
>   	down_console_sem();
>   	console_suspended = 0;
> -	console_unlock();
> +	console_unlock_preemptible();
>   }
>   
>   /**
> @@ -2534,10 +2534,8 @@ static inline int can_use_console(void)
>    * the output prior to releasing the lock.
>    *
>    * If there is output waiting, we wake /dev/kmsg and syslog() users.
> - *
> - * console_unlock(); may be called from any context.
>    */
> -void console_unlock(void)
> +void __console_unlock(bool spinning_enabled)
>   {
>   	static char ext_text[CONSOLE_EXT_LOG_MAX];
>   	static char text[CONSOLE_LOG_MAX];
> @@ -2546,6 +2544,7 @@ void console_unlock(void)
>   	struct printk_info info;
>   	struct printk_record r;
>   
> +
>   	if (console_suspended) {
>   		up_console_sem();
>   		return;
> @@ -2637,13 +2636,15 @@ void console_unlock(void)
>   		 * finish. This task can not be preempted if there is a
>   		 * waiter waiting to take over.
>   		 */
> -		console_lock_spinning_enable();
> +		if (spinning_enabled)
> +			console_lock_spinning_enable();


change to


+               if (!spinning_enabled) {
+                       raw_spin_lock(&console_owner_lock);
+                       WRITE_ONCE(console_waiter, true);
+                       raw_spin_unlock(&console_owner_lock);
+               }


set console_waiter true here can prevent other thread from spinning.


>   
>   		stop_critical_timings();	/* don't trace print latency */
>   		call_console_drivers(ext_text, ext_len, text, len);
>   		start_critical_timings();
>   
> -		if (console_lock_spinning_disable_and_check()) {
> +		if (spinning_enabled &&
> +		    console_lock_spinning_disable_and_check()) {
>   			printk_safe_exit_irqrestore(flags);
>   			return;
>   		}


add some code to release console_waiter before return


@@ -2592,6 +2595,12 @@ void __console_unlock(bool spinning_enabled)
                         cond_resched();
         }

+       if (!spinning_enabled) {
+               raw_spin_lock(&console_owner_lock);
+               WRITE_ONCE(console_waiter, false);
+               raw_spin_unlock(&console_owner_lock);
+       }
+
         console_locked = 0;

Thanks

Jiaxing


> @@ -2670,8 +2671,34 @@ void console_unlock(void)
>   	if (retry && console_trylock())
>   		goto again;
>   }
> +
> +/*
> + * Classic console_unlock() that might be called in any context.
> + *
> + * It allows to pass the console lock owner when processing the buffered
> + * messages. It helps to prevent soft lockups in an atomic context.
> + */
> +void console_unlock()
> +{
> +	__console_unlock(true);
> +}
>   EXPORT_SYMBOL(console_unlock);
>   
> +/*
> + * Variant of the console unlock that can be called only in preemptible
> + * context.
> + *
> + * All messages are processed in this safe context. It helps to prevent
> + * softlockups when the console lock owner was passed to an atomic context.
> + */
> +void console_unlock_preemptible()
> +{
> +	lockdep_assert_preemption_enabled();
> +
> +	__console_unlock(false);
> +}
> +EXPORT_SYMBOL(console_unlock_preemptible);
> +
>   /**
>    * console_conditional_schedule - yield the CPU if required
>    *

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ