linux-kernel - Re: [PATCH] mm: don't warn about allocations which stall for too long

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20171102125528.4upg5eaw7cgxmak6@dhcp22.suse.cz>
Date:   Thu, 2 Nov 2017 13:55:28 +0100
From:   Michal Hocko <mhocko@...nel.org>
To:     Steven Rostedt <rostedt@...dmis.org>
Cc:     Tetsuo Handa <penguin-kernel@...ove.SAKURA.ne.jp>,
        akpm@...ux-foundation.org, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org, Cong Wang <xiyou.wangcong@...il.com>,
        Dave Hansen <dave.hansen@...el.com>,
        Johannes Weiner <hannes@...xchg.org>,
        Mel Gorman <mgorman@...e.de>, Petr Mladek <pmladek@...e.com>,
        Sergey Senozhatsky <sergey.senozhatsky@...il.com>,
        Vlastimil Babka <vbabka@...e.cz>,
        "yuwang.yuwang" <yuwang.yuwang@...baba-inc.com>
Subject: Re: [PATCH] mm: don't warn about allocations which stall for too long

On Tue 31-10-17 15:32:25, Steven Rostedt wrote:
> 
> Thank you for the perfect timing. You posted this the day after I
> proposed a new solution at Kernel Summit in Prague for the printk lock
> loop that you experienced here.
> 
> I attached the pdf that I used for that discussion (ignore the last
> slide, it was left over and I never went there).
> 
> My proposal is to do something like this with printk:
> 
> Three types of printk usages:
> 
> 1) Active printer (actively writing to the console).
> 2) Waiter (active printer, first user)
> 3) Sees active printer and a waiter, and just adds to the log buffer
>    and leaves.
> 
> (new globals)
> static DEFINE_SPIN_LOCK(console_owner_lock);
> static struct task_struct console_owner;
> static bool waiter;
> 
> console_unlock() {
> 
> [ Assumes this part can not preempt ]
> 
> 	spin_lock(console_owner_lock);
> 	console_owner = current;
> 	spin_unlock(console_owner_lock);
> 
> 	for each message
> 		write message out to console
> 
> 		if (READ_ONCE(waiter))
> 			break;
> 
> 	spin_lock(console_owner_lock);
> 	console_owner = NULL;
> 	spin_unlock(console_owner_lock);
> 
> [ preemption possible ]
> 
> 	[ Needs to make sure waiter gets semaphore ]
> 
> 	up(console_sem);
> }
> 
> 
> Then printk can have something like:
> 
> 
> 	if (console_trylock())
> 		console_unlock();
> 	else {
> 		struct task_struct *owner = NULL;
> 
> 		spin_lock(console_owner_lock);
> 		if (waiter)
> 			goto out;
> 		WRITE_ONCE(waiter, true);
> 		owner = READ_ONCE(console_owner);		
> 	out:
> 		spin_unlock(console_owner_lock);
> 		if (owner) {
> 			while (!console_trylock())	
> 				cpu_relax();
> 			spin_lock(console_owner_lock);
> 			waiter = false;
> 			spin_unlock(console_owner_lock);
> 		}
> 	}
> 
> This way, only one CPU spins waiting to print, and only if the
> console_lock owner is actively printing. If the console_lock owner
> notices someone is waiting to print, it stops printing as a waiter will
> always continue the prints. This will balance out the printks among all
> the CPUs that are doing them and no one CPU will get stuck doing all
> the printks.
> 
> This would solve your issue because the next warn_alloc() caller would
> become the waiter, and take over the next message in the queue. This
> would spread out the load of who does the actual printing, and not have
> one printer be stuck doing the work.

That also means that we would shift the overhead only to the first
waiter AFAIU. What if we have floods of warn_alloc from all CPUs?
Something that Tetsuo's test case simulates.

As Petr pointed out earlier in the thread, I do not think this is going
to help cosiderably and offloading to a kernel thread sounds like a
more viable option. It sounds really wrong to have printk basically
indeterministic wrt. call duration depending on who happens to do the
actual work. Either we make the call sync or completely offloaded to
a dedicated kernel thread and make sure that the buffer gets flushed
unconditionally on panic. I haven't been following all the printk
discussion recently so maybe this has been discussed and deemed not
viable for implementation details but in principle this should work.
-- 
Michal Hocko
SUSE Labs