Date:	Mon, 7 Mar 2016 11:52:48 +0100
From:	Jan Kara <jack@...e.cz>
To:	Sergey Senozhatsky <sergey.senozhatsky.work@...il.com>
Cc:	Jan Kara <jack@...e.cz>,
	Tetsuo Handa <penguin-kernel@...ove.SAKURA.ne.jp>,
	akpm@...ux-foundation.org, jack@...e.com, pmladek@...e.com,
	tj@...nel.org, linux-kernel@...r.kernel.org,
	sergey.senozhatsky@...il.com
Subject: Re: [RFC][PATCH v2 1/2] printk: Make printk() completely async

On Mon 07-03-16 19:12:33, Sergey Senozhatsky wrote:
> Hello,
> 
> On (03/07/16 09:22), Jan Kara wrote:
> [..]
> > > hm, just to note, none of the system-wide wqs seem to have a ->rescuer thread
> > > (WQ_MEM_RECLAIM).
> > > 
> > > [..]
> > > > Even if you use printk_wq with WQ_MEM_RECLAIM for the printing_work work item,
> > > > printing_work_func() will not be called until the current work item calls
> > > > schedule_timeout_*(). That will be an undesirable random delay. If you use
> > > > a dedicated kernel thread rather than a dedicated workqueue with WQ_MEM_RECLAIM,
> > > > we can avoid this random delay.
> > > 
> > > hm. yes, it seems that it may take some time until the workqueue wakes up
> > > a ->rescuer thread. need to look into this more.
> > 
> > Yes, it takes some time (0.1s or 2 jiffies) before the workqueue code gives
> > up on creating a worker process and wakes up the rescuer thread. However I
> > don't see that as a problem...
> 
> yes, that's why I asked Tetsuo whether his concern was the wq's MAYDAY timer
> delay. the two commits that Tetsuo pointed at earlier in the thread
> (373ccbe59270 and 564e81a57f97) solved the problem by switching to a
> WQ_MEM_RECLAIM wq. I've lightly tested OOM-kill on my desktop system and
> haven't spotted any printk delays (though a test on a desktop is not really
> representative, of course).
> 
> 
> the only thing that has grabbed my attention so far is
> 
> 	__this_cpu_or(printk_pending)
> 	irq_work_queue(this_cpu_ptr(&wake_up_klogd_work));
> 
> a _theoretical_ corner case here is when we have only one CPU doing a bunch
> of printk()s and this CPU disables IRQs in advance
> 	local_irq_save
> 	for (...)
> 		printk()
> 	local_irq_restore()
> 
> if no other CPU sees `printk_pending' then nothing will be printed until
> local_irq_restore() (assuming the IRQ-disable time is within the hardlockup
> detection threshold). if any other CPU concurrently executes printk then we
> are fine, but
> 	a) if none does -- then we probably have a small change in behaviour
> and
> 	b) on UP systems there is never another CPU to do the printing
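[For context, the pending/irq_work path quoted above expands to roughly the
sketch below, modeled on the kernel's wake_up_klogd() machinery; the flag
value and handler body here are illustrative assumptions, not the patch under
discussion. The key point is that the irq_work handler runs on the queueing
CPU only once that CPU re-enables interrupts.]

	#include <linux/irq_work.h>
	#include <linux/percpu.h>
	#include <linux/wait.h>

	#define PRINTK_PENDING_WAKEUP	0x01	/* illustrative flag value */

	static DEFINE_PER_CPU(int, printk_pending);
	static DECLARE_WAIT_QUEUE_HEAD(log_wait);

	static void wake_up_klogd_work_func(struct irq_work *irq_work)
	{
		/* runs from the IRQ work interrupt on the same CPU, i.e.
		 * only after that CPU executes local_irq_restore() in the
		 * loop above */
		int pending = __this_cpu_xchg(printk_pending, 0);

		if (pending & PRINTK_PENDING_WAKEUP)
			wake_up_interruptible(&log_wait);
	}

	static DEFINE_PER_CPU(struct irq_work, wake_up_klogd_work) = {
		.func = wake_up_klogd_work_func,
	};

	void wake_up_klogd(void)
	{
		/* the two quoted lines: mark pending, queue on this CPU;
		 * with IRQs off, the work sits until they come back on */
		__this_cpu_or(printk_pending, PRINTK_PENDING_WAKEUP);
		irq_work_queue(this_cpu_ptr(&wake_up_klogd_work));
	}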

So for UP systems we should disable async printing by default anyway, I
suppose; it is just pointless overhead there. So please just make printk_sync
default to true if !CONFIG_SMP.
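[For illustration, such a default could look roughly like the sketch below;
the printk_sync variable comes from the patch under discussion, the rest is
an assumption.]

	/* sketch: synchronous printk by default on UP, async on SMP */
	static bool printk_sync = !IS_ENABLED(CONFIG_SMP);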

When IRQs are disabled, you're right that we will have a change in behavior.
I don't see an easy way of avoiding delaying printk until IRQs get enabled. I
don't want to queue work directly because that creates the possibility of
lock recursion in queue_work(). And playing tricks with irq_works isn't easy
either - you cannot actually rely on any other CPU doing anything (even a
timer tick) because of NOHZ.

So if this turns out to be a problem in practice, using a kthread will
probably be the easiest solution.
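[Roughly, the kthread alternative could look like the sketch below; the
function and flag names are illustrative assumptions, not the actual
patchset. printk() itself would then set the flag and wake_up_process() the
thread.]

	#include <linux/console.h>
	#include <linux/err.h>
	#include <linux/kthread.h>
	#include <linux/sched.h>

	static struct task_struct *printk_thread;
	static bool printk_kthread_need_flush;	/* illustrative flag */

	static int printk_kthread_func(void *data)
	{
		while (!kthread_should_stop()) {
			set_current_state(TASK_INTERRUPTIBLE);
			if (!READ_ONCE(printk_kthread_need_flush))
				schedule();
			__set_current_state(TASK_RUNNING);

			WRITE_ONCE(printk_kthread_need_flush, false);
			/* console_unlock() flushes the pending log_buf
			 * messages to the registered consoles */
			console_lock();
			console_unlock();
		}
		return 0;
	}

	static int __init init_printk_kthread(void)
	{
		printk_thread = kthread_run(printk_kthread_func, NULL,
					    "printk");
		return PTR_ERR_OR_ZERO(printk_thread);
	}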

								Honza
-- 
Jan Kara <jack@...e.com>
SUSE Labs, CR
