linux-kernel - Re: [PATCH v12 0/3] printk: Make printk() completely async


Message-ID: <20160713111425.GG32373@pathway.suse.cz>
Date:	Wed, 13 Jul 2016 13:14:25 +0200
From:	Petr Mladek <pmladek@...e.com>
To:	Sergey Senozhatsky <sergey.senozhatsky.work@...il.com>
Cc:	Andrew Morton <akpm@...ux-foundation.org>,
	Jan Kara <jack@...e.com>, Tejun Heo <tj@...nel.org>,
	Tetsuo Handa <penguin-kernel@...ove.SAKURA.ne.jp>,
	linux-kernel@...r.kernel.org,
	Byungchul Park <byungchul.park@....com>,
	Sergey Senozhatsky <sergey.senozhatsky@...il.com>
Subject: Re: [PATCH v12 0/3] printk: Make printk() completely async

On Wed 2016-07-13 16:42:42, Sergey Senozhatsky wrote:
> Hello,
> 
> On (07/12/16 18:28), Petr Mladek wrote:
> > I have tried the following:
> > 
> >        int a,b;
> > 
> >        for (a = 0; a < 1000; a++) {
> >                for (b = 0; b < 10; b++) {
> >                        pr_cont("item%04d", a*10 + b);
> >                }
> >                pr_cont("\n");
> >        }
> > 
> > 
> > [ 4913.000298] item8740item8741item8742item8743item8744item8745item8746item8747item8748item8749
> > [ 4913.000310] item8750item8751item8752item8753item8754item8755item8756item8757item8758item8759
> > [ 4913.000327] item8760item8761item8762item8763item8764item8765item8766item8767item8768item8769
> > [ 4913.000342] item8770item8771item8772item8773item8774item8775item8776item8777item8778item8779
> > [ 4913.000356] item8780
> > [ 4913.000357] item8781
> > [ 4913.000358] item8782
> > [...]
> 
> hm.. so are there any 'concurrent' printk()-s coming from other CPUs
> that are not getting printed on the console (because of loglevel restrictions),
> but still screw up the cont buffer?.... otherwise, my expectation was that in
> this particular case cpu issues a new pr_cont() only after it has printed
> the current message via call_console_drivers()->write(). so partially flushed
> cont buffer was not really expected to happen. I was wrong, obviously.

To be honest, I am not 100% sure what happens here. One theory is
that the printk kthread is woken up because of some previous
non-continuous message. It steals console_lock() and partially flushes
the cont buffer. In this case, the pr_cont() calls are not able to get
console_lock() and basically work in the async mode. The pr_cont()
calls have to store each piece separately because the partially
flushed cont buffer is blocked until it is flushed completely.
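
To make the theory above concrete, here is a small userspace model of
the suspected sequence. It is only a sketch, loosely based on how the
cont buffer appears to behave; struct cont_model and all cont_model_*()
names are hypothetical and are not the kernel code. It assumes that a
line which was partially printed to the console gets sealed when it
ends, and that the buffer rejects new fragments until the console has
printed the rest.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define CONT_MODEL_SIZE 128

/* Hypothetical userspace model of the cont buffer; not kernel code. */
struct cont_model {
	char buf[CONT_MODEL_SIZE];
	size_t len;			/* bytes buffered, 0 means unused */
	size_t cons;			/* bytes already written to the console */
	bool flushed;			/* sealed, waiting for the console */
	unsigned int separate_records;	/* fragments stored on their own */
};

/* Append a fragment; false means it had to be stored as its own record. */
static bool cont_model_add(struct cont_model *c, const char *text)
{
	size_t n;

	assert(c != NULL && text != NULL);
	n = strlen(text);

	/* A sealed or full buffer cannot take the fragment. */
	if ((c->len && c->flushed) || c->len + n > sizeof(c->buf)) {
		c->separate_records++;
		return false;
	}

	memcpy(c->buf + c->len, text, n);
	c->len += n;
	return true;
}

/* The console (e.g. the printing kthread) writes up to n more bytes. */
static void cont_model_console_write(struct cont_model *c, size_t n)
{
	assert(c != NULL);
	c->cons = (c->cons + n > c->len) ? c->len : c->cons + n;

	/* A sealed buffer becomes reusable once it is fully printed. */
	if (c->flushed && c->cons == c->len) {
		c->len = 0;
		c->cons = 0;
		c->flushed = false;
	}
}

/* End of line: reuse the buffer at once only if nothing was printed yet. */
static void cont_model_flush(struct cont_model *c)
{
	assert(c != NULL);
	if (c->cons)
		c->flushed = true;	/* blocked until the console finishes */
	else
		c->len = 0;		/* stored as one record, reusable */
}
```

In this model, a console write that lands between two pr_cont() calls
followed by the end of the line is enough to make every later fragment
become a separate record until the console catches up, which would
match the one-item-per-line output seen above.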

Hmm, the strange thing is that I see this problem even when I force
the global synchronous mode with

    echo Y >  /sys/module/printk/parameters/synchronous


I need to dig more into it.

> > Another solution would be to remember cont.cons outside
> > of struct cont. Then the cont buffer could be reused
> > immediately.
> 
> just an idea.
> ... or try to make KERN_CONT SMP-safe. there are many pr_cont() call
> sites. ACPI is one notable example. the others include OOM, some cgroup
> related output (or... was it memcg), etc., etc.
> 
> so we *may be* can have a per-cpu cont buffer and add new API
> pr_cont_begin()/pr_cont_end(), that would disable preemption.
> 
> 
> +	pr_cont_begin() /* preempt_disable() */
> 
> 	for (.....)
> 		pr_cont("%pS ....);
> 
> +	pr_cont_end() /* preempt_enable() */
>
> pr_cont_end() also can flush this CPU's cont buffer and store the log
> line. this will probably break very long cont lines (not sure if we
> have any of those though). and may be flush_on_panic() would have to
> do some extra work iterating each cpu.

It would work but I am a bit scared by the complexity. I think
that we should find a compromise between complexity and
reliability.
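
For reference, the proposed begin/end API could look roughly like the
userspace sketch below. It is an illustration under stated assumptions
only: a thread-local buffer stands in for the per-CPU cont buffer,
there is no userspace equivalent of preempt_disable(), and all
sketch_*() names and sketch_last_line are hypothetical.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define SKETCH_LINE_MAX 256

/* Thread-local storage stands in for a per-CPU cont buffer. */
static _Thread_local char sketch_buf[SKETCH_LINE_MAX];
static _Thread_local size_t sketch_len;
static _Thread_local int sketch_open;

/* Stand-in for the log store: holds the last completed line. */
static char sketch_last_line[SKETCH_LINE_MAX];

/* Would map to pr_cont_begin(); the kernel version would disable preemption. */
static void sketch_cont_begin(void)
{
	assert(!sketch_open);
	sketch_open = 1;
	sketch_len = 0;
	sketch_buf[0] = '\0';
}

/* Would map to pr_cont(): append a fragment to the private buffer. */
static void sketch_cont(const char *fmt, ...)
{
	size_t room = sizeof(sketch_buf) - sketch_len;
	va_list args;
	int n;

	assert(sketch_open);
	va_start(args, fmt);
	n = vsnprintf(sketch_buf + sketch_len, room, fmt, args);
	va_end(args);

	if (n < 0)
		return;
	/* Overlong lines are truncated, one of the concerns raised above. */
	if ((size_t)n >= room)
		sketch_len = sizeof(sketch_buf) - 1;
	else
		sketch_len += (size_t)n;
}

/* Would map to pr_cont_end(): store the whole line as a single record. */
static void sketch_cont_end(void)
{
	assert(sketch_open);
	memcpy(sketch_last_line, sketch_buf, sketch_len + 1);
	sketch_open = 0;
}
```

Even this toy version needs an open/closed state, a truncation policy,
and a store step in the end function. The kernel version would
additionally have to handle flushing every CPU's buffer on panic.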

> > The saved position will be used either for the actual
> > cont buffer or for the very first message in the ring
> > buffer. The situation might be detected by comparing
> > console_seq, log_next_seq, and log_first_seq.
> 
> we still can have pr_cont() happening on several cpus simultaneously.
> console_seq is getting reset, when we register a new CON_PRINTBUFFER
> console.

I agree that mixing parts of lines from different processes/cpus
is not ideal. But it is not much worse than mixing whole lines.
We should make a best effort but we do not need to be perfect.

I will keep scratching my head.

Best Regards,
Petr