lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Wed, 23 Mar 2016 09:37:25 +0900
From:	Sergey Senozhatsky <sergey.senozhatsky.work@...il.com>
To:	Petr Mladek <pmladek@...e.com>
Cc:	Sergey Senozhatsky <sergey.senozhatsky@...il.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Jan Kara <jack@...e.com>, Tejun Heo <tj@...nel.org>,
	Tetsuo Handa <penguin-kernel@...ove.SAKURA.ne.jp>,
	linux-kernel@...r.kernel.org,
	Byungchul Park <byungchul.park@....com>,
	Sergey Senozhatsky <sergey.senozhatsky.work@...il.com>,
	Jan Kara <jack@...e.cz>
Subject: Re: [RFC][PATCH v6 1/2] printk: Make printk() completely async

Hello Petr,

On (03/22/16 14:11), Petr Mladek wrote:
[..]
> > -	/* cpu currently holding logbuf_lock in this function */
> > -	static unsigned int logbuf_cpu = UINT_MAX;
> > +	bool in_panic = console_loglevel == CONSOLE_LOGLEVEL_MOTORMOUTH;
> > +	bool sync_print = printk_sync;
> 
> I still think that this local variable adds more mess than good.
> Please, rename it to do_printk_sync at least. It will a poor human
> to distinguish it from the global one ;-)

ok, I'll take a look :)

> > +	 * Set printing_func() sleep condition early, under the @logbuf_lock.
> > +	 * So printing kthread (if RUNNING) will go to console_lock() and spin
> > +	 * on @logbuf_lock.
> > +	 */
> > +	if (!printk_sync)
> > +		need_flush_console = true;
> 
> We set this variable for each call and also when printk_kthread is
> NULL or when sync_printk is false.

hm, yes. (printk_kthread && !need_flush_console) makes more sense.
so we it doesn't get re-dirty if already set.

> We migth want to clear it also from console_unlock(). I think that
> a good place would be in the check:
> 
> 	raw_spin_lock(&logbuf_lock);
> 	retry = console_seq != log_next_seq;
> 	raw_spin_unlock_irqrestore(&logbuf_lock, flags);

hm, what's wrong with clearing it in printk_kthread  printing function?

> > +	if (!sync_print) {
> > +		if (in_sched) {
> > +			/*
> > +			 * @in_sched messages may come too early, when we don't
> > +			 * yet have @printk_kthread. We can't print deferred
> > +			 * messages directly, because this may deadlock, route
> > +			 * them via IRQ context.
> > +			 */
> > +			__this_cpu_or(printk_pending,
> > +					PRINTK_PENDING_OUTPUT);
> > +			irq_work_queue(this_cpu_ptr(&wake_up_klogd_work));
> > +		} else if (printk_kthread && !in_panic) {
> > +			/* Offload printing to a schedulable context. */
> > +			wake_up_process(printk_kthread);
> > +		} else {
> > +			sync_print = true;
> > +		}
> > +	}
> >  	local_irq_restore(flags);
> 
> Please, what is the exact reason to call the above stuff before
> we release IRQs here? I guess that it is related to the discussions
> about possible lock debugging, infinite loops, ...
> 
> I wonder how the disabled IRQs help here. It is very likely that I
> miss something.

with IRQs enabled we have preemption enabled, so this thing

        __this_cpu_or(printk_pending,
                        PRINTK_PENDING_OUTPUT);
        irq_work_queue(this_cpu_ptr(&wake_up_klogd_work));

can set pending and queue irq_work on different CPUs.

using "this_cpu = smp_processor_id()" which is taken right after
local_irq_save(flags) at the beginning of vprintk_emit() is not an
option - once we enable preemption CPUs are free to go offline. so
this thing wants to be under preemption disable(), and local_irq_save()
gives us it.

well, I can do

        + preempt_disable()
        + local_irq_restore()

        if (!sync_print) {
        ...
                __this_cpu_or()
                irq_work_queue()
        ...
        }
        - local_irq_restore()
        + preempt_enable()

but

> In each case, irq_work_queue() is lock-less and was used with IRQs
> enabled before.
> 
> So, it might be related to wake_up_process(). It takes a lock but
> using
> 
>        raw_spin_lock_irqsave(&p->pi_lock, flags);

wake_up_process disables IRQs in the beginning anyway.
so we have

        local_irq_restore()
        local_irq_save()

which is sort of nasty. disabling IRQs can be a bit expensive, not
that printk is that hot path, but I wanted to avoid additional latencies
that can happen due to IRQ enable-disable ping-pong.

> , so it disables IRQs in the critical section. Do we need to guard
> it in between?
> 
> I think that you actually wanted to disable lockdep or any other lock
> debugging, instead.

preemption.

well, disabling lockdep is may be a good thing to do as well...

> 	if (in_sched) {
> 		/*
> 		 * @in_sched messages may come too early, when we don't
> 		 * yet have @printk_kthread. We can't print deferred
> 		 * messages directly, because this may deadlock, route
> 		 * them via IRQ context.
> 		 */
> 		__this_cpu_or(printk_pending,
> 			      PRINTK_PENDING_OUTPUT);
> 		irq_work_queue(this_cpu_ptr(&wake_up_klogd_work));
> 		goto out;
> 	}
> 
> 	/* Avoid printk recursion */
> 	lockdep_off();
> 
> 	if  (printk_kthread && !in_panic) {
> 		/* Offload printing to a schedulable context. */
> 		wake_up_process(printk_kthread);
> 		goto out_lockdep;
> 	} else {
> 		/*
> 		 * Try to acquire and then immediately release the console
> 		 * semaphore.  The release will print out buffers and wake up
> 		 * /dev/kmsg and syslog() users.
> 		 */
> 		if (console_trylock())
> 			console_unlock();
> 	}
> 
> 	lockdep_on();
> 

ok, I'll take a look.

eventually (after 0003) vprintk_emit() is

	if (in_sched) {
		__this_cpu_or(printk_pending,
				PRINTK_PENDING_OUTPUT);
		irq_work_queue(this_cpu_ptr(&wake_up_klogd_work));
	}
	local_irq_restore(flags);
	if (!in_sched) {
		lockdep_off();
		if (console_trylock())
			console_unlock();
		lockdep_on();
	}

> I do not say that it is a "dream-of-like" code. One important thing for
> me is that it does not use "sync_printk" variable.
> 
> You original code modified "sync_printk" according to "in_sched" and
> "in_panic" variables earlier in vprintk_emit. Then it again checked
> all three variables here which produced strange twists in my head ;-)

yeah, I can see your point. a bunch of goto-s can probably help here.

	-ss

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ