lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20160106033830.GB781@swordfish>
Date:	Wed, 6 Jan 2016 12:38:30 +0900
From:	Sergey Senozhatsky <sergey.senozhatsky.work@...il.com>
To:	Jan Kara <jack@...e.cz>
Cc:	Sergey Senozhatsky <sergey.senozhatsky.work@...il.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Petr Mladek <pmladek@...e.cz>,
	KY Sri nivasan <kys@...rosoft.com>,
	Steven Rostedt <rostedt@...dmis.org>,
	linux-kernel@...r.kernel.org,
	Sergey Senozhatsky <sergey.senozhatsky@...il.com>
Subject: Re: [PATCH 1/7] printk: Hand over printing to console if printing
 too long

Hello,

On (01/05/16 15:48), Jan Kara wrote:
> > [..]
> > > cond_resched() does its job there, of course. well, a user process still can
> > > do a lot of call_console_drivers() calls. may be we can check who is calling
> > > console_unlock() and if we have "!printk_sync && !oops_in_progress" (or just printk_sync
> > > test) AND a user process then return from console_unlock() doing irq_work_queue()
> > > and set PRINTK_PENDING_OUTPUT pending bit, the way vprintk_emit() does it.
> > 
> > attached two patches, I ended up having on top of yours. just in case.
> > 
> >     printk: factor out can_printk_async()
> >     
> >     console_unlock() can be called directly or indirectly by a user
> >     space process, so it can end up doing call_console_drivers() loop,
> >     which will hold it from returning back to user-space from a syscall
> >     for unpredictable amount of time.
> >     
> >     Factor out can_printk_async() function, which queues an irq work and
> >     sets a PRINTK_PENDING_OUTPUT pending bit (if we can do async printk).
> >     vprintk_emit() already does it, add can_printk_async() call to
> >     console_unlock() for !PF_KTHREAD processes.
> 
> I'd be cautious about changing this userspace visible behavior. Someone may
> be relying on it... I agree that sometimes we can block userspace process
> in kernel for a long time (e.g. in my testing I often see syslog process
> doing the printing) but so far I didn't see / was notified about some real
> problem with this. So unless I see some real user issues with user
> processes doing printing for too long I would not touch this.

hm, interesting point.

/* random thoughts, I'm still on sick leave */

do we really have a user visible behaviour here that it really wants to have?
a task that does tty_open, for instance, hardly wants to end up doing a bunch
of call_console_drivers() calls in console_unlock(). it does look to me mostly
as unexpected side effect.

I probably can imagine someone writing a /usr/bin/flush_logs_to_serial
app that specifically depends on that behaviour, but that will require a bit
of hackery and trickery, since (seems) there is no syscall this app can call
that will perform *only* the required action:

	void force_flush_logs_to_serial(void)
	{
		console_lock();
		console_unlock();
	}

returning back to user space from syscalls quicker is a good thing, I'd
prefer user space apps to do less kernel job (by kernel job I mean
call_console_drivers() loop). well, at least on admittedly weird setups
that I have to deal with. but I may be missing something here.

some numbers

I added global `unsigned long k_ts, u_ts;' to accumulate time spent
in console_unlock() by PF_KTHREAD and !PF_KTHREAD correspondingly.

void console_unlock(void)
{
	...
	s_ts = local_clock();
	console_cont_flush(text, sizeof(text));
again:

	...

	if (wake_klogd)
		wake_up_klogd();

	e_ts = local_clock();
	if (time_after(e_ts, s_ts)) {
		if (current->flags & PF_KTHREAD)
			k_ts += (e_ts - s_ts);
		else
			u_ts += (e_ts - s_ts);
	}
}

and a procfs file to read the values

         unsigned long k = k_ts;
         unsigned long u = u_ts;
         unsigned long rem_nsec_k = do_div(k, 1000000000);
         unsigned long rem_nsec_u = do_div(u, 1000000000);

         return sprintf(buf, "kern:[%lu.%06lu] user:[%lu.%06lu]\n",
                         k, rem_nsec_k / 1000,
                         u, rem_nsec_u / 1000);

and w/o a lot of effort (no heavy printk message traffic)

$ cat /proc/1/time_in_console_unlock
kern:[4.241475] user:[4.077787]

that's user space spent almost the same amount of time to print kernel
messages as the kernel did on its own. which is hard to formulate as an
issue, it's just user space was doing for 4 seconds something it was not
really meant to do (at least from user space app developer's point of
view); so there is an unpredictable additional cost X added to some of
the syscalls.

	-ss

> >     printk: introduce console_sync_mode
> >     
> >     console_sync_mode() should be called early in panic() to switch
> >     printk from async mode to sync. Otherwise, STOP IPIs can arrive
> >     to other CPUs too late and those CPUs will see oops_in_progress
> >     being 0 again.
> 
> So as I wrote, I like this in principle but there are much more places
> calling console_verbose() and all of them want console_sync_mode() as well.
> So I prefer hiding the sync printing in console_verbose() and possibly
> renaming it to something better but I'm not sure renaming is worth it.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ