lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Sun, 31 Dec 2017 10:44:48 +0900
From:   Sergey Senozhatsky <sergey.senozhatsky@...il.com>
To:     Tetsuo Handa <penguin-kernel@...ove.SAKURA.ne.jp>
Cc:     sergey.senozhatsky.work@...il.com, rostedt@...dmis.org,
        tj@...nel.org, pmladek@...e.com, sergey.senozhatsky@...il.com,
        jack@...e.cz, akpm@...ux-foundation.org, peterz@...radead.org,
        rjw@...ysocki.net, pavel@....cz, linux-kernel@...r.kernel.org
Subject: Re: [RFC][PATCHv6 00/12] printk: introduce printing kernel thread

Hello,

On (12/29/17 22:59), Tetsuo Handa wrote:
[..]
> Just an idea: Do we really need to use a semaphore for console_sem?
> 
> Is it possible to replace it with a spinlock? Then, I feel that we can write
> to consoles from non-process context (i.e. soft or hard IRQ context), with
> write only one log (or even one byte) at a time (i.e. write one log from one
> context, and defer all remaining logs by "somehow" scheduling for calling
> that context again).
> 
> Since process context might fail to allow printk kernel thread to run for
> long period due to many threads waiting for run, I thought that interrupt
> context might fit better if we can "somehow" chain interrupt contexts.


that's a good question. printk(), indeed, does not care that much. but
the whole thing is more complex. I can copy-paste (sorry for that) one
of my previous emails to give a brief (I'm sure the description is
incomplete) idea.

====================

the real purpose of console_sem is to synchronize all events that can
happen to VT, fbcon, TTY, video, etc. and there are many events that
can happen to VT/fbcon. and some of those events can sleep - that's
where printk() can suffer. and this is why printk() is not different
from any  other console_sem users -- printk() uses that lock in order
to synchronize its own events: to have only one printing CPU, to prevent
concurrent console drivers list modification, to prevent concurrent consoles
modification, and so on.

let's take VT and fbcon for simplicity.

the events are.

1) IOCTL from user space
   they may involve things like resizing, scrolling, rotating,

   take a look at drivers/tty/vt/vt_ioctl.c  vt_ioctl().
   we need to take console_sem there because we modify the very
   important things - size, font maps, etc. we don't want those changes
   to mess with possibly active print outs happening from another CPUs.

2) timer events and workqueue events
   even cursor blinking must take console_sem. because it modifies the
   state of console/screen. take a look at drivers/video/fbdev/core/fbcon.c
   show_cursor_blink() for example.

   and take a look at fbcon_add_cursor_timer() in drivers/video/fbdev/core/fbcon.c

3) foreground console may change. video driver may be be initialized and
   registered.

4) PM events
   for exaple, drivers/video/fbdev/aty/radeon_pm.c   radeonfb_pci_suspend()

5) TTY write from user space
   when user space wants to write anything to console it goes through
   nTTY -> con_write() -> do_con_write().

 CPU: 1 PID: 1 Comm: systemd
 Call Trace:
  do_con_write+0x4c/0x1a5f
  con_write+0xa/0x1d
  n_tty_write+0xdb/0x3c5
  tty_write+0x191/0x223
  n_tty_receive_buf+0x8/0x8
  do_loop_readv_writev.part.23+0x58/0x89
  do_iter_write+0x98/0xb1
  vfs_writev+0x62/0x89

take a look at drivers/tty/vt/vt.c do_con_write()


it does a ton of things. why - because we need to scroll the console;
we need to wrap around the lines; we need to process control characters
- like \r or \n and so on and need to modify the console state accordingly;
we need to do UTF8/ASCII/etc. all of this things cannot run concurrently with
IOCTL that modify the font map or resize the console, or flip it, or rotate
it.

take a look at lf() -> con_scroll() -> fbcon_scroll() // drivers/video/fbdev/core/fbcon.c

we also don't want printk() to mess with do_con_write(). including
printk() from IRQ.

6) even more TTY
   I suspect that TTY may be invoked from IRQ.

7) printk() write  (and occasional ksmg_dump dumpers, e.g. arch/um/kernel/kmsg_dump)

   printk() goes through console_unlock()->vt_console_print().
   and it, basically, must handle all the things that TTY write does.
   handle console chars properly, do scrolling, wrapping, etc. and we
   don't want anthing else to jump in and mess with us at this stage.
   that's why we user console_sem in printk.c - to serialize all the
   events... including concurrent printk() from other CPUs. that's why
   we do console_trylock() in vprintk_emit().

8) even more printk()
   printk() can be called from IRQ. console_sem stops it if some of
   the consoles can't work in IRQ context right now.

9) consoles have notifiers

/*
 * We defer the timer blanking to work queue so it can take the console mutex
 * (console operations can still happen at irq time, but only from printk which
 * has the console mutex. Not perfect yet, but better than no locking
 */
static void blank_screen_t(unsigned long dummy)
{
        blank_timer_expired = 1;
        schedule_work(&console_work);
}

so console_sem is also used to, basically, synchronize IRQs/etc.

10) I suspect that some consoles can do things with console_sem from IRQ
   context.

and so on. we really use console_sem as a big-kernel-lock.


so where console_sem users might sleep? in tons of places...

like ark_pci_suspend()   console_lock(); mutex_lock(par);
or ark_pci_resume()       console_lock(); mutex_lock();
or con_install()          console_lock(); vc_allocate() -> kzalloc(GFP_KERNEL)

and so on and on and on.

and then there are paths that do

        mutex_lock(); schedule();
and another CPU does
        console_lock(); mutex_lock();

so it sleeps on mutex, with locked console_sem, and we can't even print
anything. printk() has to start losing messages at some point and nothing
can it help. except for flush on panic -- we don't care about console_sem
there.

printk() on its own can sleep with console_sem locked:
    - preemption in console_unlock() printing loop.

vprintk_emit()
 console_unlock()
  for (;;) {
    call_console_drivers();
    local_irq_restore()
    <<<<< preemption or cond_resched() >>>>>
  }

if the system is not healthy (OOM, etc.) then preemption in
console_unlock() can block printk messages for a very long
time.

====================

	-ss

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ