lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <YiuX8rFkuGpn7gqv@alley>
Date:   Fri, 11 Mar 2022 19:41:54 +0100
From:   Petr Mladek <pmladek@...e.com>
To:     John Ogness <john.ogness@...utronix.de>
Cc:     Sergey Senozhatsky <senozhatsky@...omium.org>,
        Steven Rostedt <rostedt@...dmis.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        linux-kernel@...r.kernel.org,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>
Subject: Re: [PATCH printk v1 11/13] printk: reimplement console_lock for
 proper kthread support

On Fri 2022-03-11 14:34:40, John Ogness wrote:
> On 2022-03-11, Petr Mladek <pmladek@...e.com> wrote:
> >>     console_unlock()
> >>     {
> >>  	  [...]
> >>  	  if (may_schedule)
> >>  	      retry = console_lock_reacquire();
> >>  	  else
> >>  	      retry = console_trylock();
> >>     }
> >> 
> 
> [...]
> 
> > OK, it means that the main problem here _is not_ the scheduling context,
> > console_lock() vs. console_trylock(). The main problem _is_ the direct
> > printing vs. the offload to kthreads.
> >
> > Of course, the context is important. It affects how we could re-take
> > the lock. But the main problem is the printing mode. We must make sure
> > that:
> >
> >     1. someone is printing pending messages when the direct mode is needed
> 
> console_trylock() causes difficulties here because it will fail if any
> kthread is active. It is an example of direct mode failure. But is that
> really any different than current mainline console_trylock() failing
> because a console_lock() context is active (and possibly not scheduled
> on a CPU)?

I see. Well, console_lock() does not mean that we are in direct mode.
Yes, it stops kthreads. But it might be called by register_console()
or tty driver. console_unlock() might do nothing when
allow_direct_printing() returns false in console_flush_all().


> >     2. kthreads are woken and can enter the printing mode when the direct
> >        mode is disabled.
> 
> This happens at the end of vprintk_emit() and within __console_unlock(),
> regardless if the printk() was running in direct mode or not.
> 
> > Will console_lock_reacquire() really help here?
> >
> > The API theoretically helps in direct mode when the lock was taken
> > via console_lock().
> 
> console_lock_reacquire() only exists for the console_lock() case.
> 
> > But it does not help when the lock was taken
> > via console_trylock() from printk(). It might mean that
> > the forward progress might not be guaranteed in the direct mode
> > (early boot, panic, ...).
> 
> How is the console_trylock() case different from current mainline now?
> As I mentioned above, the kthreads can block console_trylock(), but so
> can a console_lock() currently in mainline.

It is very different. Failing console_trylock() in the current mainline
means that someone else is responsible for printing all the pending
messages. All should be on the safe side.

Failing console_trylock() in this patch means that some printk
kthread is active but nothing else. The kthread might handle one
message and then go into sleep because someone requested the
direct mode using atomic_inc(). Other kthreads might be sleeping
and ignore everything. The failing console_trylock() will mean
that nobody will handle the rest of the messages. The race
is described in the other reply.

Honestly, handling console_trylock() case looks more important to me
because it is used by printk(). It is the main usecase. While
console_lock() code path should be used rarely, during
console registration. Well, it is used also by tty code but
I am not sure how often it is really called there.

> > Hmm, the forward progress seems to be guaranteed in the direct
> > mode most of the time. console_trylock() can take over
> > the atomic counter because console kthreads are not allowed
> > to enter the printing mode in this case.
> >
> > I used "most of the time" because there might be races when
> > the mode is switched. "printk_direct" is an atomic variable.
> > CON_DIRECT is set under con->mutex but console_trylock()
> > does not take the mutex...
> 
> You are mixing issues here. If CON_DIRECT is set, there is already a
> console_lock() in progress, so console_trylock() fails on @console_sem.

Yes, I am sorry for the confusion. I forgot the meaning of the flag.
I thought that it was set when the direct mode was requested.
But it is not true. It only means that console_lock is taken
and it could do the direct printing.

It would make sense to rename it. It might be part of all the
confusion here. The name creates false feeling about that
the direct mode is requested. I think that I have actually
suggested renaming the flag in an earlier mail.

Best Regards,
Petr

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ