Message-ID: <ZiE2VO9Q03TvHJ_t@pathway.suse.cz>
Date: Thu, 18 Apr 2024 17:03:48 +0200
From: Petr Mladek <pmladek@...e.com>
To: John Ogness <john.ogness@...utronix.de>
Cc: Sergey Senozhatsky <senozhatsky@...omium.org>,
	Steven Rostedt <rostedt@...dmis.org>,
	Thomas Gleixner <tglx@...utronix.de>, linux-kernel@...r.kernel.org,
	Greg Kroah-Hartman <gregkh@...uxfoundation.org>
Subject: Re: [PATCH printk v4 06/27] printk: nbcon: Add callbacks to
 synchronize with driver

On Thu 2024-04-18 14:16:16, John Ogness wrote:
> On 2024-04-18, Petr Mladek <pmladek@...e.com> wrote:
> > I am not sure how it is done in other parts of kernel code where
> > RT needed to introduce some tricks. But I think that we should
> > really start mentioning RT behavior in the commit messages and
> > and comments where the RT mode makes huge changes.
> 
> Yes, our motivation is RT. But these semantics are not RT-specific. They
> apply to the general kernel locking model.

Yes, but RT is a nice example where it is clear what we want to achieve.
IMHO, a clear example is always better than a scientific formulation
where every word might be important. Especially when different people
might understand some words in different ways.


> For example, even for a !RT system, it is semantically incorrect to
> take a spin_lock while holding a raw_spin_lock.

Really? I was not aware of that. I know that lockdep complains even
in a non-RT configuration. But I expected that this only helps
to catch potential problems when the same code is used with
RT enabled.

Is there any difference between spin_lock() and raw_spin_lock()
when RT is disabled? I do not see any. This is from
include/linux/spinlock.h:

	/* Non PREEMPT_RT kernel, map to raw spinlocks: */
	#ifndef CONFIG_PREEMPT_RT
	[...]
	static __always_inline void spin_lock(spinlock_t *lock)
	{
		raw_spin_lock(&lock->rlock);
	}

Would the raw_spin_lock() API exist without CONFIG_PREEMPT_RT?

Maybe you do not understand what I am suggesting. Let's talk about
particular comments in the code.


> In the full PREEMPT_RT series I have tried to be careful about only
> mentioning PREEMPT_RT when it is really PREEMPT_RT-specific. For example
> [0][1][2].
> 
> [0] https://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-rt-devel.git/commit/?h=linux-6.9.y-rt-rebase&id=1564af55a92c32fe215af35cf55cb9359c5fff30
> 
> [1] https://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-rt-devel.git/commit/?h=linux-6.9.y-rt-rebase&id=033b416ad25b17dc60d5f71c1a0b33a5fbc17639
> 
> [2] https://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-rt-devel.git/commit/?h=linux-6.9.y-rt-rebase&id=7929ba9e5c110148a1fcd8bd93d6a4eff37aa265
> 
> > The race could NOT happen in:
> >
> >    + NBCON_PRIO_PANIC context because it does not schedule
> 
> Yes.
> 
> >    + NBCON_PRIO_EMERGENCY context because we explicitly disable
> >      preemption there
> 
> Yes.
> 
> >    + NBCON_NORMAL_PRIO context when we ALWAYS do nbcon_try_acquire()
> >      under con->device() lock. Here the con->device_lock() serializes
> >      nbcon_try_acquire() calls even between running tasks.
> 
> The nbcon_legacy_emit_next_record() printing as NBCON_NORMAL_PRIO is a
> special situation where write_atomic() is used. It is safe because it
> disables hard interrupts and is never called from NMI context.
> 
> nbcon_atomic_flush_pending() as NBCON_NORMAL_PRIO is safe in !NMI
> because it also disables hard interrupts. However,
> nbcon_atomic_flush_pending() could be called in NMI with
> NBCON_NORMAL_PRIO. I need to think about this case.

It is safe. The race scenario requires _double_ scheduling (A->B->A):

 1. [CPU 0] process A acquires the context and is scheduled out

 2. [CPU 1] the nbcon context is taken over and released in emergency

 3. [CPU 0] process B acquires the context and is scheduled out

 4. [CPU 0] process A thinks that it still owns the context
	    and continues where it left off


This cannot happen with the current code when:

   + nbcon_try_acquire() is serialized by con->device_lock()
     because process B would get blocked on this lock.

   + nbcon_try_acquire() is called in atomic context
     because the context is always released before scheduling.
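
To make the A->B->A scenario and the first condition more concrete,
here is a rough user-space model. It is only a sketch, not kernel code:
all model_* names are hypothetical, and it assumes that ownership is
recognized only by the (CPU, priority) pair and that an emergency
context can always take over from a normal one.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model. None of the model_* names are kernel API. */
enum model_prio { MODEL_PRIO_NONE, MODEL_PRIO_NORMAL, MODEL_PRIO_EMERGENCY };

/* Console ownership as seen by everyone: owning CPU and priority. */
struct model_console {
	int cpu;			/* -1 when unowned */
	enum model_prio prio;
	bool device_lock_held;		/* stands in for con->device_lock() */
};

/* What a context remembers about its own acquisition. */
struct model_ctxt {
	int cpu;
	enum model_prio prio;
};

/* Succeeds when the console is unowned or owned with a lower priority. */
static bool model_try_acquire(struct model_console *con,
			      const struct model_ctxt *ctxt)
{
	if (con->prio >= ctxt->prio)
		return false;
	con->cpu = ctxt->cpu;
	con->prio = ctxt->prio;
	return true;
}

/* Assumption: ownership is recognized only by (CPU, priority). */
static bool model_owner_matches(const struct model_console *con,
				const struct model_ctxt *ctxt)
{
	return con->cpu == ctxt->cpu && con->prio == ctxt->prio;
}

/* Releases the console only when the caller is the current owner. */
static void model_release(struct model_console *con,
			  const struct model_ctxt *ctxt)
{
	if (!model_owner_matches(con, ctxt))
		return;
	con->cpu = -1;
	con->prio = MODEL_PRIO_NONE;
}

/*
 * Replays steps 1-4. Returns true when process A wrongly believes that
 * it still owns the console although process B has acquired it.
 */
static bool model_race(bool use_device_lock)
{
	struct model_console con = { .cpu = -1, .prio = MODEL_PRIO_NONE };
	struct model_ctxt a = { .cpu = 0, .prio = MODEL_PRIO_NORMAL };
	struct model_ctxt b = { .cpu = 0, .prio = MODEL_PRIO_NORMAL };
	struct model_ctxt emerg = { .cpu = 1, .prio = MODEL_PRIO_EMERGENCY };
	bool acquired, b_owns = false;

	/* 1. A acquires on CPU 0 and is scheduled out. */
	con.device_lock_held = use_device_lock;
	acquired = model_try_acquire(&con, &a);
	assert(acquired);

	/* 2. CPU 1 takes over in emergency and releases. */
	acquired = model_try_acquire(&con, &emerg);
	assert(acquired);
	model_release(&con, &emerg);

	/* 3. B runs on CPU 0; it blocks if A still holds the device lock. */
	if (!con.device_lock_held)
		b_owns = model_try_acquire(&con, &b);

	/* 4. A is scheduled in again and checks ownership. */
	return b_owns && model_owner_matches(&con, &a);
}
```

In this model, model_race(false) returns true because B's acquisition
looks identical to A's. model_race(true) returns false because B never
gets to acquire while A holds the device lock.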


I would say that this is far from obvious and we really need
to document this somehow. I would mention these details above
nbcon_context_try_acquire().

Best Regards,
Petr
