linux-kernel - Re: buffer write race: Re: [PATCH printk v1 09/18] printk: nobkl: Add print state functions


Message-ID: <ZCV4c+Sbywttsq/v@alley>
Date:   Thu, 30 Mar 2023 13:54:27 +0200
From:   Petr Mladek <pmladek@...e.com>
To:     John Ogness <john.ogness@...utronix.de>
Cc:     Sergey Senozhatsky <senozhatsky@...omium.org>,
        Steven Rostedt <rostedt@...dmis.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        linux-kernel@...r.kernel.org,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>
Subject: Re: buffer write race: Re: [PATCH printk v1 09/18] printk: nobkl:
 Add print state functions

On Wed 2023-03-29 16:39:54, John Ogness wrote:
> On 2023-03-29, Petr Mladek <pmladek@...e.com> wrote:
> >> +/**
> >> + * console_can_proceed - Check whether printing can proceed
> >> + * @wctxt:	The write context that was handed to the write function
> >> + *
> >> + * Returns:	True if the state is correct. False if a handover
> >> + *		has been requested or if the console was taken
> >> + *		over.
> >> + *
> >> + * Must be invoked after the record was dumped into the assigned record
> >> + * buffer
> >
> > The word "after" made me think about possible races when the record
> > buffer is being filled. The owner might lose the lock in a hostile
> > way during this action. And we should prevent using the same buffer
> > while the other owner is still modifying the content.
> >
> > It should be safe when the same buffer might be used only by nested
> > contexts. It does not matter if the outer context finishes writing
> > later. The nested context should not need the buffer anymore.
> >
> > But a problem might happen when the same buffer is shared between
> > multiple non-nested contexts. One context might lose the lock in a
> > hostile way. The other context might get access after the hostile
> > context has released the lock.
> 
> Hostile takeovers _only occur during panic_.
>
> > NORMAL and PANIC contexts are safe. These priorities have only
> > one context and both have their own buffers.
> >
> > A problem might be with EMERGENCY contexts. Each CPU might have
> > its own EMERGENCY context. We might prevent this problem if
> > we do not allow acquiring the lock in EMERGENCY (and NORMAL)
> > context when panic() is running or after the first hostile
> > takeover.
> 
> A hostile takeover means a CPU took ownership with PANIC priority. No
> CPU can steal ownership from the PANIC owner. Once the PANIC owner
> releases ownership, the panic message has been output to the atomic
> consoles. Do we really care what happens after that?

I see. The hostile takeover is allowed only in
cons_atomic_exit(CONS_PRIO_PANIC, prev_prio), which is called at the
very end of panic() before the infinite blinking.

It is true that we do not care at this moment. It is actually called
after "suppress_printk = 1;" so there should not be any
new messages.

Anyway, it would be nice to document this subtle race somewhere.
I can imagine that people might want to risk the hostile
takeover even earlier, which would introduce the race.
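
To make the scenario concrete, here is a minimal userspace sketch of the
guard mentioned above: refuse NORMAL and EMERGENCY acquisition after the
first hostile takeover. It is only a single-threaded model for
documentation purposes. All model_* and MODEL_PRIO_* names are
hypothetical and are not part of the patch set; the real code would have
to update the state atomically.

```c
#include <stdbool.h>

/* Hypothetical priorities, loosely mirroring NORMAL/EMERGENCY/PANIC. */
enum model_prio {
	MODEL_PRIO_NONE,
	MODEL_PRIO_NORMAL,
	MODEL_PRIO_EMERGENCY,
	MODEL_PRIO_PANIC,
};

/* Ownership gates access to the (possibly shared) record buffer. */
struct model_console {
	int owner_cpu;		/* -1 when nobody owns the console */
	enum model_prio prio;	/* priority of the current owner */
	bool hostile_seen;	/* set by the first hostile takeover */
};

static void model_init(struct model_console *con)
{
	con->owner_cpu = -1;
	con->prio = MODEL_PRIO_NONE;
	con->hostile_seen = false;
}

/*
 * Friendly acquire: succeeds only when the console is unowned.
 * Assumed guard: contexts below PANIC priority are refused once
 * a hostile takeover has happened.
 */
static bool model_try_acquire(struct model_console *con, int cpu,
			      enum model_prio prio)
{
	if (con->hostile_seen && prio < MODEL_PRIO_PANIC)
		return false;
	if (con->owner_cpu != -1)
		return false;
	con->owner_cpu = cpu;
	con->prio = prio;
	return true;
}

/* Hostile takeover: PANIC priority only, never from a PANIC owner. */
static bool model_hostile_takeover(struct model_console *con, int cpu)
{
	if (con->owner_cpu != -1 && con->prio >= MODEL_PRIO_PANIC)
		return false;
	con->owner_cpu = cpu;
	con->prio = MODEL_PRIO_PANIC;
	con->hostile_seen = true;
	return true;
}

/* Release is a no-op when the caller is no longer the owner. */
static void model_release(struct model_console *con, int cpu)
{
	if (con->owner_cpu == cpu) {
		con->owner_cpu = -1;
		con->prio = MODEL_PRIO_NONE;
	}
}

/* Model of the "can proceed" check: is this CPU still the owner? */
static bool model_can_proceed(const struct model_console *con, int cpu)
{
	return con->owner_cpu == cpu;
}

/* Replays the race scenario; true when the guard closes the window. */
static bool model_race_window_closed(void)
{
	struct model_console con;

	model_init(&con);
	/* CPU0 owns the console and may be filling the EMERGENCY buffer. */
	if (!model_try_acquire(&con, 0, MODEL_PRIO_EMERGENCY))
		return false;
	/* CPU2 panics and takes over in a hostile way. */
	if (!model_hostile_takeover(&con, 2))
		return false;
	/* CPU0 lost ownership but might still be in the middle of a write. */
	if (model_can_proceed(&con, 0))
		return false;
	model_release(&con, 2);
	/* Without the guard, CPU1 could now reuse the shared buffer. */
	return !model_try_acquire(&con, 1, MODEL_PRIO_EMERGENCY);
}
```

In this model, model_race_window_closed() returns true only because of
the hostile_seen check. Dropping that check lets CPU1 acquire the
console while CPU0 might still be writing into the shared buffer,
which is exactly the race that would appear if the hostile takeover
were allowed earlier in panic().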

Best Regards,
Petr