linux-kernel - Re: [PATCH] tty: vt: make do_con_write() no-op if IRQ is disabled

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <2064700.cDd5PexU1D@localhost.localdomain>
Date:   Fri, 03 Dec 2021 15:51:58 +0100
From:   "Fabio M. De Francesco" <fmdefrancesco@...il.com>
To:     Linus Torvalds <torvalds@...ux-foundation.org>,
        Tetsuo Handa <penguin-kernel@...ove.sakura.ne.jp>
Cc:     Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        Peter Hurley <peter@...leysoftware.com>,
        Jiri Slaby <jirislaby@...nel.org>,
        LKML <linux-kernel@...r.kernel.org>,
        Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [PATCH] tty: vt: make do_con_write() no-op if IRQ is disabled

On Friday, December 3, 2021 1:32:21 PM CET Tetsuo Handa wrote:
> On 2021/12/03 20:00, Fabio M. De Francesco wrote:
> > On Thursday, December 2, 2021 7:35:16 PM CET Linus Torvalds wrote:
> >> [...]
> >> Example: net/nfc/nci/uart.c. It does that
> >>
> >>         schedule_work(&nu->write_work);
> >>
> >> instead of actually trying to do a write from a wakeup routine
> >> (similar examples in ppp - "tasklet_schedule(&ap->tsk)" etc).
> >>
> >> I mean, it's called "wakeup", not "write". So I think the fundamental
> >> confusion here is in hdlc, not the tty layer.
> >>
> >>               Linus
> >>
> 
> OK.
> 
> > This is what I understand from the above argument: do a schedule_work() to get 
> > that n_hdlc_send_frames() started; in this way, n_hdlc_tty_wakeup() can
> > return to the caller and n_hdlc_send_frames() is executed asynchronously 
> > (i.e., no longer in an atomic context). 
> 
> Yes. 

Thanks for confirming :)

> [...]
>
> > [...]
> > 
> > Commit f9e053dcfc02 ("tty: Serialize tty flow control changes with flow_lock") 
> > has introduced spinlocks to serialize flow control changes and avoid the 
> > concurrent executions of __start_tty() and __stop_tty().
> > 
> > [...]
> >
> > This is the reason why we are dealing with this bug. Currently we have __start_tty() 
> > called with an acquired spinlock and IRQs disabled and the calls chain leads to 
> > console_lock() while in atomic context.
> 
> If we hit a race window described in that commit
> 
>     CPU 0                          | CPU 1
>     stop_tty()                     |
>       lock ctrl_lock               |
>       tty->stopped = 1             |
>       unlock ctrl_lock             |
>                                    | start_tty()
>                                    |   lock ctrl_lock
>                                    |   tty->stopped = 0
>                                    |   unlock ctrl_lock
>                                    |   driver->start()
>       driver->stop()               |
> 
>     In this case, the flow control state now indicates the tty has
>     been started, but the actual hardware state has actually been stopped.
> 
> , the tty->stopped flag remains 0 despite driver->stop() is called after
> driver->start() finished. tty->stopped (the flow control state) says "not stopped"
> but the actual hardware state is "stopped".

This is clear and it is the reason why I cited that commit. I understand that
__stop_tty() and __start_tty() cannot run concurrently.

I'm sorry if my poor abilities to express complex thoughts in English may 
have made you to think that I'm not aware of the real race condition issue 
that commit f9e053dcfc02 is trying to address :( 

Unfortunately, by fixing the SAC bug with a mere asynchronous execution of
n_hdlc_send_frames() in a work queue, we may still have different sources of 
race conditions. 

The point is that, when n_hdlc_tty_wakeup() returns to the IOCTL helper (the caller)
that acquired the spinlock, that same spinlock is immediately released without 
even knowing whether or not n_hdlc_send_frames() has had the chance to run 
and complete. Further __start_tty() or __stop_tty() calls are allowed to acquire that
spinlock again and we lose serialization because they still run concurrently. 

Actually, in the final part of your email you say that more than one instance of 
__start_tty() cannot run because of a variable that checks that another thread is 
still running the same code. I haven't yet checked, however you showed other
steps that can lead to races and I agree that the problem is not yet fixed.

This is why I think that n_hdlc_send_frames(), when runs asynchronously with
the change that Linus suggested, should also signal completions or change a 
a condition variable. 

Still using locks locks where it is needed, we should somehow use completions 
or condition variables to enforce an execution alternation between __stop_tty 
and __start_tty.

Either:

1) wait_for_completion() and complete();

Or:

2) wait_event(tty_event, tty->flow.tco_stopped == true) and 
wait_event(tty_event, tty->flow.tco_stopped == false) before entering respectively
__start_tty() and __stop_tty().

Regards,

Fabio M. De Francesco