Message-ID: <5d055fbd-e94a-fe54-d3e0-982dc455ed1a@i-love.sakura.ne.jp>
Date: Fri, 3 Dec 2021 21:32:21 +0900
From: Tetsuo Handa <penguin-kernel@...ove.sakura.ne.jp>
To: "Fabio M. De Francesco" <fmdefrancesco@...il.com>,
Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Peter Hurley <peter@...leysoftware.com>,
Jiri Slaby <jirislaby@...nel.org>,
LKML <linux-kernel@...r.kernel.org>,
Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [PATCH] tty: vt: make do_con_write() no-op if IRQ is disabled
On 2021/12/03 20:00, Fabio M. De Francesco wrote:
> On Thursday, December 2, 2021 7:35:16 PM CET Linus Torvalds wrote:
>> On Thu, Dec 2, 2021 at 7:41 AM Tetsuo Handa
>> <penguin-kernel@...ove.sakura.ne.jp> wrote:
>>>
>>>> Looking at the backtrace, I see
>>>>
>>>> n_hdlc_send_frames+0x24b/0x490 drivers/tty/n_hdlc.c:290
>>>> tty_wakeup+0xe1/0x120 drivers/tty/tty_io.c:534
>>>> __start_tty drivers/tty/tty_io.c:806 [inline]
>>>> __start_tty+0xfb/0x130 drivers/tty/tty_io.c:799
>>>>
>>>> and apparently it's that hdlc line discipline (and
>>>> n_hdlc_send_frames() in particular) that is the problem here.
>>>>
>>>> I think that's where the fix should be.
>>>
>>> Do you mean that we should change the behavior of n_hdlc_send_frames()
>>> rather than trying to make __start_tty() schedulable again?
>>
>> I wouldn't change n_hdlc_send_frames() itself. It does what it says it does.
>>
>> But n_hdlc_tty_wakeup() probably shouldn't call it directly. Other tty
>> line disciplines don't do that kind of thing - although I only looked
>> at a couple. They all seem to just set bits and prepare things. Like a
>> wakeup function should do.
>>
>> So I think n_hdlc_tty_wakeup() should perhaps only do a
>> "schedule_work()" or similar to get that n_hdlc_send_frames() started,
>> rather than doing it itself.
>>
>> Example: net/nfc/nci/uart.c. It does that
>>
>> schedule_work(&nu->write_work);
>>
>> instead of actually trying to do a write from a wakeup routine
>> (similar examples in ppp - "tasklet_schedule(&ap->tsk)" etc).
>>
>> I mean, it's called "wakeup", not "write". So I think the fundamental
>> confusion here is in hdlc, not the tty layer.
>>
>> Linus
>>
OK.
> This is what I understand from the above argument: do a schedule_work() to get
> that n_hdlc_send_frames() started; in this way, n_hdlc_tty_wakeup() can
> return to the caller and n_hdlc_send_frames() is executed asynchronously
> (i.e., no longer in an atomic context).
Yes. If we copy what net/nfc/nci/uart.c does, the fix would look like:
--------------------
 drivers/tty/n_hdlc.c | 22 +++++++++++++++++++++-
 1 file changed, 21 insertions(+), 1 deletion(-)

diff --git a/drivers/tty/n_hdlc.c b/drivers/tty/n_hdlc.c
index 7e0884ecc74f..a71fcac60925 100644
--- a/drivers/tty/n_hdlc.c
+++ b/drivers/tty/n_hdlc.c
@@ -140,6 +140,8 @@ struct n_hdlc {
 	struct n_hdlc_buf_list	rx_buf_list;
 	struct n_hdlc_buf_list	tx_free_buf_list;
 	struct n_hdlc_buf_list	rx_free_buf_list;
+	struct work_struct	write_work;
+	struct tty_struct	*tty_for_write_work;
 };
 
 /*
@@ -210,6 +212,8 @@ static void n_hdlc_tty_close(struct tty_struct *tty)
 	wake_up_interruptible(&tty->read_wait);
 	wake_up_interruptible(&tty->write_wait);
 
+	cancel_work_sync(&n_hdlc->write_work);
+
 	n_hdlc_free_buf_list(&n_hdlc->rx_free_buf_list);
 	n_hdlc_free_buf_list(&n_hdlc->tx_free_buf_list);
 	n_hdlc_free_buf_list(&n_hdlc->rx_buf_list);
@@ -334,6 +338,20 @@ static void n_hdlc_send_frames(struct n_hdlc *n_hdlc, struct tty_struct *tty)
 		goto check_again;
 }	/* end of n_hdlc_send_frames() */
 
+/**
+ * n_hdlc_tty_write_work - Asynchronous callback for transmit wakeup
+ * @work: pointer to work_struct
+ *
+ * Called when low level device driver can accept more send data.
+ */
+static void n_hdlc_tty_write_work(struct work_struct *work)
+{
+	struct n_hdlc *n_hdlc = container_of(work, struct n_hdlc, write_work);
+	struct tty_struct *tty = n_hdlc->tty_for_write_work;
+
+	n_hdlc_send_frames(n_hdlc, tty);
+}	/* end of n_hdlc_tty_write_work() */
+
 /**
  * n_hdlc_tty_wakeup - Callback for transmit wakeup
  * @tty: pointer to associated tty instance data
@@ -344,7 +362,8 @@ static void n_hdlc_tty_wakeup(struct tty_struct *tty)
 {
 	struct n_hdlc *n_hdlc = tty->disc_data;
 
-	n_hdlc_send_frames(n_hdlc, tty);
+	n_hdlc->tty_for_write_work = tty;
+	schedule_work(&n_hdlc->write_work);
 }	/* end of n_hdlc_tty_wakeup() */
 
 /**
@@ -706,6 +725,7 @@ static struct n_hdlc *n_hdlc_alloc(void)
 	if (!n_hdlc)
 		return NULL;
 
+	INIT_WORK(&n_hdlc->write_work, n_hdlc_tty_write_work);
 	spin_lock_init(&n_hdlc->rx_free_buf_list.spinlock);
 	spin_lock_init(&n_hdlc->tx_free_buf_list.spinlock);
 	spin_lock_init(&n_hdlc->rx_buf_list.spinlock);
--------------------
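For comparison, this is roughly the pattern Linus pointed at in net/nfc/nci/uart.c
(paraphrased from memory and simplified; the real code may differ in detail):
--------------------
/* Sketch of nci_uart_tty_wakeup(): no writing here, just kick the worker. */
static void nci_uart_tty_wakeup(struct tty_struct *tty)
{
	struct nci_uart *nu = (void *)tty->disc_data;

	if (!nu)
		return;

	schedule_work(&nu->write_work);
}
--------------------
The wakeup callback only records that more data can be sent; the actual write
happens later in process context, where sleeping (and taking console_lock())
is allowed.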
>
> I hope that I'm not missing something. If the above summary is correct,
> please forgive a newbie for the following questions...
>
> Commit f9e053dcfc02 ("tty: Serialize tty flow control changes with flow_lock")
> has introduced spinlocks to serialize flow control changes and avoid the
> concurrent executions of __start_tty() and __stop_tty().
>
> This is an excerpt from the above-mentioned commit:
>
> ->
> Introduce tty->flow_lock spinlock to serialize tty flow control changes.
> Split out unlocked __start_tty()/__stop_tty() flavors for use by
> ioctl(TCXONC) in follow-on patch.
> <-
>
> This is the reason why we are dealing with this bug. Currently we have __start_tty()
> called with an acquired spinlock and IRQs disabled, and the call chain leads to
> console_lock() while in atomic context.
If we hit the race window described in that commit:
CPU 0 | CPU 1
stop_tty() |
lock ctrl_lock |
tty->stopped = 1 |
unlock ctrl_lock |
| start_tty()
| lock ctrl_lock
| tty->stopped = 0
| unlock ctrl_lock
| driver->start()
driver->stop() |
In this case, the flow control state now indicates the tty has
been started, but the actual hardware state has actually been stopped.
That is, the tty->stopped flag remains 0 even though driver->stop() is called
after driver->start() finished: tty->stopped (the flow control state) says
"not stopped" while the actual hardware state is "stopped".
>
> In summation, my questions are...
>
> 1) Why do we still need to protect __start_tty() and __stop_tty() with spin_lock_irq()
> if the solution to the bug is to execute n_hdlc_send_frames() asynchronously?
Without serialization, the tty->stopped flag and the actual hardware state can
end up mismatched, exactly as in the race above.
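For reference, the serialization looks roughly like this (simplified from
drivers/tty/tty_io.c; the field names, e.g. flow.lock/flow.stopped versus the
older flow_lock/stopped, vary across kernel versions):
--------------------
/* Simplified sketch of the tty core's flow control serialization. */
void __stop_tty(struct tty_struct *tty)
{
	if (tty->flow.stopped)
		return;
	tty->flow.stopped = true;
	if (tty->ops->stop)
		tty->ops->stop(tty);
}

void stop_tty(struct tty_struct *tty)
{
	unsigned long flags;

	/* Flag update and driver callback happen under one spinlock. */
	spin_lock_irqsave(&tty->flow.lock, flags);
	__stop_tty(tty);
	spin_unlock_irqrestore(&tty->flow.lock, flags);
}
--------------------
Because the flag update and the driver->stop()/driver->start() call happen
under the same lock, the interleaving quoted from the commit message above can
no longer occur.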
>
> 2) If it is true that we need to avoid concurrent executions of __start_tty() and
> __stop_tty(), can we just use a Mutex in the IOCTL's helper?
Yes, if all __start_tty() and __stop_tty() callers ran in schedulable context.
But the comment on stop_tty() says that it might be called from atomic context.
Thus, we can't use a mutex to protect the tty->stopped flag.
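To illustrate why (a minimal sketch, not actual kernel code; the names here
are made up):
--------------------
static DEFINE_SPINLOCK(flow_lock);	/* usable from atomic context */
static DEFINE_MUTEX(flow_mutex);	/* NOT usable from atomic context */

static void flow_change(void)
{
	unsigned long flags;

	/*
	 * mutex_lock(&flow_mutex) may sleep, so it would be a bug if this
	 * function can be reached with IRQs disabled or a spinlock held.
	 * A spinlock that disables IRQs is safe in both contexts.
	 */
	spin_lock_irqsave(&flow_lock, flags);
	/* ... update the stopped flag and call the driver hooks ... */
	spin_unlock_irqrestore(&flow_lock, flags);
}
--------------------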
>
> Thanks,
>
> Fabio M. De Francesco
By the way, even with the above patch, I think the following race can happen:
CPU 0 | CPU 1 | CPU 2
stop_tty() | |
lock flow.lock | |
tty->stopped = 1 | |
driver->stop() | |
unlock flow.lock | |
| start_tty() |
| lock flow.lock |
| tty->stopped = 0 |
| driver->start() => Schedules n_hdlc_send_frames()
| unlock flow.lock |
stop_tty() | |
lock flow.lock | |
tty->stopped = 1 | |
driver->stop() | |
unlock flow.lock | |
| | Starts n_hdlc_send_frames()
That is, n_hdlc keeps writing to the console even though tty->stopped is 1,
until n_hdlc_send_frames() completes.
Then, is it even possible to schedule the next n_hdlc_send_frames() while the
previous n_hdlc_send_frames() is still running? In the worst case, could
multiple CPUs run n_hdlc_send_frames() concurrently?
CPU 0 | CPU 1 | CPU 2 | CPU 3
stop_tty() | | |
lock flow.lock | | |
tty->stopped = 1 | | |
driver->stop() | | |
unlock flow.lock | | |
| start_tty() | |
| lock flow.lock | |
| tty->stopped = 0 | |
| driver->start() => Schedules n_hdlc_send_frames()
| unlock flow.lock | |
| | Starts n_hdlc_send_frames()
stop_tty() | | |
lock flow.lock | | |
tty->stopped = 1 | | |
driver->stop() | | |
unlock flow.lock | | |
| start_tty() | |
| lock flow.lock | |
| tty->stopped = 0 | |
| driver->start() => Schedules n_hdlc_send_frames()
| unlock flow.lock | |
| | | Starts n_hdlc_send_frames()
Ah, OK. n_hdlc->tbusy is there for serialization.
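For reference, the guard looks roughly like this (paraphrased from
drivers/tty/n_hdlc.c and simplified):
--------------------
static void n_hdlc_send_frames(struct n_hdlc *n_hdlc, struct tty_struct *tty)
{
	unsigned long flags;

	/* Only one sender at a time; late wakeups are recorded, not run. */
	spin_lock_irqsave(&n_hdlc->tx_buf_list.spinlock, flags);
	if (n_hdlc->tbusy) {
		n_hdlc->woke_up = true;
		spin_unlock_irqrestore(&n_hdlc->tx_buf_list.spinlock, flags);
		return;
	}
	n_hdlc->tbusy = true;
	spin_unlock_irqrestore(&n_hdlc->tx_buf_list.spinlock, flags);

	/* ... send queued buffers; before returning, clear tbusy and
	   restart if woke_up was set in the meantime ... */
}
--------------------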