Message-ID: <c9c18419-ff32-6c87-195a-923ebb28c466@i-love.sakura.ne.jp>
Date: Thu, 9 Dec 2021 22:18:36 +0900
From: Tetsuo Handa <penguin-kernel@...ove.sakura.ne.jp>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: "Fabio M. De Francesco" <fmdefrancesco@...il.com>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Jiri Slaby <jirislaby@...nel.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
syzbot+5f47a8cea6a12b77a876@...kaller.appspotmail.com,
Marco Elver <elver@...gle.com>,
Max Filippov <jcmvbkbc@...il.com>,
David Sterba <dsterba@...e.com>,
Bhaskar Chowdhury <unixbhaskar@...il.com>,
nick black <dankamongmen@...il.com>,
Igor Matheus Andrade Torrente <igormtorrente@...il.com>
Subject: Re: [PATCH] tty: n_hdlc: make n_hdlc_tty_wakeup() asynchronous
On 2021/12/07 3:07, Linus Torvalds wrote:
> On Mon, Dec 6, 2021 at 3:45 AM Tetsuo Handa
> <penguin-kernel@...ove.sakura.ne.jp> wrote:
>>
>> Linus suspected that "struct tty_ldisc"->ops->write_wakeup() must not
>> sleep, and Jiri confirmed it from include/linux/tty_ldisc.h. Thus, defer
>> n_hdlc_send_frames() from n_hdlc_tty_wakeup() to a WQ context like
>> net/nfc/nci/uart.c does.
>
> Thanks, this looks good to me.
>
> That said, I think there's pretty much the *exact* same pattern in
>
> drivers/net/caif/caif_serial.c:
> write_wakeup() causes "handle_tx()", which then calls tty->ops->write().
>
> drivers/net/hamradio/mkiss.c
> mkiss_write_wakeup() -> tty->ops->write()
>
> drivers/tty/n_gsm.c:
> gsmld_write_wakeup -> gsm_data_kick() -> gsmld_output ->
> gsm->tty->ops->write()
>
> so this does seem to be a common bug pattern for code that has never
> really seen a lot of testing.
Indeed.
>
> The core tty stuff seems to get it right, but maybe I missed something
> in my quick "grep and look for patterns".
handle_tx() in caif_serial.c has a line
	/* skb_peek is safe because handle_tx is called after skb_queue_tail */
and I think that this comment is true only when handle_tx() is called from
"struct net_device_ops"->ndo_start_xmit (== caif_xmit()). If handle_tx() is
called from "struct tty_ldisc_ops"->write_wakeup (== ldisc_tx_wakeup()),
handle_tx() might be called before skb_queue_tail() is called?
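To illustrate the ordering concern, here is a minimal userspace sketch, not the actual caif_serial.c code. The names struct pkt, struct pkt_queue, queue_tail(), queue_peek() and model_handle_tx() are hypothetical stand-ins for the skb types and helpers. It only shows that a peek on an empty queue yields NULL, so a handler that can be reached from a wakeup path has to tolerate the "nothing queued yet" case.

```c
#include <stddef.h>

/* Hypothetical userspace stand-ins for sk_buff and sk_buff_head. */
struct pkt {
	struct pkt *next;
	int len;
};

struct pkt_queue {
	struct pkt *head;
	struct pkt *tail;
};

/* Append a packet to the end of the queue. */
static void queue_tail(struct pkt_queue *q, struct pkt *p)
{
	p->next = NULL;
	if (q->tail)
		q->tail->next = p;
	else
		q->head = p;
	q->tail = p;
}

/* Return the first packet without removing it, or NULL if the queue is empty. */
static struct pkt *queue_peek(const struct pkt_queue *q)
{
	return q->head;
}

/*
 * Model of a tx handler that may run before anything was enqueued.
 * Returns the length of the packet it would write, or 0 if none is queued.
 */
static int model_handle_tx(struct pkt_queue *q)
{
	struct pkt *p = queue_peek(q);

	if (!p)
		return 0;	/* wakeup arrived before any enqueue */
	return p->len;
}
```

In this model, calling model_handle_tx() on an empty queue returns 0 instead of dereferencing a NULL pointer. Whether the real handle_tx() needs additional care here is exactly the question above.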
>
> So I think this patch is good, but I do wonder if perhaps we should
> move the "work_struct" into the tty layer itself, and do the whole
> "schedule_work()" at that level.
I don't know about net_device_ops, but from a synchronization point of view,
	ser = tty->disc_data;
	BUG_ON(ser == NULL);
	WARN_ON(ser->tty != tty);
in ldisc_tx_wakeup() makes me uneasy, and I don't expect that moving the
"work_struct" into the tty layer itself would make ldisc_tx_wakeup() synchronize
safely. That is, I think we somehow need to fix caif_serial.c after all.
>
> Some code never wants it (most notably the regular n_tty one), but at
> least n_tty doesn't really care, I suspect. n_tty is using
> write_wakeup() literally just for fasync handling, so I suspect it's
> not exactly going to be performance-critical.
>
> Of course, maybe the fix is to just fix caif_serial/mkiss and n_gsm.
> Or mark them broken - does anybody use them?
I think that fixing the individual drivers sounds like the safer choice.
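For reference, the per-driver deferral pattern under discussion can be sketched in userspace roughly as follows. This is only a model under stated assumptions, not the n_hdlc patch itself: struct model_ldisc, model_write_wakeup() and model_tx_work() are hypothetical names, and the atomic flag merely stands in for what schedule_work() on a "work_struct" would do in the kernel.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model of a line discipline's private state. */
struct model_ldisc {
	atomic_bool work_pending;	/* stands in for a queued work_struct */
	int frames_queued;		/* frames waiting to be sent */
	int frames_sent;		/* frames handed to the write path */
};

/*
 * Stands in for ->write_wakeup(): it must not sleep, so it only
 * records that work is needed and returns immediately.
 */
static void model_write_wakeup(struct model_ldisc *d)
{
	atomic_store(&d->work_pending, true);
}

/*
 * Stands in for the work function, which runs later in a context
 * where calling a write path that may sleep is allowed.
 */
static void model_tx_work(struct model_ldisc *d)
{
	/* Do nothing unless a wakeup was recorded since the last run. */
	if (!atomic_exchange(&d->work_pending, false))
		return;

	while (d->frames_queued > 0) {
		d->frames_queued--;
		d->frames_sent++;
	}
}
```

The point of the split is that the wakeup callback never touches the write path directly; all sending happens in the deferred function.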