Message-ID: <1535560580.23560.65.camel@arista.com>
Date: Wed, 29 Aug 2018 17:36:20 +0100
From: Dmitry Safonov <dima@...sta.com>
To: Jiri Slaby <jslaby@...e.cz>, linux-kernel@...r.kernel.org
Cc: Daniel Axtens <dja@...ens.net>,
Dmitry Safonov <0x7f454c46@...il.com>,
Sergey Senozhatsky <sergey.senozhatsky.work@...il.com>,
Dmitry Vyukov <dvyukov@...gle.com>,
Tan Xiaojun <tanxiaojun@...wei.com>,
Peter Hurley <peter@...leysoftware.com>,
Pasi Kärkkäinen <pasik@....fi>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Michael Neuling <mikey@...ling.org>,
Mikulas Patocka <mpatocka@...hat.com>, stable@...r.kernel.org
Subject: Re: [PATCH 2/4] tty: Hold tty_ldisc_lock() during tty_reopen()
On Wed, 2018-08-29 at 16:40 +0200, Jiri Slaby wrote:
> On 08/29/2018, 04:23 AM, Dmitry Safonov wrote:
> > tty_ldisc_reinit() races with neither tty_ldisc_hangup() nor
> > set_ldisc() nor tty_ldisc_release(), as they all use the tty lock.
> > But it races with anyone who expects the line discipline to be the
> > same after holding the read semaphore in tty_ldisc_ref().
> >
> > We've seen the following crash on v4.9.108 stable:
> >
> > BUG: unable to handle kernel paging request at 0000000000002260
> > IP: [..] n_tty_receive_buf_common+0x5f/0x86d
> > Workqueue: events_unbound flush_to_ldisc
> > Call Trace:
> > [..] n_tty_receive_buf2
> > [..] tty_ldisc_receive_buf
> > [..] flush_to_ldisc
> > [..] process_one_work
> > [..] worker_thread
> > [..] kthread
> > [..] ret_from_fork
> >
> > I think tty_ldisc_reinit() should be called with ldisc_sem held for
> > writing, which will protect any reader against line discipline
> > changes.
> >
> > Note: I failed to reproduce the described crash, so obviously I
> > can't guarantee that this is the place where the line discipline
> > was switched.
> >
> > Cc: Greg Kroah-Hartman <gregkh@...uxfoundation.org>
> > Cc: Jiri Slaby <jslaby@...e.com>
> > Cc: stable@...r.kernel.org
> > Signed-off-by: Dmitry Safonov <dima@...sta.com>
> > ---
> > drivers/tty/tty_io.c | 9 +++++++--
> > 1 file changed, 7 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/tty/tty_io.c b/drivers/tty/tty_io.c
> > index 5e5da9acaf0a..3ef8b977b167 100644
> > --- a/drivers/tty/tty_io.c
> > +++ b/drivers/tty/tty_io.c
> > @@ -1267,15 +1267,20 @@ static int tty_reopen(struct tty_struct *tty)
> > if (test_bit(TTY_EXCLUSIVE, &tty->flags) && !capable(CAP_SYS_ADMIN))
> > return -EBUSY;
> >
> > - tty->count++;
> > + retval = tty_ldisc_lock(tty, 5 * HZ);
>
> Why 5 secs? This would cause random errors on machines under heavy
> load.
Yeah, I think MAX_SCHEDULE_TIMEOUT makes more sense here.
I'm not sure why I decided to go with 5*HZ instead.
I'll resend with the new timeout if everything else looks good to you
(keeping in mind my argument for count++ in 1/4).
>
> > + if (retval)
> > + return retval;
> >
> > + tty->count++;
> > if (tty->ldisc)
> > - return 0;
> > + goto out_unlock;
> >
> > retval = tty_ldisc_reinit(tty, tty->termios.c_line);
> > if (retval)
> > tty->count--;
> >
> > +out_unlock:
> > + tty_ldisc_unlock(tty);
> > return retval;
>
> So what about:
> tty_ldisc_lock(tty, MAX_SCHEDULE_TIMEOUT);
> if (!tty->ldisc)
> ret = tty_ldisc_reinit(tty, tty->termios.c_line);
> tty_ldisc_unlock(tty);
>
> if (!ret)
> tty->count++;
>
> return ret;
>
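The ordering in the suggestion above (take the lock with no timeout, reinit only when there is no ldisc, unlock, and bump the count only on success) can be modelled in userspace. This is only a rough sketch, not kernel code: every fake_* name is hypothetical, a plain pthread mutex stands in for the write side of ldisc_sem, and ret is initialised to 0 here as an assumption, since the snippet above does not show where it is set.

```c
/* Userspace model only; every fake_* name is hypothetical, not kernel API. */
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

struct fake_ldisc {
	int num;                      /* line discipline number */
};

struct fake_tty {
	pthread_mutex_t ldisc_lock;   /* stands in for the write side of ldisc_sem */
	struct fake_ldisc *ldisc;     /* NULL models a tty with no line discipline */
	struct fake_ldisc storage;    /* backing store, avoids malloc in the sketch */
	int count;                    /* open count, like tty->count */
	int fail_reinit;              /* test hook: force the reinit step to fail */
};

static void fake_tty_init(struct fake_tty *tty)
{
	pthread_mutex_init(&tty->ldisc_lock, NULL);
	tty->ldisc = NULL;
	tty->storage.num = 0;
	tty->count = 0;
	tty->fail_reinit = 0;
}

/* Caller must hold ldisc_lock, so nobody observes a half-switched ldisc. */
static int fake_ldisc_reinit(struct fake_tty *tty, int disc)
{
	if (tty->fail_reinit)
		return -1;
	tty->storage.num = disc;
	tty->ldisc = &tty->storage;
	return 0;
}

static int fake_tty_reopen(struct fake_tty *tty, int disc)
{
	int ret = 0;                  /* assumption: success unless reinit fails */
	int err;

	/* Block with no timeout, the analogue of MAX_SCHEDULE_TIMEOUT. */
	err = pthread_mutex_lock(&tty->ldisc_lock);
	assert(err == 0);
	(void)err;

	if (!tty->ldisc)
		ret = fake_ldisc_reinit(tty, disc);
	pthread_mutex_unlock(&tty->ldisc_lock);

	/* Bump the count only on success, so there is nothing to roll back. */
	if (!ret)
		tty->count++;
	return ret;
}
```

With this ordering a failed reinit leaves the count untouched, so the count-- rollback from the patch is not needed in the model. Whether count++ can safely move after the unlock in the real tty_reopen() is a separate question (see the argument in 1/4); the model does not settle it.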
--
Thanks,
Dmitry