[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20240725140821.kfzzn6uh52acwcbm@quack3>
Date: Thu, 25 Jul 2024 16:08:21 +0200
From: Jan Kara <jack@...e.cz>
To: Tetsuo Handa <penguin-kernel@...ove.SAKURA.ne.jp>
Cc: Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Jan Kara <jack@...e.cz>, Christian Brauner <brauner@...nel.org>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Jiri Slaby <jirislaby@...nel.org>,
Alexander Viro <viro@...iv.linux.org.uk>,
LKML <linux-kernel@...r.kernel.org>,
linux-serial <linux-serial@...r.kernel.org>,
linux-fsdevel <linux-fsdevel@...r.kernel.org>
Subject: Re: [PATCH] tty: tty_io: fix race between tty_fops and
hung_up_tty_fops
On Tue 23-07-24 07:20:34, Tetsuo Handa wrote:
> On 2024/07/23 1:24, Greg Kroah-Hartman wrote:
> > On Mon, Jul 22, 2024 at 06:10:41PM +0200, Jan Kara wrote:
> >> On Mon 22-07-24 16:41:22, Christian Brauner wrote:
> >>> On Fri, Jul 19, 2024 at 10:37:47PM GMT, Tetsuo Handa wrote:
> >>>> syzbot is reporting data race between __tty_hangup() and __fput(), and
> >>>> Dmitry Vyukov mentioned that this race has possibility of NULL pointer
> >>>> dereference, for tty_fops implements e.g. splice_read callback whereas
> >>>> hung_up_tty_fops does not.
> >>>>
> >>>> CPU0 CPU1
> >>>> ---- ----
> >>>> do_splice_read() {
> >>>> __tty_hangup() {
> >>>> // f_op->splice_read was copy_splice_read
> >>>> if (unlikely(!in->f_op->splice_read))
> >>>> return warn_unsupported(in, "read");
> >>>> filp->f_op = &hung_up_tty_fops;
> >>>> // f_op->splice_read is now NULL
> >>>> return in->f_op->splice_read(in, ppos, pipe, len, flags);
> >>>> }
> >>>> }
> >>>>
> >>>> Fix possibility of NULL pointer dereference by implementing missing
> >>>> callbacks, and suppress KCSAN messages by adding __data_racy qualifier
> >>>> to "struct file"->f_op .
> >>>
> >>> This f_op replacing without synchronization seems really iffy imho.
> >>
> >> Yeah, when I saw this I was also going "ouch". I was just waiting whether a
> >> tty maintainer will comment ;)
> >
> > I really didn't want to :)
> >
> >> Anyway this replacement of ops in file /
> >> inode has proven problematic in almost every single case where it was used
> >> leading to subtle issues.
> >
> > Yeah, let's not do this.
>
> https://lkml.kernel.org/r/18a58415-4aa9-4cba-97d2-b70384407313@I-love.SAKURA.ne.jp was a patch
> that does not replace f_op, and
> https://lkml.kernel.org/r/CAHk-=wgSOa_g+bxjNi+HQpC=6sHK2yKeoW-xOhb0-FVGMTDWjg@mail.gmail.com
> was a comment from Linus.
OK, thanks for references. After doing some light audit of tty, I didn't
find a way how switching f_op could break the system. Still it is giving
me some creeps because usually there's some god-forgotten place somewhere
that caches some decision based on f_op content and when f_op change,
things go out of sync with strange results. And the splice operations
enabled by tty are complex enough to hide some potential problems.
In fact I'm not sure why tty switches f_op at all. The reliable check for
tty being hung up seems to be fetching ldisc pointer under a ldisc_lock and
most operations do this and handle it appropriately -> no need for special
f_op for hung up state for them. .ioctl notably might need some love but
overall it doesn't seem too hard to completely avoid changing f_op for tty.
Honza
--
Jan Kara <jack@...e.com>
SUSE Labs, CR
Powered by blists - more mailing lists