linux-kernel - Re: [PATCH 1/2] fcntl: fix potential deadlocks for &fown

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <47d655dba51cd7d3cf3011f30f42bf7120c656db.camel@kernel.org>
Date:   Fri, 02 Jul 2021 10:27:56 -0400
From:   Jeff Layton <jlayton@...nel.org>
To:     Desmond Cheong Zhi Xi <desmondcheongzx@...il.com>,
        bfields@...ldses.org, viro@...iv.linux.org.uk
Cc:     linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org,
        skhan@...uxfoundation.org, gregkh@...uxfoundation.org,
        linux-kernel-mentees@...ts.linuxfoundation.org,
        syzbot+e6d5398a02c516ce5e70@...kaller.appspotmail.com
Subject: Re: [PATCH 1/2] fcntl: fix potential deadlocks for &fown_struct.lock

On Fri, 2021-07-02 at 21:55 +0800, Desmond Cheong Zhi Xi wrote:
> On 2/7/21 7:44 pm, Jeff Layton wrote:
> > On Fri, 2021-07-02 at 17:18 +0800, Desmond Cheong Zhi Xi wrote:
> > > Syzbot reports a potential deadlock in do_fcntl:
> > > 
> > > ========================================================
> > > WARNING: possible irq lock inversion dependency detected
> > > 5.12.0-syzkaller #0 Not tainted
> > > --------------------------------------------------------
> > > syz-executor132/8391 just changed the state of lock:
> > > ffff888015967bf8 (&f->f_owner.lock){.+..}-{2:2}, at: f_getown_ex fs/fcntl.c:211 [inline]
> > > ffff888015967bf8 (&f->f_owner.lock){.+..}-{2:2}, at: do_fcntl+0x8b4/0x1200 fs/fcntl.c:395
> > > but this lock was taken by another, HARDIRQ-safe lock in the past:
> > >   (&dev->event_lock){-...}-{2:2}
> > > 
> > > and interrupts could create inverse lock ordering between them.
> > > 
> > > other info that might help us debug this:
> > > Chain exists of:
> > >    &dev->event_lock --> &new->fa_lock --> &f->f_owner.lock
> > > 
> > >   Possible interrupt unsafe locking scenario:
> > > 
> > >         CPU0                    CPU1
> > >         ----                    ----
> > >    lock(&f->f_owner.lock);
> > >                                 local_irq_disable();
> > >                                 lock(&dev->event_lock);
> > >                                 lock(&new->fa_lock);
> > >    <Interrupt>
> > >      lock(&dev->event_lock);
> > > 
> > >   *** DEADLOCK ***
> > > 
> > > This happens because there is a lock hierarchy of
> > > &dev->event_lock --> &new->fa_lock --> &f->f_owner.lock
> > > from the following call chain:
> > > 
> > >    input_inject_event():
> > >      spin_lock_irqsave(&dev->event_lock,...);
> > >      input_handle_event():
> > >        input_pass_values():
> > >          input_to_handler():
> > >            evdev_events():
> > >              evdev_pass_values():
> > >                spin_lock(&client->buffer_lock);
> > >                __pass_event():
> > >                  kill_fasync():
> > >                    kill_fasync_rcu():
> > >                      read_lock(&fa->fa_lock);
> > >                      send_sigio():
> > >                        read_lock_irqsave(&fown->lock,...);
> > > 
> > > However, since &dev->event_lock is HARDIRQ-safe, interrupts have to be
> > > disabled while grabbing &f->f_owner.lock, otherwise we invert the lock
> > > hierarchy.
> > > 
> > > Hence, we replace calls to read_lock/read_unlock on &f->f_owner.lock,
> > > with read_lock_irq/read_unlock_irq.
> > > 
> > 
> > Patches look reasonable overall, but why does this one use read_lock_irq
> > and the other one use read_lock_irqsave? Don't we need to *_irqsasve in
> > both patches?
> > 
> > 
> 
> My thinking was that the functions f_getown_ex and f_getowner_uids are 
> only called from do_fcntl, and f_getown is only called from do_fnctl and 
> sock_ioctl. do_fnctl itself is only called from syscalls.
> 
> For sock_ioctl, the chain is
>    compat_sock_ioctl():
>      compat_sock_ioctl_trans():
>        sock_ioctl()
> 
> For both paths, it doesn't seem that interrupts are disabled, so I used 
> the *irq variants.
> 
> But of course, I might be very mistaken on this, and I'd be happy to 
> make the change to *_irqsave.
> 
> Also, on further inspection, if these calls should be changed to 
> *_irqsave, then I believe the call to write_lock_irq in f_modown (called 
> from do_fcntl() --> f_setown() --> __f_setown() --> f_modown()) should 
> also be changed to *_irqsave.
> 
> There's also a call to write_lock_irq(&fa->fa_lock) in 
> fasync_remove_entry and fasync_insert_entry. Whether these should be 
> changed as well isn't as clear to me, but since it's safe to do, perhaps 
> it makes sense to use *_irqsave for them too. Thoughts?
> 


I think your reasoning is probably valid here and we don't need to
save/restore. It wasn't obvious to me until you pointed it out though.
It might be worth a comment, or maybe even this at the top of both
functions:

    WARN_ON_ONCE(irqs_disabled());

I'll pick these into linux-next soon and plan to merge them for v5.15.
Let me know if you think they need to go in sooner.


> > > Reported-and-tested-by: syzbot+e6d5398a02c516ce5e70@...kaller.appspotmail.com
> > > Signed-off-by: Desmond Cheong Zhi Xi <desmondcheongzx@...il.com>
> > > ---
> > >   fs/fcntl.c | 13 +++++++------
> > >   1 file changed, 7 insertions(+), 6 deletions(-)
> > > 
> > > diff --git a/fs/fcntl.c b/fs/fcntl.c
> > > index dfc72f15be7f..cf9e81dfa615 100644
> > > --- a/fs/fcntl.c
> > > +++ b/fs/fcntl.c
> > > @@ -150,7 +150,8 @@ void f_delown(struct file *filp)
> > >   pid_t f_getown(struct file *filp)
> > >   {
> > >   	pid_t pid = 0;
> > > -	read_lock(&filp->f_owner.lock);
> > > +
> > > +	read_lock_irq(&filp->f_owner.lock);
> > >   	rcu_read_lock();
> > >   	if (pid_task(filp->f_owner.pid, filp->f_owner.pid_type)) {
> > >   		pid = pid_vnr(filp->f_owner.pid);
> > > @@ -158,7 +159,7 @@ pid_t f_getown(struct file *filp)
> > >   			pid = -pid;
> > >   	}
> > >   	rcu_read_unlock();
> > > -	read_unlock(&filp->f_owner.lock);
> > > +	read_unlock_irq(&filp->f_owner.lock);
> > >   	return pid;
> > >   }
> > >   
> > > @@ -208,7 +209,7 @@ static int f_getown_ex(struct file *filp, unsigned long arg)
> > >   	struct f_owner_ex owner = {};
> > >   	int ret = 0;
> > >   
> > > -	read_lock(&filp->f_owner.lock);
> > > +	read_lock_irq(&filp->f_owner.lock);
> > >   	rcu_read_lock();
> > >   	if (pid_task(filp->f_owner.pid, filp->f_owner.pid_type))
> > >   		owner.pid = pid_vnr(filp->f_owner.pid);
> > > @@ -231,7 +232,7 @@ static int f_getown_ex(struct file *filp, unsigned long arg)
> > >   		ret = -EINVAL;
> > >   		break;
> > >   	}
> > > -	read_unlock(&filp->f_owner.lock);
> > > +	read_unlock_irq(&filp->f_owner.lock);
> > >   
> > >   	if (!ret) {
> > >   		ret = copy_to_user(owner_p, &owner, sizeof(owner));
> > > @@ -249,10 +250,10 @@ static int f_getowner_uids(struct file *filp, unsigned long arg)
> > >   	uid_t src[2];
> > >   	int err;
> > >   
> > > -	read_lock(&filp->f_owner.lock);
> > > +	read_lock_irq(&filp->f_owner.lock);
> > >   	src[0] = from_kuid(user_ns, filp->f_owner.uid);
> > >   	src[1] = from_kuid(user_ns, filp->f_owner.euid);
> > > -	read_unlock(&filp->f_owner.lock);
> > > +	read_unlock_irq(&filp->f_owner.lock);
> > >   
> > >   	err  = put_user(src[0], &dst[0]);
> > >   	err |= put_user(src[1], &dst[1]);
> > 
> 

-- 
Jeff Layton <jlayton@...nel.org>