lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Wed, 31 May 2023 11:52:06 +0200
From:   Roberto Sassu <roberto.sassu@...weicloud.com>
To:     Paul Moore <paul@...l-moore.com>,
        syzbot <syzbot+8fb64a61fdd96b50f3b8@...kaller.appspotmail.com>
Cc:     hdanton@...a.com, linux-fsdevel@...r.kernel.org,
        linux-kernel@...r.kernel.org, reiserfs-devel@...r.kernel.org,
        roberto.sassu@...wei.com, syzkaller-bugs@...glegroups.com,
        peterz@...radead.org, mingo@...hat.com, will@...nel.org,
        Jeff Mahoney <jeffm@...e.com>, Jan Kara <jack@...e.cz>
Subject: Re: [syzbot] [reiserfs?] possible deadlock in open_xa_dir

On 5/31/2023 11:49 AM, Roberto Sassu wrote:
> On 5/5/2023 11:36 PM, Paul Moore wrote:
>> On Fri, May 5, 2023 at 4:51 PM syzbot
>> <syzbot+8fb64a61fdd96b50f3b8@...kaller.appspotmail.com> wrote:
>>>
>>> syzbot has bisected this issue to:
>>>
>>> commit d82dcd9e21b77d338dc4875f3d4111f0db314a7c
>>> Author: Roberto Sassu <roberto.sassu@...wei.com>
>>> Date:   Fri Mar 31 12:32:18 2023 +0000
>>>
>>>      reiserfs: Add security prefix to xattr name in 
>>> reiserfs_security_write()
>>>
>>> bisection log:  
>>> https://syzkaller.appspot.com/x/bisect.txt?x=14403182280000
>>> start commit:   3c4aa4434377 Merge tag 'ceph-for-6.4-rc1' of 
>>> https://githu..
>>> git tree:       upstream
>>> final oops:     
>>> https://syzkaller.appspot.com/x/report.txt?x=16403182280000
>>> console output: https://syzkaller.appspot.com/x/log.txt?x=12403182280000
>>> kernel config:  
>>> https://syzkaller.appspot.com/x/.config?x=73a06f6ef2d5b492
>>> dashboard link: 
>>> https://syzkaller.appspot.com/bug?extid=8fb64a61fdd96b50f3b8
>>> syz repro:      
>>> https://syzkaller.appspot.com/x/repro.syz?x=12442414280000
>>> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=176a7318280000
>>>
>>> Reported-by: syzbot+8fb64a61fdd96b50f3b8@...kaller.appspotmail.com
>>> Fixes: d82dcd9e21b7 ("reiserfs: Add security prefix to xattr name in 
>>> reiserfs_security_write()")
>>>
>>> For information about bisection process see: 
>>> https://goo.gl/tpsmEJ#bisection
>>
>> I don't think Roberto's patch identified above is the actual root
>> cause of this problem as reiserfs_xattr_set_handle() is called in
>> reiserfs_security_write() both before and after the patch.  However,
>> due to some bad logic in reiserfs_security_write() which Roberto
>> corrected, I'm thinking that it is possible this code is being
>> exercised for the first time and syzbot is starting to trigger a
>> locking issue in the reiserfs code ... ?
> 
> + Jan, Jeff (which basically restructured the lock)

Actually adding Jan and Jeff.

Roberto

> + Petr, Ingo, Will
> 
> I involve the lockdep experts, to get a bit of help on this.
> 
> First of all, the lockdep warning is trivial to reproduce:
> 
> # dd if=/dev/zero of=reiserfs.img bs=1M count=100
> # losetup -f --show reiserfs.img
> /dev/loop0
> # mkfs.reiserfs /dev/loop0
> # mount /dev/loop0 /mnt/
> # touch file0
> 
> In the testing system, Smack is the major LSM.
> 
> Ok, so the warning here is clear:
> 
> https://syzkaller.appspot.com/x/log.txt?x=12403182280000
> 
> However, I was looking if that can really happen. From this:
> 
> [   77.746561][ T5418] -> #1 (&sbi->lock){+.+.}-{3:3}:
> [   77.753772][ T5418]        lock_acquire+0x23e/0x630
> [   77.758792][ T5418]        __mutex_lock_common+0x1d8/0x2530
> [   77.764504][ T5418]        mutex_lock_nested+0x1b/0x20
> [   77.769868][ T5418]        reiserfs_write_lock+0x70/0xc0
> [   77.775321][ T5418]        reiserfs_mkdir+0x321/0x870
> 
> I see that the lock is taken in reiserfs_write_lock(), while lockdep says:
> 
> [   77.710227][ T5418] but task is already holding lock:
> [   77.717587][ T5418] ffff88807568d090 (&sbi->lock){+.+.}-{3:3}, at: 
> reiserfs_write_lock_nested+0x4a/0xb0
> 
> which is in a different place, I believe here:
> 
> int reiserfs_paste_into_item(struct reiserfs_transaction_handle *th,
>                               /* Path to the pasted item. */
> [...]
> 
>          depth = reiserfs_write_unlock_nested(sb);
>          dquot_free_space_nodirty(inode, pasted_size);
>          reiserfs_write_lock_nested(sb, depth);
>          return retval;
> }
> 
> This is called by reiserfs_add_entry(), which is called by 
> reiserfs_create() (it is in the lockdep trace). After returning to 
> reiserfs_create(), d_instantiate_new() is called.
> 
> I don't know exactly, I take the part that the lock is held. But if it 
> is held, how d_instantiate_new() can be executed in another task?
> 
> static int reiserfs_create(struct mnt_idmap *idmap, struct inode *dir,
>                          struct dentry *dentry, umode_t mode, bool excl)
> {
> 
> [...]
> 
>          reiserfs_write_lock(dir->i_sb);
> 
>          retval = journal_begin(&th, dir->i_sb, jbegin_count);
> 
> [...]
> 
>          d_instantiate_new(dentry, inode);
>          retval = journal_end(&th);
> 
> out_failed:
>          reiserfs_write_unlock(dir->i_sb);
> 
> If the lock is held, the scenario lockdep describes cannot happen. Any 
> thoughts?
> 
> Thanks
> 
> Roberto

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ