Date:   Sat, 17 Nov 2018 09:55:31 +0100
From:   Dominique Martinet <asmadeus@...ewreck.org>
To:     syzbot <syzbot+edec7868af5997928fe9@...kaller.appspotmail.com>
Cc:     davem@...emloft.net, ericvh@...il.com,
        linux-kernel@...r.kernel.org, lucho@...kov.net,
        netdev@...r.kernel.org, syzkaller-bugs@...glegroups.com,
        v9fs-developer@...ts.sourceforge.net
Subject: Re: WARNING: refcount bug in p9_req_put

syzbot wrote on Thu, Nov 15, 2018:
> RIP: 0010:refcount_sub_and_test_checked+0x2c9/0x310 lib/refcount.c:187
> Code: 89 de e8 ea 1a ed fd 84 db 74 07 31 db e9 4d ff ff ff e8 0a 1a
> ed fd 48 c7 c7 20 ae 60 88 c6 05 7b fd 7e 06 01 e8 67 7d b6 fd <0f>
> 0b 31 db e9 2c ff ff ff 48 89 cf e8 a6 67 30 fe e9 41 fe ff ff
> RSP: 0018:ffff88817e87f330 EFLAGS: 00010282
> RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffc90005e51000
> RDX: 00000000000222c2 RSI: ffffffff8165e7e5 RDI: 0000000000000005
> RBP: ffff88817e87f418 R08: ffff8881866ba640 R09: ffffed103b5c5020
> R10: ffffed103b5c5020 R11: ffff8881dae28107 R12: ffff88817c7a7008
> R13: 00000000ffffffff R14: ffff88817e87f3f0 R15: ffff8881c1dc9d68
>  refcount_dec_and_test_checked+0x1a/0x20 lib/refcount.c:212
>  kref_put include/linux/kref.h:69 [inline]
>  p9_req_put+0x20/0x60 net/9p/client.c:395
>  p9_conn_destroy net/9p/trans_fd.c:880 [inline]
>  p9_fd_close+0x39f/0x6b0 net/9p/trans_fd.c:913
>  p9_client_create+0xbd0/0x1674 net/9p/client.c:1062
>  v9fs_session_init+0x217/0x1bb0 fs/9p/v9fs.c:421

So the latest ref put I added on destroy for m->rreq looks like it's not
always a good idea... The worker thread is supposed to be stopped at this
point, so I'm not sure what went wrong. While looking at this I did find
a race between the read work function and the cancelled callback,
although that would cause list corruption, so it's not what happened
here.
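
For reference, the general class of bug behind this kind of refcount warning can be sketched in userspace. This is only an illustration, not the kernel code and not a diagnosis of this report: `toy_ref`, `toy_ref_put` and `toy_extra_put_on_destroy` are made-up names, and the assumption that some other path already dropped the reference before teardown is hypothetical.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical userspace stand-in for a kref-style counter; not kernel code. */
struct toy_ref {
	unsigned int count;
	bool warned;	/* set when a put finds the counter already at zero */
};

static void toy_ref_init(struct toy_ref *r)
{
	r->count = 1;
	r->warned = false;
}

/* Returns true when this put dropped the last reference. */
static bool toy_ref_put(struct toy_ref *r)
{
	if (r->count == 0) {
		/* Similar in spirit to the checked refcount underflow warning. */
		r->warned = true;
		fprintf(stderr, "toy_ref: put on zero refcount\n");
		return false;
	}
	return --r->count == 0;
}

/*
 * Assumed scenario: another path has already dropped the reference,
 * then teardown puts unconditionally and trips the check.
 */
static bool toy_extra_put_on_destroy(void)
{
	struct toy_ref req;

	toy_ref_init(&req);	/* reference held on behalf of the connection */
	toy_ref_put(&req);	/* some other path already released it */
	toy_ref_put(&req);	/* unconditional put on destroy */
	return req.warned;
}
```

In this toy model the second put is the one that warns, which is why an unconditional put on destroy is only safe if nothing else can have released the same reference first.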

I'll think about it a bit more and try to reproduce it by adding some random
delays to make the races more likely.
I'll fix the race in trans_fd's cancelled callback after, or together with, the
async flush patches. I've got something working on this end except for
cancelled, so I will probably submit something around next week, once I've
had time to test a bit more extensively.

-- 
Dominique
