Message-ID: <CAHC9VhR9a1xEB3gXUkb4KRVkwXUAo-701ZumN2OTOmJ7r5ez8g@mail.gmail.com>
Date:   Tue, 28 Jan 2020 10:45:05 -0500
From:   Paul Moore <paul@...l-moore.com>
To:     Ondrej Mosnacek <omosnace@...hat.com>
Cc:     Stephen Smalley <sds@...ho.nsa.gov>,
        Jeff Vander Stoep <jeffv@...gle.com>,
        Eric Paris <eparis@...isplace.org>,
        Jann Horn <jannh@...gle.com>,
        Kees Cook <keescook@...omium.org>,
        Linux kernel mailing list <linux-kernel@...r.kernel.org>,
        "Paul E. McKenney" <paulmck@...nel.org>,
        SElinux list <selinux@...r.kernel.org>,
        syzkaller-bugs@...glegroups.com,
        syzbot <syzbot+61cba5033e2072d61806@...kaller.appspotmail.com>
Subject: Re: possible deadlock in sidtab_sid2str_put

On Tue, Jan 28, 2020 at 9:27 AM Ondrej Mosnacek <omosnace@...hat.com> wrote:
> On Tue, Jan 28, 2020 at 2:44 PM Stephen Smalley <sds@...ho.nsa.gov> wrote:
> > On 1/28/20 8:39 AM, Paul Moore wrote:
> > > On Tue, Jan 28, 2020 at 7:50 AM syzbot
> > > <syzbot+61cba5033e2072d61806@...kaller.appspotmail.com> wrote:
> > >>
> > >> syzbot has found a reproducer for the following crash on:
> > >>
> > >> HEAD commit:    b0be0eff Merge tag 'x86-pti-2020-01-28' of git://git.kerne..
> > >> git tree:       upstream
> > >> console output: https://syzkaller.appspot.com/x/log.txt?x=1432aebee00000
> > >> kernel config:  https://syzkaller.appspot.com/x/.config?x=9784e57c96a92f20
> > >> dashboard link: https://syzkaller.appspot.com/bug?extid=61cba5033e2072d61806
> > >> compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
> > >> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=10088e95e00000
> > >> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=13fa605ee00000
> > >>
> > >> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> > >> Reported-by: syzbot+61cba5033e2072d61806@...kaller.appspotmail.com
> > >>
> > >> =====================================================
> > >> WARNING: SOFTIRQ-safe -> SOFTIRQ-unsafe lock order detected
> > >> 5.5.0-syzkaller #0 Not tainted
> > >> -----------------------------------------------------
> > >> syz-executor305/10624 [HC0[0]:SC0[2]:HE1:SE0] is trying to acquire:
> > >> ffff888098c14098 (&(&s->cache_lock)->rlock){+.+.}, at: spin_lock include/linux/spinlock.h:338 [inline]
> > >> ffff888098c14098 (&(&s->cache_lock)->rlock){+.+.}, at: sidtab_sid2str_put.part.0+0x36/0x880 security/selinux/ss/sidtab.c:533
> > >>
> > >> and this task is already holding:
> > >> ffffffff89865770 (&(&nf_conntrack_locks[i])->rlock){+.-.}, at: spin_lock include/linux/spinlock.h:338 [inline]
> > >> ffffffff89865770 (&(&nf_conntrack_locks[i])->rlock){+.-.}, at: nf_conntrack_lock+0x17/0x70 net/netfilter/nf_conntrack_core.c:91
> > >> which would create a new lock dependency:
> > >>   (&(&nf_conntrack_locks[i])->rlock){+.-.} -> (&(&s->cache_lock)->rlock){+.+.}
> > >>
> > >> but this new dependency connects a SOFTIRQ-irq-safe lock:
> > >>   (&(&nf_conntrack_locks[i])->rlock){+.-.}
> > >>
> > >> ... which became SOFTIRQ-irq-safe at:
> > >>    lock_acquire+0x190/0x410 kernel/locking/lockdep.c:4484
> > >>    __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
> > >>    _raw_spin_lock+0x2f/0x40 kernel/locking/spinlock.c:151
> > >>    spin_lock include/linux/spinlock.h:338 [inline]
> > >>    nf_conntrack_lock+0x17/0x70 net/netfilter/nf_conntrack_core.c:91
> > >
> > > ...
> > >
> > >> to a SOFTIRQ-irq-unsafe lock:
> > >>   (&(&s->cache_lock)->rlock){+.+.}
> > >>
> > >> ... which became SOFTIRQ-irq-unsafe at:
> > >> ...
> > >>    lock_acquire+0x190/0x410 kernel/locking/lockdep.c:4484
> > >>    __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
> > >>    _raw_spin_lock+0x2f/0x40 kernel/locking/spinlock.c:151
> > >>    spin_lock include/linux/spinlock.h:338 [inline]
> > >>    sidtab_sid2str_put.part.0+0x36/0x880 security/selinux/ss/sidtab.c:533
> > >>    sidtab_sid2str_put+0xa0/0xc0 security/selinux/ss/sidtab.c:566
> > >>    sidtab_entry_to_string security/selinux/ss/services.c:1279 [inline]
> > >>    sidtab_entry_to_string+0xf2/0x110 security/selinux/ss/services.c:1266
> > >>    security_sid_to_context_core+0x2c6/0x3c0 security/selinux/ss/services.c:1361
> > >>    security_sid_to_context+0x34/0x40 security/selinux/ss/services.c:1384
> > >>    avc_audit_post_callback+0x102/0x790 security/selinux/avc.c:709
> > >>    common_lsm_audit+0x5ac/0x1e00 security/lsm_audit.c:466
> > >>    slow_avc_audit+0x16a/0x1f0 security/selinux/avc.c:782
> > >>    avc_audit security/selinux/include/avc.h:140 [inline]
> > >>    avc_has_perm+0x543/0x610 security/selinux/avc.c:1185
> > >>    inode_has_perm+0x1a8/0x230 security/selinux/hooks.c:1631
> > >>    selinux_mmap_file+0x10a/0x1d0 security/selinux/hooks.c:3701
> > >>    security_mmap_file+0xa4/0x1e0 security/security.c:1482
> > >>    vm_mmap_pgoff+0xf0/0x230 mm/util.c:502
> > >
> > > ...
> > >
> > >> other info that might help us debug this:
> > >>
> > >>   Possible interrupt unsafe locking scenario:
> > >>
> > >>         CPU0                    CPU1
> > >>         ----                    ----
> > >>    lock(&(&s->cache_lock)->rlock);
> > >>                                 local_irq_disable();
> > >>                                 lock(&(&nf_conntrack_locks[i])->rlock);
> > >>                                 lock(&(&s->cache_lock)->rlock);
> > >>    <Interrupt>
> > >>      lock(&(&nf_conntrack_locks[i])->rlock);
> > >>
> > >>   *** DEADLOCK ***
> > >>
> > >> 4 locks held by syz-executor305/10624:
> > >>   #0: ffffffff8c1acc68 (&table[i].mutex){+.+.}, at: nfnl_lock net/netfilter/nfnetlink.c:62 [inline]
> > >>   #0: ffffffff8c1acc68 (&table[i].mutex){+.+.}, at: nfnetlink_rcv_msg+0x9ee/0xfb0 net/netfilter/nfnetlink.c:224
> > >>   #1: ffff8880836415d8 (nlk_cb_mutex-NETFILTER){+.+.}, at: netlink_dump+0xe7/0xfb0 net/netlink/af_netlink.c:2199
> > >>   #2: ffffffff89865770 (&(&nf_conntrack_locks[i])->rlock){+.-.}, at: spin_lock include/linux/spinlock.h:338 [inline]
> > >>   #2: ffffffff89865770 (&(&nf_conntrack_locks[i])->rlock){+.-.}, at: nf_conntrack_lock+0x17/0x70 net/netfilter/nf_conntrack_core.c:91
> > >>   #3: ffffffff8b7df008 (&selinux_ss.policy_rwlock){.+.?}, at: security_sid_to_context_core+0x1ca/0x3c0 security/selinux/ss/services.c:1344
> > >
> > > I think this is going to be tricky to fix due to the differing
> > > contexts from which sidtab_sid2str_put() may be called.  We already
> > > have a check for !in_task() in sidtab_sid2str_put(); do we want to add
> > > a check for !in_serving_softirq() too?
> >
> > No, we should just use spin_lock_irqsave/unlock_irqrestore() IMHO, but
> > that then means we need to re-evaluate the performance gain of this change.
>
> I just tested a patch that switches to the IRQ-disabling locking under
> the systemd-journald scenario from [1] and the impact seems not that
> bad - the security_secid_to_secctx() function only takes up 3.18% of
> total run time vs. ~2% I reported in the original patch
> (d97bd23c2d7d). I'll post the patch shortly.
>
> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1733259, i.e.:
>     cat /dev/urandom | base64 | logger &
>     timeout 30s perf record -p $(pidof systemd-journald) -a -g
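
For illustration, here is a minimal sketch of the spin_lock_irqsave()/
spin_unlock_irqrestore() approach discussed above.  The types and names are
simplified stand-ins, not the actual security/selinux/ss/sidtab.c code or
the patch Ondrej tested:

#include <linux/list.h>
#include <linux/spinlock.h>

/* Simplified stand-in for the sidtab string-cache state. */
struct str_cache_entry {
        struct list_head lru_member;
        /* cached context string, length, owning sidtab entry, ... */
};

struct str_cache {
        spinlock_t cache_lock;                  /* plays the role of s->cache_lock */
        struct list_head cache_lru_list;        /* LRU list of cached strings */
};

static void str_cache_put(struct str_cache *s, struct str_cache_entry *e)
{
        unsigned long flags;

        /*
         * Disabling local interrupts while cache_lock is held means a
         * softirq on this CPU can no longer interrupt the critical section
         * and recurse into lock acquisition, which removes the
         * SOFTIRQ-safe -> SOFTIRQ-unsafe ordering lockdep reports when
         * cache_lock is taken with nf_conntrack_lock already held.
         */
        spin_lock_irqsave(&s->cache_lock, flags);
        list_move(&e->lru_member, &s->cache_lru_list);  /* bump to most-recently-used */
        spin_unlock_irqrestore(&s->cache_lock, flags);
}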

I'm skeptical of the real-world impact of disabling IRQs here, although
I'm not sure off the top of my head how we could manage a cache (I'm
thinking more about adding entries to, and evicting entries from, the
cache) without some form of locking.  Perhaps we should look into
per-cpu caching?
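
Purely as a sketch of that last thought -- nothing here is implied by the
thread, and all names below are hypothetical -- a per-CPU cache could drop
the shared lock entirely by giving each CPU its own small set of slots,
guarded only against local softirq/preemption:

#include <linux/bottom_half.h>
#include <linux/percpu.h>
#include <linux/types.h>

#define PCPU_STR_CACHE_SLOTS 8

struct pcpu_str_slot {
        u32 sid;                /* 0 means "slot empty" */
        const char *str;        /* cached context string (lifetime elided) */
};

struct pcpu_str_cache {
        struct pcpu_str_slot slots[PCPU_STR_CACHE_SLOTS];
};

static DEFINE_PER_CPU(struct pcpu_str_cache, sid2str_pcpu_cache);

/* Look up a cached string on the local CPU only; returns NULL on a miss. */
static const char *pcpu_str_cache_get(u32 sid)
{
        struct pcpu_str_cache *c;
        const char *str = NULL;
        int i;

        /*
         * local_bh_disable() keeps both preemption and local softirqs out,
         * since the audit path can reach this cache from softirq context.
         * No cross-CPU lock is needed because each CPU only ever touches
         * its own slots.
         */
        local_bh_disable();
        c = this_cpu_ptr(&sid2str_pcpu_cache);
        for (i = 0; i < PCPU_STR_CACHE_SLOTS; i++) {
                if (c->slots[i].sid == sid) {
                        str = c->slots[i].str;
                        break;
                }
        }
        local_bh_enable();
        return str;
}

The obvious open questions are string lifetime once a slot is overwritten
while a caller still holds the returned pointer, and whether per-CPU hit
rates would be good enough for the systemd-journald workload above; this is
only meant to make the idea concrete.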

-- 
paul moore
www.paul-moore.com
