netdev - Re: crash in __xfrm_state

Message-ID: <20191213102512.GP795@breakpoint.cc>
Date:   Fri, 13 Dec 2019 11:25:12 +0100
From:   Florian Westphal <fw@...len.de>
To:     Steffen Klassert <steffen.klassert@...unet.com>
Cc:     Josh Hunt <johunt@...mai.com>, herbert@...dor.apana.org.au,
        David Miller <davem@...emloft.net>,
        netdev <netdev@...r.kernel.org>, Florian Westphal <fw@...len.de>
Subject: Re: crash in __xfrm_state_lookup on 4.19 LTS

Steffen Klassert <steffen.klassert@...unet.com> wrote:
> > index f3423562d933..c3d7df1387c8 100644
> > --- a/net/xfrm/xfrm_state.c
> > +++ b/net/xfrm/xfrm_state.c
> > @@ -1730,9 +1730,9 @@ xfrm_state_lookup(struct net *net, u32 mark, const xfrm_address_t *daddr, __be32
> >  {
> >         struct xfrm_state *x;
> > 
> > -       rcu_read_lock();
> > +       spin_lock_bh(&net->xfrm.xfrm_state_lock);
> >         x = __xfrm_state_lookup(net, mark, daddr, spi, proto, family);
> > -       rcu_read_unlock();
> > +       spin_unlock_bh(&net->xfrm.xfrm_state_lock);
> >         return x;
> >  }
> 
> While that could fix it, it adds a global list lock
> to the packet path and reverts:
> 
> commit c2f672fc94642bae96821a393f342edcfa9794a6
> xfrm: state lookup can be lockless
> 
> I've Cced Florian who did that change.
> 
> I thought about using rcu_read_lock_bh(), but by now I think
> it would just make the problem less likely to occur.
> 
> We destroy the states from a workqueue via schedule_work().
> I think we should rather use call_rcu() to make sure that an
> RCU grace period has elapsed before the states are destroyed.

xfrm_state_gc_task() calls synchronize_rcu() after stealing the gc list and
before destroying those states, so I don't think this is a problem.
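
To make the ordering concrete, here is a minimal userspace sketch of the
"steal the list, wait for a grace period, then destroy" pattern. It is not
the kernel code: struct state, gc_queue(), gc_task() and the event trace are
hypothetical names made up for illustration, and fake_synchronize_rcu() only
records that the wait happened instead of waiting for real RCU readers. The
sketch is single-threaded, so the lock that would protect the pending list
is only marked in comments.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct xfrm_state; only the list link is modelled. */
struct state {
	struct state *next;
};

/* States queued for destruction (a lock would protect this list for real). */
static struct state *gc_list;

/* Event trace, so the order of the three steps can be inspected. */
enum event { EV_STEAL, EV_GRACE, EV_DESTROY };
static enum event trace[16];
static size_t trace_len;

static void record(enum event e)
{
	if (trace_len < sizeof(trace) / sizeof(trace[0]))
		trace[trace_len++] = e;
}

/*
 * Stand-in for synchronize_rcu(). The real call blocks until every RCU
 * read-side critical section that was already running has finished; this
 * fake only records that the step took place.
 */
static void fake_synchronize_rcu(void)
{
	record(EV_GRACE);
}

/* Queue a state for deferred destruction. */
void gc_queue(struct state *x)
{
	/* lock would be taken here */
	x->next = gc_list;
	gc_list = x;
	/* lock would be dropped here */
}

/* Worker body: returns the number of states destroyed. */
size_t gc_task(void)
{
	struct state *stolen, *next;
	size_t destroyed = 0;

	/* Step 1: steal the whole pending list (under the lock for real). */
	stolen = gc_list;
	gc_list = NULL;
	record(EV_STEAL);

	/* Step 2: wait until readers that might still see these states are done. */
	fake_synchronize_rcu();

	/* Step 3: only now free the stolen states. */
	for (; stolen; stolen = next) {
		next = stolen->next;
		free(stolen);
		record(EV_DESTROY);
		destroyed++;
	}
	return destroyed;
}
```

Queueing two malloc()ed states with gc_queue() and then calling gc_task()
once leaves the trace as steal, grace, destroy, destroy. The point of the
ordering is that no state is freed before the grace-period step, which is
the same guarantee call_rcu() would give, just obtained by blocking in the
worker.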