lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20180910063739.GX23674@gauss3.secunet.de>
Date:   Mon, 10 Sep 2018 08:37:39 +0200
From:   Steffen Klassert <steffen.klassert@...unet.com>
To:     Wolfgang Walter <linux@...m.de>
CC:     <netdev@...r.kernel.org>, Wei Wang <weiwan@...gle.com>,
        Tobias Hommel <netdev-list@...oetigt.de>,
        Eric Dumazet <edumazet@...gle.com>
Subject: Re: kernels > v4.12 oops/crash with ipsec-traffic: bisected to
 b838d5e1c5b6e57b10ec8af2268824041e3ea911: ipv4: mark DST_NOGC and remove the
 operation of dst_free()

On Fri, Sep 07, 2018 at 11:10:55PM +0200, Wolfgang Walter wrote:
> Hello Steffen,
> 
> in one of your emails to Thomas you wrote:
> > xfrm_lookup+0x2a is at the very beginning of xfrm_lookup(), here we
> > find:
> > 
> > u16 family = dst_orig->ops->family;
> > 
> > ops has an offset of 32 bytes (20 hex) in dst_orig, so looks like
> > dst_orig is NULL.
> > 
> > In the forwarding case, we get dst_orig from the skb and dst_orig
> > can't be NULL here unless the skb itself is already fishy.
> 
> Is this really true?
> 
> If xfrm_lookup is called from 
> 
> __xfrm_route_forward():
> 
> int __xfrm_route_forward(struct sk_buff *skb, unsigned short family)
> {
>         struct net *net = dev_net(skb->dev);
>         struct flowi fl;
>         struct dst_entry *dst;
>         int res = 1;
> 
>         if (xfrm_decode_session(skb, &fl, family) < 0) {
>                 XFRM_INC_STATS(net, LINUX_MIB_XFRMFWDHDRERROR);
>                 return 0;
>         }
> 
>         skb_dst_force(skb);
> 
>         dst = xfrm_lookup(net, skb_dst(skb), &fl, NULL, XFRM_LOOKUP_QUEUE);
>         if (IS_ERR(dst)) {
>                 res = 0;
>                 dst = NULL;
>         }
>         skb_dst_set(skb, dst);
>         return res;
> }
> 
> couldn't it be possible that skb_dst_force(skb) actually sets dst to NULL if 
> it cannot safely lock it? If it is absolutely sure that skb_dst_force() never 
> can set dst to NULL I wonder why it is called at all?

Ugh, skb_dst_force apparently changed since I looked at it last time.
I did not expect that it can clear skb->dst. This behaviour was
introduced with:

commit 222d7dbd258dad4cd5241c43ef818141fad5a87a
net: prevent dst uses after free

from Eric Dumazet (put him to Cc).

The easy fix that could be backported to stable would be
to check skb->dst for NULL and drop the packet in that case.

I wonder if we can do better here. We can still use the
dst_entry as long as we don't exit the RCU grace period.
But looking deeper into it, the crypto layer might return
asynchronously. In this case, we exit the RCU grace period
and we have to drop the packet anyway.

If I understand correct, the bug happens rarely. So maybe
we could just stay with the easy fix (I'll do a patch today).

The other thing I wonder about is why Tobias bisected this to

commit b838d5e1c5b6e57b10ec8af2268824041e3ea911
ipv4: mark DST_NOGC and remove the operation of dst_free()

from 'Jun 17 2017' and not to

commit 222d7dbd258dad4cd5241c43ef818141fad5a87a
net: prevent dst uses after free

from 'Sep 21 2017'.

Maybe Tobias has seen two bugs. Before
("net: prevent dst uses after free"), it was the
use after free, and after this fix it was a NULL
pointer derference of skb->dst.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ