lists.openwall.net | lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC | |
Open Source and information security mailing list archives
| ||
|
Date: Sun, 17 Dec 2017 17:33:36 -0500 From: Willem de Bruijn <willemdebruijn.kernel@...il.com> To: Andreas Hartmann <andihartmann@...19freenet.de> Cc: Michal Kubecek <mkubecek@...e.cz>, Jason Wang <jasowang@...hat.com>, David Miller <davem@...emloft.net>, Network Development <netdev@...r.kernel.org> Subject: Re: Linux 4.14 - regression: broken tun/tap / bridge network with virtio - bisected On Fri, Dec 15, 2017 at 1:05 AM, Andreas Hartmann <andihartmann@...19freenet.de> wrote: > On 12/14/2017 at 11:17 PM Willem de Bruijn wrote: >>>> Well, the patch does not fix hanging VMs, which have been shutdown and >>>> can't be killed any more. >>>> Because of the stack trace >>>> >>>> [<ffffffffc0d0e3c5>] vhost_net_ubuf_put_and_wait+0x35/0x60 [vhost_net] >>>> [<ffffffffc0d0f264>] vhost_net_ioctl+0x304/0x870 [vhost_net] >>>> [<ffffffff9b25460f>] do_vfs_ioctl+0x8f/0x5c0 >>>> [<ffffffff9b254bb4>] SyS_ioctl+0x74/0x80 >>>> [<ffffffff9b00365b>] do_syscall_64+0x5b/0x100 >>>> [<ffffffff9b78e7ab>] entry_SYSCALL64_slow_path+0x25/0x25 >>>> [<ffffffffffffffff>] 0xffffffffffffffff >>>> >>>> I was hoping, that the problems could be related - but that seems not to >>>> be true. >>> >>> However, it turned out, that reverting the complete patchset "Remove UDP >>> Fragmentation Offload support" prevent hanging qemu processes. >> >> That implies a combination of UFO and vhost zerocopy. Disabling >> experimental_zcopytx in vhost_net will probably work around the bug >> then. I have been able to reproduce the hang by sending a UFO packet between two guests running v4.13 on a host running v4.15-rc1. The vhost_net_ubuf_ref refcount indeed hits overflow (-1) from vhost_zerocopy_callback being called for each segment of a segmented UFO skb. This refcount is decremented then on each segment, but incremented only once for the entire UFO skb. Before v4.14, these packets would be converted in skb_segment to regular copy packets with skb_orphan_frags and the callback function called once at this point. v4.14 added support for reference counted zerocopy skb that can pass through skb_orphan_frags unmodified and have their zerocopy state safely cloned with skb_zerocopy_clone. The call to skb_zerocopy_clone must come after skb_orphan_frags to limit cloning of this state to those skbs that can do so safely. Please try a host with the following patch. This fixes it for me. I intend to send it to net. diff --git a/net/core/skbuff.c b/net/core/skbuff.c index a592ca025fc4..d2d985418819 100644 --- a/net/core/skbuff.c +++ b/net/core/skbuff.c @@ -3654,8 +3654,6 @@ struct sk_buff *skb_segment(struct sk_buff *head_skb, skb_shinfo(nskb)->tx_flags |= skb_shinfo(head_skb)->tx_flags & SKBTX_SHARED_FRAG; - if (skb_zerocopy_clone(nskb, head_skb, GFP_ATOMIC)) - goto err; while (pos < offset + len) { if (i >= nfrags) { @@ -3681,6 +3679,8 @@ struct sk_buff *skb_segment(struct sk_buff *head_skb, if (unlikely(skb_orphan_frags(frag_skb, GFP_ATOMIC))) goto err; + if (skb_zerocopy_clone(nskb, frag_skb, GFP_ATOMIC)) + goto err; *nskb_frag = *frag; __skb_frag_ref(nskb_frag); This is relatively inefficient, as it calls skb_zerocopy_clone for each frag in the frags[] array. I will follow-up with a patch to net-next that only checks once per skb: diff --git a/net/core/skbuff.c b/net/core/skbuff.c index 466581cf4cdc..a293a33604ec 100644 --- a/net/core/skbuff.c +++ b/net/core/skbuff.c @@ -3662,7 +3662,8 @@ struct sk_buff *skb_segment(struct sk_buff *head_skb, skb_shinfo(nskb)->tx_flags |= skb_shinfo(head_skb)->tx_flags & SKBTX_SHARED_FRAG; - if (skb_zerocopy_clone(nskb, head_skb, GFP_ATOMIC)) + if (skb_orphan_frags(frag_skb, GFP_ATOMIC) || + skb_zerocopy_clone(nskb, frag_skb, GFP_ATOMIC)) goto err; while (pos < offset + len) { @@ -3676,6 +3677,11 @@ struct sk_buff *skb_segment(struct sk_buff *head_skb, BUG_ON(!nfrags); + if (skb_orphan_frags(frag_skb, GFP_ATOMIC) || + skb_zerocopy_clone(nskb, frag_skb, + GFP_ATOMIC)) + goto err; + list_skb = list_skb->next; } @@ -3687,9 +3693,6 @@ struct sk_buff *skb_segment(struct sk_buff *head_skb, goto err; } - if (unlikely(skb_orphan_frags(frag_skb, GFP_ATOMIC))) - goto err; - I'll also send to net-next (1) a patch to convert its vhost_net_ ubuf_ref refcnt to refcount_t (2) a path to skb_zerocopy_clone to warn on clone if not sock_zerocopy_callback > I already tested it w/ options vhost_net experimental_zcopytx=0 - but > this didn't "resolve" anything. See > https://www.mail-archive.com/netdev@vger.kernel.org/msg203197.html > > Therefore, I think your following thoughts are lapsed unfortunately, > aren't they? That experiment was perhaps run before commit 0c19f846d582 ("net: accept UFO datagrams from tuntap and packet") and hit the other UFO bug.
Powered by blists - more mailing lists