Message-ID: <72dfecd426d183615c0dd4c2e68690b0e95dd739.camel@infradead.org>
Date:   Mon, 28 Jun 2021 12:23:21 +0100
From:   David Woodhouse <dwmw2@...radead.org>
To:     Jason Wang <jasowang@...hat.com>, netdev@...r.kernel.org
Cc:     Eugenio Pérez <eperezma@...hat.com>,
        Willem de Bruijn <willemb@...gle.com>
Subject: Re: [PATCH v3 3/5] vhost_net: remove virtio_net_hdr validation, let
 tun/tap do it themselves

On Mon, 2021-06-28 at 12:23 +0800, Jason Wang wrote:
> On 2021/6/25 at 4:37 PM, David Woodhouse wrote:
> > On Fri, 2021-06-25 at 15:33 +0800, Jason Wang wrote:
> > > On 2021/6/24 at 8:30 PM, David Woodhouse wrote:
> > > > From: David Woodhouse<dwmw@...zon.co.uk>
> > > > 
> > > > When the underlying socket isn't configured with a virtio_net_hdr, the
> > > > existing code in vhost_net_build_xdp() would attempt to validate
> > > > uninitialised data, by copying zero bytes (sock_hlen) into the local
> > > > copy of the header and then trying to validate that.
> > > > 
> > > > Fixing it is somewhat non-trivial because the tun device might put a
> > > > struct tun_pi *before* the virtio_net_hdr, which makes it hard to find.
> > > > So just stop messing with someone else's data in vhost_net_build_xdp(),
> > > > and let tap and tun validate it for themselves, as they do in the
> > > > non-XDP case anyway.
> > > 
> > > Thinking about it another way: all the XDP support in vhost was
> > > prepared for TAP. XDP is not expected to work for TUN.
> > > 
> > > So we can simply have vhost not take the XDP path if the underlying
> > > socket is TUN.
> > 
> > Actually, IFF_TUN mode per se isn't that complex. It's fixed purely on
> > the tun side by that first patch I posted, which I later expanded a
> > little to factor out tun_skb_set_protocol().
> > 
> > The next two patches in my original set were fixing up the fact that
> > XDP currently assumes that the *socket* will be doing the vhdr, not
> > vhost. Those two weren't tun-specific at all.
> > 
> > It's supporting the PI header (which tun puts *before* the virtio
> > header as I just said) which introduces a tiny bit more complexity.
> 
> 
> This reminds me we need to fix tun_put_user_xdp(),

Good point; thanks.

> but as we've discussed, we first need to figure out whether PI is worth
> supporting for vhost-net.

FWIW I certainly don't care about PI support. The only time anyone
would want PI support is if they need to support protocols *other* than
IPv6 and Legacy IP, over tun mode.
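
As a rough illustration of why PI is redundant for plain IP traffic: in
tun mode the protocol can be recovered from the version nibble in the
first byte of the packet. The sketch below is userspace C, not the kernel
code; guess_l3_proto() is a hypothetical name, and the EtherType constants
are redefined locally so the example stands alone.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* EtherType values, matching those in <linux/if_ether.h>. */
#define ETH_P_IP   0x0800
#define ETH_P_IPV6 0x86DD

/*
 * Hypothetical helper: infer the L3 protocol of a raw tun-mode packet
 * that carries no PI header. Returns the EtherType in host byte order,
 * or 0 if the packet is empty or is neither IPv4 nor IPv6.
 */
static uint16_t guess_l3_proto(const uint8_t *pkt, size_t len)
{
    assert(pkt != NULL || len == 0);

    if (len == 0)
        return 0;

    /* The IP version lives in the top four bits of the first byte. */
    switch (pkt[0] >> 4) {
    case 4:
        return ETH_P_IP;
    case 6:
        return ETH_P_IPV6;
    default:
        return 0; /* anything else would need a PI header to identify */
    }
}
```

Anything that falls into the default case is exactly the traffic for
which PI would still be needed.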

I'm fixing this stuff because when I tried to use vhost-net + tun for
*sensible* use cases, I ended up having to flounder around trying to
find a combination of settings that actually worked. And that offended
me :)

So I wrote a test case to iterate over various possible combinations of
settings, and then kept typing until that all worked.

The only thing I do feel quite strongly about is that stuff should
either *work*, or *explicitly* fail if it's unsupported.

At this point, although I have no actual use for it myself, I'd
probably just about come down on the side of supporting PI. On the
basis that:

 • I've basically made it work already.

 • It allows those code paths like tun_skb_set_protocol() to be
   consolidated as both calling code paths need the same thing.

 • Even in the kernel, and even when modules are as incestuously
   intertwined as vhost-net and tun already are, I'm a strong
   believer in *not* making assumptions about someone else's data,
   so letting *tun* handle its own headers without making those
   assumptions seems like the right thing to do.



If we want to support PI, I need to go fix tun_put_user_xdp() as you
noted (and work out how to add that to the test case). And resolve the
fact that configuration might change after tun_get_socket() is called —
and indeed that there might not *be* a configuration at all when
tun_get_socket() is called.


If we *don't* want to support PI, well, the *interesting* part of the
above needs fixing anyway. Because I strongly believe we should
*prevent* it if we don't support it, and we *still* have the case you
point out of the tun vhdr_size being changed at runtime.

I'll take a look at whether we can pass the socklen back from tun to
vhost-net on *every* packet. Is there a MSG_XXX flag we can abuse and
somewhere in the msghdr that could return the header length used for
*this* packet? Or could we make vhost_net_rx_peek_head_len() call
explicitly into the tun device instead of making stuff up in
peek_head_len()? 
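
To make the per-packet header length concrete, here is a minimal
userspace sketch of how the offset to the payload depends on the tun
settings in force when the packet is produced. The struct and function
names (pi_hdr, tun_layout, payload_offset) are hypothetical; pi_hdr only
mirrors the 4-byte layout of struct tun_pi, and vnet_hdr_sz stands in for
whatever TUNSETVNETHDRSZ last configured.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Local mirror of struct tun_pi from <linux/if_tun.h>: 4 bytes. */
struct pi_hdr {
    uint16_t flags;
    uint16_t proto; /* big-endian EtherType */
};

/* Hypothetical snapshot of the settings that affect the frame layout. */
struct tun_layout {
    bool   has_pi;       /* IFF_NO_PI is not set */
    bool   has_vnet_hdr; /* IFF_VNET_HDR is set */
    size_t vnet_hdr_sz;  /* size configured via TUNSETVNETHDRSZ */
};

/*
 * Bytes that precede the packet payload. The PI header, when present,
 * comes first; the virtio_net_hdr follows it.
 */
static size_t payload_offset(const struct tun_layout *l)
{
    size_t off = 0;

    assert(l != NULL);

    if (l->has_pi)
        off += sizeof(struct pi_hdr);
    if (l->has_vnet_hdr)
        off += l->vnet_hdr_sz;
    return off;
}
```

Because every field in the snapshot can change at runtime, a value
computed once when the socket is obtained can go stale, which is the
reason for wanting the length reported with each packet.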


To be clear: from the point of view of my *application* I don't care
about any of this; my only motivation here is to clean up the kernel
behaviour and make life easier for potential future users. I have found
a setup that works in today's kernels (even though I have to disable
XDP, and have to use a virtio header that I don't want), and will stick
with that for now, if I actually commit it to my master branch at all:
https://gitlab.com/openconnect/openconnect/-/commit/0da4fe43b886403e6

I might yet abandon it because I haven't *yet* seen it go any faster
than the code which just does read()/write() on the tun device from
userspace. And without XDP or zerocopy it's not clear that it could
ever give me any benefit that I couldn't achieve purely in userspace by
having a separate thread to do tun device I/O. But we'll see...
