lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <efccb718a50a91e31807ac93fee58852ef334434.camel@intel.com>
Date:   Thu, 13 Feb 2020 11:13:55 +0000
From:   "Boeuf, Sebastien" <sebastien.boeuf@...el.com>
To:     "sgarzare@...hat.com" <sgarzare@...hat.com>
CC:     "stefanha@...hat.com" <stefanha@...hat.com>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        "davem@...emloft.net" <davem@...emloft.net>
Subject: Re: [PATCH] net: virtio_vsock: Fix race condition between bind and
 listen

On Thu, 2020-02-13 at 12:02 +0100, Stefano Garzarella wrote:
> On Thu, Feb 13, 2020 at 10:44:18AM +0000, Boeuf, Sebastien wrote:
> > On Thu, 2020-02-13 at 11:22 +0100, Stefano Garzarella wrote:
> > > On Thu, Feb 13, 2020 at 09:51:36AM +0000, Boeuf, Sebastien wrote:
> > > > Hi Stefano,
> > > > 
> > > > On Thu, 2020-02-13 at 10:41 +0100, Stefano Garzarella wrote:
> > > > > Hi Sebastien,
> > > > > 
> > > > > On Thu, Feb 13, 2020 at 09:16:11AM +0000, Boeuf, Sebastien
> > > > > wrote:
> > > > > > From 2f1276d02f5a12d85aec5adc11dfe1eab7e160d6 Mon Sep 17
> > > > > > 00:00:00
> > > > > > 2001
> > > > > > From: Sebastien Boeuf <sebastien.boeuf@...el.com>
> > > > > > Date: Thu, 13 Feb 2020 08:50:38 +0100
> > > > > > Subject: [PATCH] net: virtio_vsock: Fix race condition
> > > > > > between
> > > > > > bind
> > > > > > and listen
> > > > > > 
> > > > > > Whenever the vsock backend on the host sends a packet
> > > > > > through
> > > > > > the
> > > > > > RX
> > > > > > queue, it expects an answer on the TX queue. Unfortunately,
> > > > > > there
> > > > > > is one
> > > > > > case where the host side will hang waiting for the answer
> > > > > > and
> > > > > > will
> > > > > > effectively never recover.
> > > > > 
> > > > > Do you have a test case?
> > > > 
> > > > Yes I do. This has been a bug we've been investigating on Kata
> > > > Containers for quite some time now. This was happening when
> > > > using
> > > > Kata
> > > > along with Cloud-Hypervisor (which rely on the hybrid vsock
> > > > implementation from Firecracker). The thing is, this bug is
> > > > very
> > > > hard
> > > > to reproduce and was happening for Kata because of the
> > > > connection
> > > > strategy. The kata-runtime tries to connect a million times
> > > > after
> > > > it
> > > > started the VM, just hoping the kata-agent will start to listen
> > > > from
> > > > the guest side at some point.
> > > 
> > > Maybe is related to something else. I tried the following which
> > > should be
> > > your case simplified (IIUC):
> > > 
> > > guest$ python
> > >     import socket
> > >     s = socket.socket(socket.AF_VSOCK, socket.SOCK_STREAM)
> > >     s.bind((socket.VMADDR_CID_ANY, 1234))
> > > 
> > > host$ python
> > >     import socket
> > >     s = socket.socket(socket.AF_VSOCK, socket.SOCK_STREAM)
> > >     s.connect((3, 1234))
> > > 
> > > Traceback (most recent call last):
> > >   File "<stdin>", line 1, in <module>
> > > TimeoutError: [Errno 110] Connection timed out
> > 
> > Yes this is exactly the simplified case. But that's the point, I
> > don't
> > think the timeout is the best way to go here. Because this means
> > that
> > when we run into this case, the host side will wait for quite some
> > time
> > before retrying, which can cause a very long delay before the
> > communication with the guest is established. By simply answering
> > the
> > host with a RST packet, we inform it that nobody's listening on the
> > guest side yet, therefore the host side will close and try again.
> 
> Yes, make sense.
> 
> I just wanted to point out that the host shouldn't get stuck if the
> guest doesn't respond. So it's weird that this happens and it might
> be
> related to some other problem.

Yes that's because we're using the hybrid vsock implementation from
Firecracker. So basically they proxy a UNIX socket connection into a
VSOCK socket. And the timeout is not implemented. I'll point them out
that's something it'd be nice to have along with this fix in the guest.

> 
> > > > > In the host, the af_vsock.c:vsock_stream_connect() set a
> > > > > timeout,
> > > > > so
> > > > > if
> > > > > the host try to connect before the guest starts listening,
> > > > > the
> > > > > connect()
> > > > > should return ETIMEDOUT if the guest does not answer
> > > > > anything.
> > > > > 
> > > > > Anyway, maybe the patch make sense anyway, changing a bit the
> > > > > description
> > > > > (if the host connect() receive the ETIMEDOUT).
> > > > > I'm just concerned that this code is common between guest and
> > > > > host.
> > > > > If a
> > > > > malicious guest starts sending us wrong requests, we spend
> > > > > time
> > > > > sending
> > > > > a reset packet. But we already do that if we can't find the
> > > > > bound
> > > > > socket,
> > > > > so it might make sense.
> > > > 
> > > > Yes I don't think this is gonna cause more trouble, but at
> > > > least we
> > > > cannot end up in this weird situation I described.
> > > 
> > > Okay, but in the host, we can't trust the guest to answer, we
> > > should
> > > handle this case properly.
> > 
> > Well I cannot agree more with the "we cannot trust the guest"
> > philosophy, but in this case the worst thing that can happen is the
> > guest shooting himself in the foot because it would simply prevent
> > the
> > connection from happening.
> > 
> > And I agree setting up a timeout from the host side is still a good
> > idea for preventing from such DOS attack.
> > 
> > But as I mentioned above, in the normal use case, this allows for
> > better responsiveness when it comes to establish the connection as
> > fast
> > as possible.
> 
> Sure, maybe you can rewrite a bit the commit (title and body) to
> explain
> this.

Of course, let me work on this.
How am I supposed to resend the patch? Should I send a new email with
v2 in the title?

> 
> > > > I was just not sure if the function we should use to do the
> > > > reset
> > > > should be virtio_transport_reset_no_sock() or
> > > > virtio_transport_reset()
> > > > since at this point the socket is already bound.
> > > 
> > > I think you can safely use virtio_transport_reset() in this case.
> > 
> > I've just tried it and unfortunately it doesn't work. I think
> > that's
> > because the connection has not been properly established yet, so we
> > cannot consider being in this case.
> > Using virtio_transport_reset_no_sock() seems like the less
> > intrusive
> > function here.
> 
> Oh sorry, I also put a comment on virtio_transport_reset() to say to
> use it
> only on connected sockets and not listeners.
> In this case it's a listener, sorry for the wrong suggestion.

Hehe no worries, at least we're on the same page :)

Thanks,
Sebastien

> 
> Thanks,
> Stefano
> 
---------------------------------------------------------------------
Intel Corporation SAS (French simplified joint stock company)
Registered headquarters: "Les Montalets"- 2, rue de Paris, 
92196 Meudon Cedex, France
Registration Number:  302 456 199 R.C.S. NANTERRE
Capital: 4,572,000 Euros

This e-mail and any attachments may contain confidential material for
the sole use of the intended recipient(s). Any review or distribution
by others is strictly prohibited. If you are not the intended
recipient, please contact the sender and delete all copies.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ