Message-ID: <CAJPywTJv=pFK2dFcHRsZPR89DQVbQX8J6OAcSkZk5MkOP43kvQ@mail.gmail.com>
Date: Wed, 27 Nov 2019 18:15:21 +0100
From: Marek Majkowski <marek@...udflare.com>
To: Maciej Żenczykowski <maze@...gle.com>
Cc: Eric Dumazet <edumazet@...gle.com>,
Neal Cardwell <ncardwell@...gle.com>,
network dev <netdev@...r.kernel.org>,
kernel-team <kernel-team@...udflare.com>
Subject: Re: Delayed source port allocation for connected UDP sockets
There may be a valid socket underneath. Consider socket() followed by bind():
udp UNCONN *:* 0.0.0.0:1703 -> master
udp UNCONN *:* 192.0.2.1:1703 -> worker
Then, after connect() is done, the socket moves to ESTAB:
udp UNCONN *:* 0.0.0.0:1703 -> master
udp ESTAB 198.18.0.1:58910 192.0.2.1:1703 -> worker
I want to avoid this race. For a brief moment I have two UNCONN
sockets, and I don't want that: packets from other sources should keep
being routed to the socket bound to the wildcard address. I'm thinking
that IP_BIND_ADDRESS_NO_PORT should basically be a request for delayed
binding. To me it makes sense to delay the actual binding until
connect().
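
A minimal sketch of the window described above, assuming loopback
addresses and kernel-chosen ports instead of 192.0.2.1:1703. The helper
names (udp_socket, try_recv, demonstrate_race) are made up for
illustration. It only shows that a bound-but-unconnected UDP socket
accepts a datagram from any source, while the same socket after
connect() receives only from its peer:

```python
import socket


def udp_socket(timeout=0.5):
    """Create an IPv4 UDP socket with a short receive timeout."""
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.settimeout(timeout)
    return s


def try_recv(sock):
    """Return the next datagram, or None if nothing arrives in time."""
    try:
        return sock.recv(2048)
    except OSError:
        return None


def demonstrate_race():
    worker, peer, stranger = udp_socket(), udp_socket(), udp_socket()
    try:
        worker.bind(("127.0.0.1", 0))
        peer.bind(("127.0.0.1", 0))
        worker_addr = worker.getsockname()

        # Window between bind() and connect(): any source is accepted.
        stranger.sendto(b"stray", worker_addr)
        before = try_recv(worker)

        worker.connect(peer.getsockname())

        # After connect(), only the connected peer matches this socket.
        stranger.sendto(b"stray", worker_addr)
        stray_after = try_recv(worker)
        peer.sendto(b"hello", worker_addr)
        peer_after = try_recv(worker)
        return before, stray_after, peer_after
    finally:
        for s in (worker, peer, stranger):
            s.close()
```

In the master/worker setup from the listing above, the first "stray"
datagram is the one that should have gone to the wildcard socket.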
Marek
On Wed, Nov 27, 2019 at 5:19 PM Maciej Żenczykowski <maze@...gle.com> wrote:
>
> On Wed, Nov 27, 2019 at 8:09 AM Maciej Żenczykowski <maze@...gle.com> wrote:
> >
> > On Wed, Nov 27, 2019 at 6:08 AM Marek Majkowski <marek@...udflare.com> wrote:
> > >
> > > Morning,
> > >
> > > In my applications I need something like a connectx()[1] syscall. On
> > > Linux I can get quite far with using bind-before-connect and
> > > IP_BIND_ADDRESS_NO_PORT. One corner case is missing though.
> > >
> > > For various UDP applications I'm establishing connected sockets from
> > > a specific 2-tuple. This works fine with bind-before-connect, but
> > > in UDP it creates a slight race condition: the socket may
> > > receive packets from an arbitrary source after bind():
> > >
> > > s = socket(AF_INET, SOCK_DGRAM)
> > > s.bind(("192.0.2.1", 1703))
> > > # here be dragons: packets from any peer are accepted
> > > s.connect(("198.18.0.1", 58910))
> > >
> > > For the short time after bind() and before connect(), the
> > > socket may receive packets from any peer. When I don't need to
> > > specify the source port, the IP_BIND_ADDRESS_NO_PORT flag solves
> > > the issue. This code is fine:
> > >
> > > s = socket(AF_INET, SOCK_DGRAM)
> > > s.setsockopt(IPPROTO_IP, IP_BIND_ADDRESS_NO_PORT, 1)
> > > s.bind(("192.0.2.1", 0))
> > > s.connect(("198.18.0.1", 58910))
> > >
> > > But IP_BIND_ADDRESS_NO_PORT doesn't help when a specific source
> > > port is requested. It seems natural to expand the scope of the
> > > IP_BIND_ADDRESS_NO_PORT flag. Perhaps this could be made to work:
> > >
> > > s = socket(AF_INET, SOCK_DGRAM)
> > > s.setsockopt(IPPROTO_IP, IP_BIND_ADDRESS_NO_PORT, 1)
> > > s.bind(("192.0.2.1", 1703))
> > > s.connect(("198.18.0.1", 58910))
> > >
> > > I would like such code to delay the binding to port 1703 until
> > > connect(). IP_BIND_ADDRESS_NO_PORT only makes sense for connected
> > > sockets anyway. This raises a couple of questions though:
> > >
> > > - The IP_BIND_ADDRESS_NO_PORT name is confusing: we do specify the
> > > port number in the bind()!
> > >
> > > - Where to store the source port in __inet_bind()? Neither
> > > inet->inet_sport nor inet->inet_num seems like the correct place to
> > > store the user-passed source port hint. The alternative is to add
> > > yet another field to struct inet_sock, but that is wasteful.
> > >
> > > Suggestions?
> > >
> > > Marek
> > >
> > > [1] https://www.unix.com/man-page/mojave/2/connectx/
> >
> > Attach a drop-all BPF socket filter, then bind, then connect, then replace it.
>
> Although I guess perhaps you'd consider dropping the packets to be bad...?
> Then I think you might be able to do the same trick with
> SO_BINDTODEVICE("dummy0") instead of bpf and then SO_BINDTODEVICE("")
> That unfortunately requires privs though.
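
The drop-all filter suggestion quoted above could be sketched roughly
as follows. This is an illustration under assumptions, not a tested
recipe: the SO_ATTACH_FILTER / SO_DETACH_FILTER values are the common
Linux ones (some architectures differ), the filter is removed after
connect() rather than replaced, and the helper names (attach_drop_all,
detach_filter, guarded_connect) are hypothetical.

```python
import ctypes
import socket
import struct

# Assumption: Linux values from asm-generic/socket.h.
SO_ATTACH_FILTER = 26
SO_DETACH_FILTER = 27


def attach_drop_all(sock):
    """Attach a one-instruction classic BPF program that drops everything."""
    # struct sock_filter {code, jt, jf, k}: BPF_RET | BPF_K (0x06), k=0
    # means "accept zero bytes", i.e. drop the packet.
    insn = struct.pack("HBBI", 0x06, 0, 0, 0)
    buf = ctypes.create_string_buffer(insn, len(insn))
    # struct sock_fprog {unsigned short len; struct sock_filter *filter}
    fprog = struct.pack("HP", 1, ctypes.addressof(buf))
    sock.setsockopt(socket.SOL_SOCKET, SO_ATTACH_FILTER, fprog)


def detach_filter(sock):
    """Remove the currently attached socket filter."""
    sock.setsockopt(socket.SOL_SOCKET, SO_DETACH_FILTER, 0)


def guarded_connect(local_addr, remote_addr):
    """socket() -> drop-all filter -> bind() -> connect() -> remove filter."""
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        attach_drop_all(s)
        s.bind(local_addr)
        s.connect(remote_addr)
        detach_filter(s)
    except OSError:
        s.close()
        raise
    return s
```

Note that datagrams arriving between bind() and connect() are dropped
here, not delivered to a wildcard socket underneath, which is the
drawback raised in the follow-up.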