netdev - Re: [PATCH net-next] vxlan: distribute vxlan tunneled traffic across multiple TXQs


Date:	Wed, 01 Jan 2014 21:56:46 -0800
From:	Eric Dumazet <eric.dumazet@...il.com>
To:	David Miller <davem@...emloft.net>
Cc:	stephen@...workplumber.org, sathya.perla@...lex.com,
	netdev@...r.kernel.org, edumazet@...gle.com
Subject: Re: [PATCH net-next] vxlan: distribute vxlan tunneled traffic
 across multiple TXQs

On Tue, 2013-12-31 at 13:56 -0500, David Miller wrote:
> From: Eric Dumazet <eric.dumazet@...il.com>
> Date: Tue, 24 Dec 2013 10:39:06 -0800
> 
> > On Mon, 2013-12-23 at 11:28 -0800, Stephen Hemminger wrote:
> > 
> >> The idea is good, but without the destructor there is nothing to keep
> >> the UDP socket from being destroyed while packet is being sent on another
> >> CPU.
> > 
> > I see no requirement of holding a reference on the vxlan UDP socket in
> > transmit path.
> 
> I'm trying to figure out how leaving a dangling socket attached to
> skb->sk, as in this original patch, can be OK.

Sorry, I lost track here (vacation time...). Which patch are you
referring to?

> 
> If skb->sk is there, anyone can reference it, and meanwhile anyone can
> destroy and free it.
> 
> That's Stephens' objection.

Not really. Stephen told us he copied code from L2TP.
(But IPIP and GRE tunnels do not do that...)

He thought it was necessary to hold a reference on the vxlan socket, but
there is no valid reason to do so. RCU locking is more than enough to
build the encapsulation.

> 
> Are you saying that we have something that allows this to be valid?

Once we pass the tunnel, we can either:

1) Leave skb->sk set to the original socket sk1 (say, a TCP or UDP
producer)

2) Assign skb->sk to the 'socket' used in vxlan (after orphaning the skb
and releasing its reference on socket sk1)

  Current vxlan code chose 2), but it makes no sense because:

We use skb->sk for:

A) controlling the amount of bytes/packets queued on behalf of a socket,
but current vxlan code does the skb->sk transfer without any
limit/control against the vxlan socket's sk_sndbuf.

B) security purposes (such as selinux) or netfilter uses, and I do not
think anything is prepared to handle the stacked vxlan case in this area.

If we choose 1), it makes more sense, because each producer will be
effectively limited by its own sk->sk_sndbuf limit.

And a socket cannot be destroyed anyway as long as at least one skb is
in flight (with skb->sk set to this socket)
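
To make the accounting and lifetime argument concrete, here is a small
userspace toy model of option 1). It is only a sketch: toy_sock, toy_skb
and all toy_* functions are hypothetical stand-ins, not the kernel
structures or APIs. They are loosely modelled on what skb_set_owner_w()
and sock_wfree() do with sk_wmem_alloc, and the sndbuf check is folded
into the charge function for brevity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins, NOT the kernel's struct sock / struct sk_buff. */
struct toy_sock {
	int wmem_alloc;	/* bytes charged to this socket (cf. sk_wmem_alloc) */
	int sndbuf;	/* send buffer limit (cf. sk_sndbuf) */
	bool closed;	/* owner has closed the socket */
	bool freed;	/* socket memory would be released at this point */
};

struct toy_skb {
	struct toy_sock *sk;	/* owning socket (cf. skb->sk) */
	int truesize;		/* bytes accounted for this packet */
};

/* Free the socket only once it is closed AND nothing is in flight. */
static void toy_maybe_free(struct toy_sock *sk)
{
	if (sk->closed && sk->wmem_alloc == 0)
		sk->freed = true;
}

/* Charge skb to sk; refuse if the producer would exceed its limit. */
bool toy_set_owner_w(struct toy_skb *skb, struct toy_sock *sk)
{
	if (sk->wmem_alloc + skb->truesize > sk->sndbuf)
		return false;
	skb->sk = sk;
	sk->wmem_alloc += skb->truesize;
	return true;
}

/* Destructor run when the packet is finally freed by the driver. */
void toy_wfree(struct toy_skb *skb)
{
	struct toy_sock *sk = skb->sk;

	assert(sk != NULL);
	sk->wmem_alloc -= skb->truesize;
	skb->sk = NULL;
	toy_maybe_free(sk);
}

void toy_close(struct toy_sock *sk)
{
	sk->closed = true;
	toy_maybe_free(sk);
}

/* Option 1): the tunnel encapsulates but leaves skb->sk untouched. */
void toy_tunnel_xmit(struct toy_skb *skb)
{
	(void)skb;
}

/* Returns 0 if the model behaves as described above. */
int toy_demo(void)
{
	struct toy_sock sk1 = { .sndbuf = 4096 };
	struct toy_skb a = { .truesize = 3000 };
	struct toy_skb b = { .truesize = 3000 };

	if (!toy_set_owner_w(&a, &sk1))
		return 1;
	if (toy_set_owner_w(&b, &sk1))	/* 6000 > 4096: producer throttled */
		return 2;
	toy_tunnel_xmit(&a);
	if (a.sk != &sk1)		/* still charged to the producer */
		return 3;
	toy_close(&sk1);
	if (sk1.freed)			/* skb a is in flight: sk1 must survive */
		return 4;
	toy_wfree(&a);
	return sk1.freed ? 0 : 5;	/* last skb gone, socket can go */
}
```

In this model the producer socket sk1 is the only thing that gets
charged, and it stays alive until its last in-flight skb is freed, so the
tunnel needs no socket reference of its own.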

Really, I do not think vxlan should behave in a different way than other
tunnels (GRE, IPIP, SIT, ...), which chose 1) 
