Message-ID: <87mw2jg25a.fsf@x220.int.ebiederm.org>
Date:	Tue, 07 Apr 2015 14:38:09 -0500
From:	ebiederm@...ssion.com (Eric W. Biederman)
To:	Vivek Venkatraman <vivek@...ulusnetworks.com>
Cc:	roopa <roopa@...ulusnetworks.com>,
	Andy Gospodarek <gospo@...ulusnetworks.com>,
	Stephen Hemminger <shemming@...cade.com>,
	"netdev\@vger.kernel.org" <netdev@...r.kernel.org>,
	Robert Shearman <rshearma@...cade.com>
Subject: Re: [PATCH net-next 6/8] iproute2: Add support for the RTA_VIA attribute

Vivek Venkatraman <vivek@...ulusnetworks.com> writes:

> At the edge, when doing IPoMPLS, we'll be imposing a set of labels on
> top of the packet rather than replacing, but the same semantics can be
> applied because the destination address is now different and becomes a
> label stack.

Exactly how this will happen is an open question.  The hard part is that
we need something lightweight enough to scale to 1 million routes, aka a
full routing table.

Network devices consume much too much memory to contemplate having a
different network device for each of 1 million different routes.

The transform infrastructure (xfrm) that is used for ipsec looks
attractive for imposing tunnels but it is clumsy, and does not map well
to the kinds of tunnels IPoMPLS traffic needs.

Having something in the ipv4 and ipv6 fib entry, say a pointer or a 32-bit
key that refers to a struct mpls_route to impose, looks like what we want
in the abstract.  What the userspace interface for that implementation
should be is something that I do not see clearly.  Ideally we build a
userspace interface that works not only for MPLS but also for other tunnel
types like IPIP, GRE, etc.  This would allow not only MPLS tunnels but
other tunnel types to be supported up to the full routing table size.
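
To make the idea concrete, here is a minimal userspace sketch of routes
that carry only a small key into a shared table of encapsulation records.
Everything in it (Encap, EncapTable, route_add, the dictionary used as a
fib) is hypothetical and exists only for illustration; it is not kernel
code, and sharing one record between routes is an assumption of the
sketch rather than something settled above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Encap:
    """Hypothetical shared encapsulation record (not a kernel structure)."""
    tunnel_type: str   # e.g. "mpls", "gre", "ipip"
    header: tuple      # opaque header data, e.g. an MPLS label stack


class EncapTable:
    """Stores each distinct Encap once and hands out small integer keys."""

    def __init__(self):
        self._by_encap = {}   # Encap -> key
        self._by_key = []     # key -> Encap

    def intern(self, encap):
        key = self._by_encap.get(encap)
        if key is None:
            key = len(self._by_key)
            self._by_encap[encap] = key
            self._by_key.append(encap)
        return key

    def lookup(self, key):
        return self._by_key[key]


encaps = EncapTable()
fib = {}  # prefix -> (nexthop, encap key); stands in for a fib entry


def route_add(prefix, nexthop, tunnel_type, header):
    """Add a route that refers to its encapsulation by key only."""
    key = encaps.intern(Encap(tunnel_type, tuple(header)))
    fib[prefix] = (nexthop, key)


route_add("147.1.1.0/24", "192.168.1.1", "mpls", [400, 2230])
route_add("147.1.2.0/24", "192.168.1.1", "mpls", [400, 2230])
route_add("10.0.0.0/8", "192.168.2.1", "gre", [42])
```

The two MPLS routes end up with the same key, so the per-route cost is one
integer no matter how large the tunnel header is.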

Perhaps a new attribute RTA_ENCAP that encodes a structure with
a tunnel type and enough information to encode the tunnel header.
I would have to make a survey of the existing tunnel types to see
if there is enough of a pattern that an option that works for multiple
protocols could actually be achieved.
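
As a rough illustration of what such an attribute payload might carry,
the sketch below packs a tunnel type plus an opaque header.  The layout
(16-bit type, 16-bit length, header bytes, network byte order), the type
codes, and the function names are all assumptions made up for this
example; no such format has been defined.  The only part taken from a
real specification is the MPLS label stack entry layout from RFC 3032
(20-bit label, 3-bit traffic class, bottom-of-stack bit, 8-bit TTL).

```python
import struct

# Hypothetical tunnel type codes, chosen only for this sketch.
ENCAP_MPLS = 1
ENCAP_GRE = 2


def mpls_header(labels, ttl=64):
    """Encode labels (outermost first) as RFC 3032 label stack entries."""
    out = b""
    for i, label in enumerate(labels):
        bottom = 1 if i == len(labels) - 1 else 0  # S bit on the last entry
        out += struct.pack("!I", (label << 12) | (bottom << 8) | ttl)
    return out


def encap_pack(tunnel_type, header):
    """Pack as: 16-bit tunnel type, 16-bit header length, header bytes."""
    return struct.pack("!HH", tunnel_type, len(header)) + header


def encap_unpack(blob):
    """Inverse of encap_pack: return (tunnel_type, header bytes)."""
    tunnel_type, length = struct.unpack_from("!HH", blob)
    return tunnel_type, blob[4:4 + length]


blob = encap_pack(ENCAP_MPLS, mpls_header([400, 2230]))
```

Because the header is opaque to the packing code, the same container
would hold a GRE or IPIP header just as well, which is the property that
a survey of the existing tunnel types would have to confirm.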

Using a tunnel that is not a network device, and as such does not need
to keep packet counters, looks like it will scale much better than our
other options, even with the best memory usage simplifications I can
imagine for network devices.  Maintaining per-cpu counters (which are
necessary for performance) requires a non-trivial amount of memory, and
as such network devices are much harder to scale.
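
A back-of-the-envelope calculation shows the scale of the counter cost
alone.  The numbers are assumptions picked for illustration: four 64-bit
counters per device (rx/tx packets and bytes) and a 16-cpu machine.  The
rest of the per-device state is ignored, so this is a lower bound on what
a device per route would cost.

```python
def percpu_counter_bytes(routes, cpus, counters=4, counter_size=8):
    """Memory needed if every route had its own set of per-cpu counters."""
    return routes * cpus * counters * counter_size


# 1 million routes on an assumed 16-cpu machine.
total = percpu_counter_bytes(routes=1_000_000, cpus=16)
# 512,000,000 bytes under these assumptions
```

That is roughly half a gigabyte spent on counters before a single
packet is forwarded.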

> One thing to note is that the destination address replaced/imposed
> could change based on the path selected, when there is ECMP. So, I
> propose that the iproute2 syntax of "as [to]" be reconsidered for
> MPLS, otherwise we'll end up with something like the following when
> this is extended to setup IPoMPLS direct forwarding with ECMP:
>
> ip route add 147.1.1.0/24 nexthop as to 400/2230 via inet 192.168.1.1
> dev eth0 nexthop as to 600/2400 via inet 192.168.2.1 dev eth1

That does not work, because the semantics of the RTA_NEWDST attribute
require the new address to be in the same address family as the old
address.  So it is useful for NATing IPv4 or IPv6 with routes (if you are
so inclined) but it is not useful for imposing an mpls header.

> Instead, if we use the specifier "label", we'll get:
>
> ip route add 147.1.1.0/24 nexthop via inet 192.168.1.1 dev eth0 label
> 400/2230 nexthop via inet 192.168.2.1 dev eth1 label 600/2400
>
> The transit case (label swapping) would look like:
>
> ip -f mpls route add 400 via inet 192.168.1.10 dev eth0 label 500
>
> The syntax can then be better extended to specify a label operation
> such as "pop" which would be needed when performing ultimate hop pop
> (UHP) and then lookup/forward based on underlying label stack or IP
> header.

Pop is the case where the RTA_NEWDST attribute is empty (or
unspecified).

From an mpls perspective the RTA_DST label is always popped (if it
matches) and the RTA_NEWDST label stack is always pushed.
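
The pop-then-push rule can be sketched in a few lines.  The function name
and the list representation of a label stack (outermost label first) are
made up for this example; only the rule itself comes from the description
above.

```python
def apply_mpls_route(stack, rta_dst, rta_newdst=()):
    """Pop the matching top label, then push the RTA_NEWDST label stack.

    stack is listed outermost label first.  Returns the new stack, or
    None if the top label does not match rta_dst.
    """
    if not stack or stack[0] != rta_dst:
        return None
    return list(rta_newdst) + list(stack[1:])


swap = apply_mpls_route([400, 2230], 400, [500])   # [500, 2230]
pop = apply_mpls_route([400, 2230], 400)           # [2230]
push = apply_mpls_route([400], 400, [600, 2400])   # [600, 2400]
```

Swap, pop, and push of additional labels all fall out of the same rule,
so no separate "pop" operation is needed in the route.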

> A new application besides MPLS that needs to modify the destination
> address would use its own keyword but encode using the RTA_NEWDST
> attribute.

Eric