Message-ID: <87lhbws6pz.fsf@x220.int.ebiederm.org>
Date:	Thu, 24 Sep 2015 00:54:00 -0500
From:	ebiederm@...ssion.com (Eric W. Biederman)
To:	Thomas Graf <tgraf@...g.ch>
Cc:	Jiri Benc <jbenc@...hat.com>, netdev@...r.kernel.org,
	Roopa Prabhu <roopa@...ulusnetworks.com>
Subject: Re: [PATCH net 0/2] lwtunnel: make it really work, for IPv4

Thomas Graf <tgraf@...g.ch> writes:

> On 09/23/15 at 04:09pm, Eric W. Biederman wrote:
>
> [...]
>
>> *Blink* You were targeting net.git with a feature enhancement????
>> I will just ignore that.
>
> The point of this series is to not expose the src and dst port Netlink
> bits to user space in a released kernel because the ABI is not set in
> stone yet. Hence targeting net.
>
> If patch 1 is regarded unacceptable we should at least pull in patch 2
> to not expose these bits until this has been worked out to leave the
> option proposed here on the table.

My only interest in this is to help figure out how to make IPv6 ndisc
work over lightweight tunnels.

>> What I was observing is that in general the only tunneled packets that
>> need an ingress metadata dst for a tunneled medium ethernet like medium
>> are arp and ndisc packets.  In other cases if you aren't doing something
>> exceptional like openvswitch the normal routing should be sufficient.
>> 
>> Which means a ndo_reply_dst method could remove the need in many cases
>> for an ingress metadata dst to need to be allocated.
>
> The tunnel RX metadata collected is used to associate packets matching
> a particular tunnel id with the appropriate virtual networks by forwarding
> them to a separate netns, separate VRF device or a separate bridge.
>
> More sophisticated hypervisors may run multiple tunnel endpoints on
> the same host using different host addresses and differentiate packets
> based on the underlay destination IP as well.

Fair enough.  In at least some of those situations the dst metadata
will be needed on every packet.  I think the extra allocation per packet
for the metadata dst is unfortunate, but I won't say it is wrong.

>> Regardless a netdevice operation that digs into the packet and figures
>> out what is necessary for a reply seems like the clean way to make this
>> work for both arp and neighbour discovery.
>
> I'm not disagreeing entirely although I disagree that you can do the
> NDO without looking at the original metadata dst. Even a full fib
> lookup based on the requested IP in the ARP header is somewhat error
> prone. I fully agree though that once we support additional types
> besides IP tunneling then such an NDO might in fact make sense.

We can't use the metadata dst for IPv6 neighbour discovery.  Neighbour
discovery processing comes after ip6_route_input.  That is what makes
such a network device operation interesting today.

We don't need the information in the metadata dst, because everything
that was in the metadata dst is still in the packet; we just need to
reparse the packet.

Given that the input network device is per tunnel type, the network
device method will already know the format of the tunnel packet and so
should not have any trouble parsing it.

As an assist we can preserve 90% of the information in ip_tunnel_key by
repurposing inner_transport_header, inner_network_header and
inner_mac_header (which are only valid on output packets today) as
outer_transport_header, outer_network_header and outer_mac_header for
input packets.

That makes tun_id the only field of struct ip_tunnel_key that we have to
work to find.
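
To make that concrete, here is a minimal userspace sketch, not kernel
code.  It assumes an untagged Ethernet + IPv4 + UDP + VXLAN outer
encapsulation.  struct rx_pkt, struct tun_key and every function name
are hypothetical stand-ins; the outer_* offsets do not exist in
struct sk_buff today.  Only the VXLAN header layout (flag bit 0x08,
24-bit VNI) and UDP port 4789 come from the VXLAN specification.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the few sk_buff fields discussed here. */
struct rx_pkt {
	const uint8_t *data;             /* start of the outer Ethernet frame */
	size_t len;
	uint16_t outer_mac_header;       /* offsets from data */
	uint16_t outer_network_header;
	uint16_t outer_transport_header;
};

/* Illustrative subset of what struct ip_tunnel_key carries. */
struct tun_key {
	uint32_t src, dst;               /* outer IPv4 addresses, host order */
	uint16_t tp_src, tp_dst;         /* outer UDP ports, host order */
	uint32_t tun_id;                 /* VXLAN VNI */
};

static uint16_t rd16(const uint8_t *p)
{
	return (uint16_t)((p[0] << 8) | p[1]);
}

static uint32_t rd32(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | p[3];
}

static void wr16(uint8_t *p, uint16_t v)
{
	p[0] = (uint8_t)(v >> 8);
	p[1] = (uint8_t)v;
}

static void wr32(uint8_t *p, uint32_t v)
{
	wr16(p, (uint16_t)(v >> 16));
	wr16(p + 2, (uint16_t)v);
}

/* Remember where the outer headers sit, as decapsulation could do. */
static int record_outer_offsets(struct rx_pkt *p)
{
	size_t ihl;

	if (p->len < 14 + 20)
		return -1;
	ihl = (size_t)(p->data[14] & 0x0f) * 4;  /* IPv4 header length */
	if (ihl < 20 || p->len < 14 + ihl + 8 + 8)
		return -1;
	p->outer_mac_header = 0;
	p->outer_network_header = 14;
	p->outer_transport_header = (uint16_t)(14 + ihl);
	return 0;
}

/* Rebuild the key from the saved offsets; only tun_id needs
 * tunnel-specific parsing. */
static int vxlan_reparse(const struct rx_pkt *p, struct tun_key *k)
{
	const uint8_t *ip = p->data + p->outer_network_header;
	const uint8_t *udp = p->data + p->outer_transport_header;
	const uint8_t *vx = udp + 8;             /* VXLAN header follows UDP */

	if (!(vx[0] & 0x08))                     /* VNI-valid flag not set */
		return -1;
	k->src = rd32(ip + 12);
	k->dst = rd32(ip + 16);
	k->tp_src = rd16(udp);
	k->tp_dst = rd16(udp + 2);
	k->tun_id = ((uint32_t)vx[4] << 16) | ((uint32_t)vx[5] << 8) | vx[6];
	return 0;
}

/* Fill buf with a minimal outer frame (no inner payload) for testing. */
static size_t build_vxlan_frame(uint8_t *buf, size_t cap, uint32_t src,
				uint32_t dst, uint16_t sport, uint32_t vni)
{
	const size_t len = 14 + 20 + 8 + 8;

	assert(cap >= len);
	memset(buf, 0, len);
	buf[12] = 0x08;                          /* EtherType 0x0800: IPv4 */
	buf[14] = 0x45;                          /* version 4, IHL 5 */
	buf[23] = 17;                            /* protocol: UDP */
	wr32(buf + 26, src);
	wr32(buf + 30, dst);
	wr16(buf + 34, sport);
	wr16(buf + 36, 4789);                    /* VXLAN UDP port */
	buf[42] = 0x08;                          /* VNI-valid flag */
	buf[46] = (uint8_t)(vni >> 16);
	buf[47] = (uint8_t)(vni >> 8);
	buf[48] = (uint8_t)vni;
	return len;
}
```

In this sketch a reply path would call vxlan_reparse() on the received
packet instead of consulting a metadata dst.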



Creating outer_transport_header, outer_network_header and
outer_mac_header should open up a lot of optimization opportunities
for input tunnel processing.  I expect that with just a little bit of
care we should be able to replace the input metadata dst with a handful
of fields stored in skb->cb, which in turn means no memory allocations
are necessary and the work can be done unconditionally.
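
As a rough illustration of the skb->cb idea, here is a userspace sketch
under stated assumptions.  struct fake_skb models only the 48-byte cb
scratch area of struct sk_buff.  struct tun_rx_cb, its fields and the
store/load helpers are hypothetical names, not an existing kernel
interface.  Kernel code would normally cast skb->cb to the private
struct; the sketch copies with memcpy to stay within ISO C aliasing
rules.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define CB_SIZE 48  /* size of the cb[] scratch area in struct sk_buff */

/* Hypothetical stand-in for struct sk_buff; only cb is modelled. */
struct fake_skb {
	_Alignas(8) unsigned char cb[CB_SIZE];
};

/* Hypothetical per-packet tunnel RX state kept in cb, no allocation. */
struct tun_rx_cb {
	uint32_t tun_id;
	uint16_t outer_mac_header;
	uint16_t outer_network_header;
	uint16_t outer_transport_header;
};

/* Same role as the kernel's BUILD_BUG_ON() size check on cb users. */
static_assert(sizeof(struct tun_rx_cb) <= CB_SIZE,
	      "tunnel RX state must fit in cb");

/* Copy the tunnel state into the packet's scratch area. */
static void tun_rx_cb_store(struct fake_skb *skb, const struct tun_rx_cb *m)
{
	memcpy(skb->cb, m, sizeof(*m));
}

/* Read the tunnel state back out of the scratch area. */
static struct tun_rx_cb tun_rx_cb_load(const struct fake_skb *skb)
{
	struct tun_rx_cb m;

	memcpy(&m, skb->cb, sizeof(m));
	return m;
}
```

One caveat with this approach: cb belongs to whichever layer currently
owns the skb, so the stored fields are only valid until another layer
reuses the area.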

Eric