Message-ID: <1432850150.114391.56.camel@redhat.com>
Date: Thu, 28 May 2015 17:55:50 -0400
From: Doug Ledford <dledford@...hat.com>
To: Or Gerlitz <gerlitz.or@...il.com>
Cc: Jason Gunthorpe <jgunthorpe@...idianresearch.com>,
Or Gerlitz <ogerlitz@...lanox.com>,
Haggai Eran <haggaie@...lanox.com>,
"linux-rdma@...r.kernel.org" <linux-rdma@...r.kernel.org>,
Linux Netdev List <netdev@...r.kernel.org>,
Liran Liss <liranl@...lanox.com>,
Guy Shapiro <guysh@...lanox.com>,
Shachar Raindel <raindel@...lanox.com>,
Yotam Kenneth <yotamke@...lanox.com>
Subject: Re: [PATCH v4 for-next 00/12] Add network namespace support in the RDMA-CM

On Thu, 2015-05-28 at 22:05 +0300, Or Gerlitz wrote:
> On Thu, May 28, 2015 at 9:22 PM, Doug Ledford <dledford@...hat.com> wrote:
>
> >> I don't think that is what Doug said.
>
> > Indeed. There is no need to scrap things, but if the design as it
> > stands, and the intended means of creating objects for use in
> > containers, is going to result in an unworkable network, then we have to
> > re-evaluate how the container constructs are created, and that then has
> > possible consequences for how we would get from an incoming packet to
> > the proper container.
>
> To be precise, do we agree that the issue here isn't "in the design as
> it stands" but rather in a problem we found in the intended way of
> assigning IP addresses through DHCP for the containers?

No, I would say the problem *is* in the design.  But the problem is the
selected means of identifying the netdev to get to the namespace (and
the proposed means of creating non-default namespace devices to exist in
the container), not the namespace design itself.

> > I'm not trying to stop the "support train" here, but at the same time,
> > if the train is headed for a bridge that's out....
>
> So what's your concrete saying here? where should we go from here?

This excerpt is from the commit log of patch 3/12:

    The IB device and port, together with the P_Key and the IP address should
    be enough to uniquely identify the ULP net device.

The problem is that this claim is wrong.  If we allow more than one
device per PKey with the same GUID, then DHCP breaks, which is bad in
and of itself, and IPv6 link-local addressing breaks as well.  That
means this hunk in patch 4/12:

+#if IS_ENABLED(CONFIG_IPV6)
+	case AF_INET6:
+		if (ipv6_chk_addr(net, &addr_in6->sin6_addr, dev, 1))
+			return true;
+
+		break;
+#endif

can now be tricked into returning true for incorrect devices.
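
To make the ambiguity concrete, here is a rough user-space sketch (not
kernel code).  The struct and function names are made up for
illustration, and it assumes, as a simplification, that the interface
identifier of the link-local address is just the GUID.  Under that
assumption, two children on the same PKey with the same GUID both pass
an address check for the same destination:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of an IPoIB child device living in some namespace. */
struct ulp_dev {
	int      netns_id;  /* namespace the device was moved into */
	uint16_t pkey;      /* partition key of the child */
	uint64_t guid;      /* port GUID the child uses */
};

/* Simplifying assumption: the link-local interface ID equals the GUID. */
uint64_t lladdr_ifid(const struct ulp_dev *d)
{
	return d->guid;
}

/* Count devices that would accept (pkey, link-local interface ID). */
size_t count_matches(const struct ulp_dev *devs, size_t n,
		     uint16_t pkey, uint64_t dst_ifid)
{
	size_t hits = 0;

	assert(devs != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		if (devs[i].pkey == pkey && lladdr_ifid(&devs[i]) == dst_ifid)
			hits++;
	return hits;
}
```

Any count above one means the incoming request cannot be tied to a
single namespace by PKey and address alone.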

Where do we go from here?

First, I'm inclined to say we should modify the add_child portion of
IPoIB to refuse to add links to a PKey if that GUID is already present
on that PKey. You could then use different PKeys on the default GUID
for separate namespaces. If you need separate namespaces on the same
PKey, then enable alias GUIDs for use on the local adapter and require
one GUID per namespace on the same PKey.
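
As a rough sketch of the rule only (user-space C, not the actual IPoIB
add_child code; the types, the fixed-size table and the error codes are
assumptions chosen for illustration), the check amounts to refusing a
new child whenever its (PKey, GUID) pair is already taken on the port:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_CHILDREN 16

/* Hypothetical record of one child interface on an IB port. */
struct child_link {
	uint16_t pkey;  /* partition key the child is bound to */
	uint64_t guid;  /* default or alias GUID used by the child */
};

struct port_children {
	struct child_link links[MAX_CHILDREN];
	size_t count;
};

/* True if an existing child already uses this (pkey, guid) pair. */
static bool pair_in_use(const struct port_children *pc,
			uint16_t pkey, uint64_t guid)
{
	for (size_t i = 0; i < pc->count; i++)
		if (pc->links[i].pkey == pkey && pc->links[i].guid == guid)
			return true;
	return false;
}

/* Returns 0 on success, -EEXIST on a duplicate pair, -ENOSPC if full. */
int add_child(struct port_children *pc, uint16_t pkey, uint64_t guid)
{
	assert(pc != NULL);
	if (pair_in_use(pc, pkey, guid))
		return -EEXIST;
	if (pc->count == MAX_CHILDREN)
		return -ENOSPC;
	pc->links[pc->count].pkey = pkey;
	pc->links[pc->count].guid = guid;
	pc->count++;
	return 0;
}
```

With this rule, the same GUID can still be reused across different
PKeys, and the same PKey can be shared by children that use different
(alias) GUIDs.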

Then I'm inclined to say that we should map to namespaces using device,
port, GUID/GID, and PKey.  In that situation, since a unique GUID/GID on
any given PKey maps to a unique DHCP identifier and a unique IPv6
link-local address, this becomes freely interchangeable with the device,
port, PKey, address mapping that this patchset was built around.
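
A minimal sketch of such a lookup, again in user-space C with invented
names (ns_key, ns_entry, lookup_netns) and a flat array standing in for
whatever structure the real code would use:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical lookup key: device, port, GUID and PKey. */
struct ns_key {
	int      dev_index;  /* which IB device */
	uint8_t  port;       /* port number on that device */
	uint64_t guid;       /* GUID the request was addressed to */
	uint16_t pkey;       /* partition key of the request */
};

struct ns_entry {
	struct ns_key key;
	int netns_id;        /* stand-in for a namespace reference */
};

static int key_equal(const struct ns_key *a, const struct ns_key *b)
{
	return a->dev_index == b->dev_index && a->port == b->port &&
	       a->guid == b->guid && a->pkey == b->pkey;
}

/* Returns the namespace id for the key, or -1 if nothing matches. */
int lookup_netns(const struct ns_entry *tbl, size_t n,
		 const struct ns_key *k)
{
	assert(k != NULL);
	for (size_t i = 0; i < n; i++)
		if (key_equal(&tbl[i].key, k))
			return tbl[i].netns_id;
	return -1;
}
```

Because each (PKey, GUID) pair is unique per port under the rule above,
the first match is the only match, so the lookup never has to guess
between namespaces.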
--
Doug Ledford <dledford@...hat.com>
GPG KeyID: 0E572FDD