Message-ID: <f174108c-67c5-3bb6-d558-7e02de701ee2@gmail.com>
Date: Tue, 5 Apr 2022 08:32:52 -0600
From: David Ahern <dsahern@...il.com>
To: Stephen Suryaputra <ssuryaextr@...il.com>,
Ben Greear <greearb@...delatech.com>
Cc: netdev@...r.kernel.org
Subject: Re: Matching unbound sockets for VRF
On 4/4/22 6:41 AM, Stephen Suryaputra wrote:
> On Sun, Apr 03, 2022 at 10:24:36AM -0600, David Ahern wrote:
>> On 3/27/22 6:57 AM, Stephen Suryaputra wrote:
>>>
>>> The reproducer script is attached.
>>>
>>
>> h0 has the mgmt VRF and the l3mdev settings, yet it is running the
>> client in the *default* VRF. Add 'ip vrf exec mgmt' before the 'nc'
>> and it works.
>
> Yes. With "ip vrf exec mgmt" nc would work. We know that. See more
> below.
>
>> Are you saying that before Mike and Robert's changes you could get a
>> client to run in default VRF and work over mgmt VRF? If so it required
>> some ugly routing tricks (the last fib rule you installed) and is a bug
>> relative to the VRF design.
>
> Yes, before Mike and Robert's changes the client ran fine because of the
> last fib rule. We did that because some of our applications:
> 1) Predate "ip vrf exec"
> 2) Cannot use the LD_PRELOAD trick from the early days
>
> For case (2) above, one concrete example is NFS mounting our images:
> applications and kernel modules. We had to run less than full-blown
> utilities and also the mount command uses glibc RPC functions
> (pmap_getmaps(), clntudp_create(), clnt_call(), etc, etc.). We analyzed
> it back then that because these functions are in glibc and call socket()
> from within glibc, the LD_PRELOAD doesn't work.
>
> From the thread of Mike and Robert's changes, the conclusion is that the
> previous behavior is a bug but we have been relying on it for a while,
> since the early days of VRFs, and an upgrade that includes the changes
> caused some applications to not work anymore.
>
> I'm asking whether Mike and Robert's changes should be controlled by an
> option, e.g. a sysctl, that is enabled by default but can be switched
> back to the previous behavior.
>
It has been 3 1/2 years since that patch. Rather than add more checks to
try to manage unintended app behavior, why not work on making your apps
consistent with the intent of the VRF design? If adding `ip vrf exec
VRF` before commands works, that is a very simple solution and the
reason the command exists (to handle code that is not VRF-aware).

I'm guessing that option will not work for all cases (e.g., NFS, which I
think Ben has asked about as well, cc'ed), but working towards making
the code align with the VRF design is the longer-term win.
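
For code that can be modified, one way to make it VRF-aware is to bind
its sockets to the VRF device with SO_BINDTODEVICE instead of relying
on `ip vrf exec`. The following is a minimal sketch, not something taken
from this thread: the helper names `vrf_bind_option` and `connect_in_vrf`
are hypothetical, the VRF name "mgmt" is borrowed from the reproducer
discussed above, and the call may need CAP_NET_RAW depending on the
kernel version.

```python
import socket

IFNAMSIZ = 16  # Linux interface-name limit, including the trailing NUL


def vrf_bind_option(vrf_name: str) -> bytes:
    """Encode a VRF device name as the option value for SO_BINDTODEVICE."""
    encoded = vrf_name.encode("ascii")
    if not encoded or len(encoded) >= IFNAMSIZ:
        raise ValueError(f"invalid interface name: {vrf_name!r}")
    return encoded + b"\0"


def connect_in_vrf(vrf_name: str, host: str, port: int) -> socket.socket:
    """Open a TCP connection from a socket bound to the given VRF device."""
    option = vrf_bind_option(vrf_name)  # validate before creating the socket
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        # Binding to the VRF device scopes the socket to that L3 domain,
        # so it is no longer an unbound socket in the default VRF.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BINDTODEVICE, option)
        sock.connect((host, port))
    except OSError:
        sock.close()
        raise
    return sock
```

A caller would use something like `connect_in_vrf("mgmt", "192.0.2.1", 12345)`,
where the address and port are placeholders. This does not help with
sockets created inside libraries the application cannot change, which is
the case raised earlier for the glibc RPC functions.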