[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <82d6e954-8043-078c-266c-2f1ac992f864@virtuozzo.com>
Date: Fri, 16 Mar 2018 21:14:06 +0300
From: Kirill Tkhai <ktkhai@...tuozzo.com>
To: Sowmini Varadhan <sowmini.varadhan@...cle.com>
Cc: santosh.shilimkar@...cle.com, davem@...emloft.net,
netdev@...r.kernel.org, linux-rdma@...r.kernel.org,
rds-devel@....oracle.com, edumazet@...gle.com
Subject: Re: [PATCH RFC RFC] rds: Use NETDEV_UNREGISTER in rds_tcp_dev_event()
(then kill NETDEV_UNREGISTER_FINAL)
On 16.03.2018 20:29, Sowmini Varadhan wrote:
>
> I had taken some of this offline, but it occurs to me
> that some of these notes should be saved to the netdev archives,
> in case this question pops up again in a few years.
>
> When I run your patch, I get a repeatable panic by doing
> modprobe rds-tcp
> ip netns create blue
> the panic is because we are finding a null trn in rds_tcp_init_net.
I did the second version and sent you. Have you tried it?
> I think there's something very disturbed about calling
> register_pernet_operations() twice, once via
> register_pernet_device() and again via register_pernet_subsys().
Calling netdevice handler for every event is more disturbing,
as number of events is several times bigger, than one more
pernet exit method.
> I suspect this has everything to do with the panic but I have
> not had time to debug every little detail here.
You speak in the way I forced you to spend a lot of time debugging
my patches. But I sent you only one with a warn it's not tested.
(but twice -- since the second time I used --patience diff option
to make idea more visible for you). It's strange to hear this from
you.
> In general, rds_tcp is not a network device, it is a kernel
> module. That is the fundamental problem here.
>
> To repeat the comments form net_namespace.h:
> * Network interfaces need to be removed from a dying netns _before_
> * subsys notifiers can be called, as most of the network code cleanup
> * (which is done from subsys notifiers) runs with the assumption that
> * dev_remove_pack has been called so no new packets will arrive during
> * and after the cleanup functions have been called. dev_remove_pack
> * is not per namespace so instead the guarantee of no more packets
> * arriving in a network namespace is provided by ensuring that all
> * network devices and all sockets have left the network namespace
> * before the cleanup methods are called.
>
> when the "blue" netns starts up, it creates at least one kernel listen
> socket on *.16385. This socket, and any other child/client sockets
> created must be cleaned up before the cleanup_net can happen.
>
> This is why I chose to call regster_pernet_subsys. Again, as per
> comments in net_namespace.h:
>
> * Use these carefully. If you implement a network device and it
> * needs per network namespace operations use device pernet operations,
> * otherwise use pernet subsys operations.
This is not a strict rule, and currently I'm working on pernet_operations
synchronization. This commentary is for generic code, while your rds
is differ from the rest of ordinary pernet_operations.
> On (03/16/18 18:51), Kirill Tkhai wrote:
>>> Let's find another approach. Could you tell what problem we have in
>>> case of rds_tcp_dev_ops is declared as pernet_device?
>
> As above, rds-tcp is not a network device.
It's not an agrument. It's not a rule, what you read in the comment.
We need to use rules of reasonableness, and the generic rules do not
fit to your driver. So, it's need to search a solution. Be the only
user NETDEV_UNREGISTER_FINAL in the kernel does not look a good one.
>> One more question. Which time we take a reference of loopback device?
>> Is it possible before we created a net completely?
>
> We dont take a reference on the loopback device.
> We make sure none of the kernel sockets does a get_net() so
> that we dont block the cleanup_net, and then, when all
> the network interfaces have been taken down (loopback is
> the last one) we know there are no more packets coming in
> and out, so it is safe to dismantle all kernel sockets
> created by rds-tcp.
So that, if RDS packets does not act on netdev refcount,
which circular dependence did you mean in your second
message (quote below)?
>As I recall, when I wrote the initial patchset, my problem
>was that in order to let the module unload make progress,
>all these sockets had to be cleaned up. But to clean up these
>sockets, net_device cleanup had to complete (should not have
>any new incoming connections to the listen endpoint on a
>non-loopback socket) so I ended up with a circular dependancy.
Kirill
Powered by blists - more mailing lists