[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <e425ef6d-f94c-731b-817a-6c743b777733@virtuozzo.com>
Date: Thu, 20 Sep 2018 12:04:22 +0300
From: Kirill Tkhai <ktkhai@...tuozzo.com>
To: Cong Wang <xiyou.wangcong@...il.com>
Cc: Linux Kernel Network Developers <netdev@...r.kernel.org>
Subject: Re: [Patch net-next] ipv4: initialize ra_mutex in inet_init_net()
On 20.09.2018 0:28, Cong Wang wrote:
> On Wed, Sep 19, 2018 at 1:25 AM Kirill Tkhai <ktkhai@...tuozzo.com> wrote:
>>
>> On 18.09.2018 23:17, Cong Wang wrote:
>>> On Mon, Sep 17, 2018 at 12:25 AM Kirill Tkhai <ktkhai@...tuozzo.com> wrote:
>>>> In inet_init() the order of registration is:
>>>>
>>>> ip_mr_init();
>>>> init_inet_pernet_ops();
>>>>
>>>> This means, ipmr_net_ops pernet operations are before af_inet_ops
>>>> in pernet_list. So, there is a theoretical probability, sometimes
>>>> in the future, we will have a problem during a fail of net initialization.
>>>>
>>>> Say,
>>>>
>>>> setup_net():
>>>> ipmr_net_ops->init() returns 0
>>>> xxx->init() returns error
>>>> and then we do:
>>>> ipmr_net_ops->exit(),
>>>>
>>>> which could touch ra_mutex (theoretically).
>>>
>>> How could ra_mutex be touched in this scenario?
>>>
>>> ra_mutex is only used in ip_ra_control() which is called
>>> only by {get,set}sockopt(). I don't see anything related
>>> to netns exit() path here.
>>
>> Currently, it is not touched. But it's an ordinary practice,
>> someone closes sockets in pernet ->exit methods. For example,
>> we close percpu icmp sockets in icmp_sk_exit(), which are
>> also of RAW type, and there is also called ip_ra_control()
>> for them. Yes, they differ by their protocol; icmp sockets
>> are of IPPROTO_ICMP protocol, while ip_ra_control() acts
>> on IPPROTO_RAW sockets, but it's not good anyway. This does
>> not look reliable for the future. In case of someone changes
>> something here, we may do not notice this for the long time,
>> while some users will meet bugs on their places.
>
> First of all, we only consider current code base. Even if you
> really planned to changed this in the future, it would be still your
> responsibility to take care of it. Why? Simple, FIFO. My patch
> comes ahead of any future changes here, obviously.
>
> Secondly, it is certainly not hard to notice. I am pretty sure
> you would get a warning/crash if this bug triggered.
>
>
>
>>
>> Problems on error paths is not easy to detect on testing,
>> while user may meet them. We had issue of same type with
>> uninitialized xfrm_policy_lock. It was introduced in 2013,
>> while the problem was found only in 2017:
>
> Been there, done that, I've fixed multiple IPv6 init code
> error path. This is not where we disagree, by the way.
>
>
>>
>> introduced by 283bc9f35bbb
>> fixed by c282222a45cb
>>
>> (Last week I met it on RH7 kernel, which still has no a fix.
>> But this talk is not about distribution kernels, just about
>> the way).
>>
>> I just want to say if someone makes some creativity on top
>> of this code, it will be to more friendly from us to him/her
>> to not force this person to think about such not obvious details,
>> but just to implement nice architecture right now.
>
> You keep saying nice architecture, how did you architect
> ipv4.ra_mutex into net/core/net_namespace.c? It is apparently
> not nice.
>
> Good luck with your future.
>
> I am tired, let's just drop it. I have no interest to fix your mess.
You added me to CC, so you probably want to know my opinion about this.
Since it's not a real problem fix, but just a refactoring, I say you
my opinion, how this refactoring may be made better. If you don't want
to know my opinion, you may consider not to CC me.
Just this, not a reason to take offense.
Thanks,
Kirill
Powered by blists - more mailing lists