Message-ID: <3bbc16ba-953c-a6b6-c5f3-4deaeaa25d10@huawei.com>
Date: Tue, 16 Jul 2019 10:00:37 +0800
From: Xiaoming Ni <nixiaoming@...wei.com>
To: Vasily Averin <vvs@...tuozzo.com>,
"gregkh@...uxfoundation.org" <gregkh@...uxfoundation.org>
CC: "adobriyan@...il.com" <adobriyan@...il.com>,
"akpm@...ux-foundation.org" <akpm@...ux-foundation.org>,
"anna.schumaker@...app.com" <anna.schumaker@...app.com>,
"arjan@...ux.intel.com" <arjan@...ux.intel.com>,
"bfields@...ldses.org" <bfields@...ldses.org>,
"chuck.lever@...cle.com" <chuck.lever@...cle.com>,
"davem@...emloft.net" <davem@...emloft.net>,
"jlayton@...nel.org" <jlayton@...nel.org>,
"luto@...nel.org" <luto@...nel.org>,
"mingo@...nel.org" <mingo@...nel.org>,
"Nadia.Derbey@...l.net" <Nadia.Derbey@...l.net>,
"paulmck@...ux.vnet.ibm.com" <paulmck@...ux.vnet.ibm.com>,
"semen.protsenko@...aro.org" <semen.protsenko@...aro.org>,
"stable@...nel.org" <stable@...nel.org>,
"stern@...land.harvard.edu" <stern@...land.harvard.edu>,
"tglx@...utronix.de" <tglx@...utronix.de>,
"torvalds@...ux-foundation.org" <torvalds@...ux-foundation.org>,
"trond.myklebust@...merspace.com" <trond.myklebust@...merspace.com>,
"viresh.kumar@...aro.org" <viresh.kumar@...aro.org>,
"Huangjianhui (Alex)" <alex.huangjianhui@...wei.com>,
Dailei <dylix.dailei@...wei.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"linux-nfs@...r.kernel.org" <linux-nfs@...r.kernel.org>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>
Subject: Re: [PATCH v3 0/3] kernel/notifier.c: avoid duplicate registration
On 2019/7/15 13:38, Vasily Averin wrote:
> On 7/14/19 5:45 AM, Xiaoming Ni wrote:
>> On 2019/7/12 22:07, gregkh@...uxfoundation.org wrote:
>>> On Fri, Jul 12, 2019 at 09:11:57PM +0800, Xiaoming Ni wrote:
>>>> On 2019/7/11 21:57, Vasily Averin wrote:
>>>>> On 7/11/19 4:55 AM, Nixiaoming wrote:
>>>>>> On Wed, July 10, 2019 1:49 PM Vasily Averin wrote:
>>>>>>> On 7/10/19 6:09 AM, Xiaoming Ni wrote:
>>>>>>>> Registering the same notifier to a hook repeatedly can cause the hook
>>>>>>>> list to form a ring or lose other members of the list.
>>>>>>>
>>>>>>> I think is not enough to _prevent_ 2nd register attempt,
>>>>>>> it's enough to detect just attempt and generate warning to mark host in bad state.
>>>>>>>
>>>>>>
>>>>>> Duplicate registration is prevented in my patch, not just "mark host in bad state"
>>>>>>
>>>>>> Duplicate registration is checked and exited in notifier_chain_cond_register()
>>>>>>
>>>>>> Duplicate registration was checked in notifier_chain_register() but only
>>>>>> the alarm was triggered without exiting. added by commit 831246570d34692e
>>>>>> ("kernel/notifier.c: double register detection")
>>>>>>
>>>>>> My patch is like a combination of 831246570d34692e and notifier_chain_cond_register(),
>>>>>> which triggers an alarm and exits when a duplicate registration is detected.
>>>>>>
>>>>>>> Unexpected 2nd register of the same hook most likely will lead to 2nd unregister,
>>>>>>> and it can lead to host crash in any time:
>>>>>>> you can unregister notifier on first attempt it can be too early, it can be still in use.
>>>>>>> on the other hand you can never call 2nd unregister at all.
>>>>>>
>>>>>> Since the member was not added to the linked list at the time of the second registration,
>>>>>> no linked list ring was formed.
>>>>>> The member is released on the first unregistration and -ENOENT on the second unregistration.
>>>>>> After patching, the fault has been alleviated
>>>>>
>>>>> You are wrong here.
>>>>> 2nd notifier's registration is a pure bug, this should never happen.
>>>>> If you know the way to reproduce this situation -- you need to fix it.
>>>>>
>>>>> 2nd registration can happen in 2 cases:
>>>>> 1) missed rollback, when someone forgets to call unregister after successful registration,
>>>>> and then tries to call register again. It can lead to a crash, for example when the corresponding module is unloaded.
>>>>> 2) some subsystem is registered twice, for example from different namespaces.
>>>>> in this case unregister called during subsystem cleanup in the first namespace will incorrectly remove the notifier used
>>>>> in the second namespace, it also can lead to unexpected behaviour.
>>>>>
>>>> So in these two cases, is it more reasonable to trigger BUG() directly when checking for duplicate registration ?
>>>> But why does current notifier_chain_register() just trigger WARN() without exiting ?
>>>> notifier_chain_cond_register() direct exit without triggering WARN() ?
>>>
>>> It should recover from this, if it can be detected. The main point is
>>> that not all apis have to be this "robust" when used within the kernel
>>> as we do allow for the callers to know what they are doing :)
>>>
>> In the notifier_chain_register(), the condition ( (*nl) == n) is the same registration of the same hook.
>> We can intercept this situation and avoid forming a linked list ring to make the API more rob
>
> Once again -- yes, you CAN prevent list corruption, but you CANNOT recover the host and return it back to safe state.
> If double register event was detected -- it means you have bug in kernel.
>
> Yes, you can add BUG here and crash the host immediately, but I prefer to use warning in such situations.
>
>>> If this does not cause any additional problems or slow downs, it's
>>> probably fine to add.
>>>
>> Notifier_chain_register() is not a system hotspot function.
>> At the same time, there is already a WARN_ONCE judgment. There is no new judgment in the new patch.
>> It only changes the processing under the condition of (*nl) == n, which will not cause performance problems.
>> At the same time, avoiding the formation of a link ring can make the system more robust.
>
> I disagree,
> yes, node will have correct list, but anyway node will work wrong and can crash the host in any time.

Sorry, my description was not accurate.
My patch does not prevent users from registering a hook repeatedly.
It only avoids the ring in the chain that such a repeated registration causes.
No module in the current kernel tree registers a hook twice.
But considering that not all modules are in the kernel source tree,
we should avoid the linked list ring caused by repeated registration, in order to improve the robustness of the kernel API.
Alternatively, to make such problems easier to locate, the system could crash directly when a duplicate registration is detected.

On the other hand, the difference between notifier_chain_register() and notifier_chain_cond_register() for duplicate registrations is confusing:
notifier_chain_cond_register() blocks the formation of the linked list ring.
notifier_chain_register() does not intercept the ring; it only raises a warning.
This gives me the impression that notifier_chain_register() is allowed to create a linked list ring. Is that intended?
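
To make the difference concrete, here is a simplified userspace model of the two
behaviours. It is only a sketch and not the kernel source: struct nb, reg_warn_only(),
reg_checked() and chain_has_ring() are made-up names, the dup_seen flag stands in for
the WARN, and returning 0 on a duplicate is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for a notifier block (hypothetical, not the kernel struct). */
struct nb {
	struct nb *next;
	int priority;
};

/*
 * Models the "warn only" behaviour: a duplicate is reported through
 * *dup_seen (standing in for the WARN), but the insertion still happens.
 */
static int reg_warn_only(struct nb **nl, struct nb *n, int *dup_seen)
{
	assert(n != NULL);
	while (*nl != NULL) {
		if (*nl == n)
			*dup_seen = 1;
		if (n->priority > (*nl)->priority)
			break;
		nl = &(*nl)->next;
	}
	n->next = *nl;
	*nl = n;
	return 0;
}

/*
 * Models the "report and exit" behaviour: a duplicate is reported and the
 * chain is left untouched. The return value of 0 is an assumption.
 */
static int reg_checked(struct nb **nl, struct nb *n, int *dup_seen)
{
	assert(n != NULL);
	while (*nl != NULL) {
		if (*nl == n) {
			*dup_seen = 1;
			return 0;
		}
		if (n->priority > (*nl)->priority)
			break;
		nl = &(*nl)->next;
	}
	n->next = *nl;
	*nl = n;
	return 0;
}

/* Floyd cycle detection: returns 1 if the chain loops back on itself. */
static int chain_has_ring(const struct nb *head)
{
	const struct nb *slow = head, *fast = head;

	while (fast != NULL && fast->next != NULL) {
		slow = slow->next;
		fast = fast->next->next;
		if (slow == fast)
			return 1;
	}
	return 0;
}
```

In this model, registering the same block twice on a one-element chain with
reg_warn_only() leaves its next pointer pointing at itself (a ring). On a
two-element chain of equal priority, re-registering the first block cuts the
second one off from the head (a lost member). reg_checked() leaves the chain
unchanged in both cases.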
Thanks
xiaoming Ni