Message-ID: <2d7c9164.2b1f.1806f2a8ed9.Coremail.linma@zju.edu.cn>
Date: Thu, 28 Apr 2022 15:55:01 +0800 (GMT+08:00)
From: "Lin Ma" <linma@....edu.cn>
To: "Greg KH" <gregkh@...uxfoundation.org>
Cc: "Jakub Kicinski" <kuba@...nel.org>,
"Duoming Zhou" <duoming@....edu.cn>,
krzysztof.kozlowski@...aro.org, pabeni@...hat.com,
linux-kernel@...r.kernel.org, davem@...emloft.net,
alexander.deucher@....com, akpm@...ux-foundation.org,
broonie@...nel.org, netdev@...r.kernel.org
Subject: Re: [PATCH net v4] nfc: ... device_is_registered() is data
race-able
Hello Greg,
>
> You should not be making these types of checks outside of the driver
> core.
>
> > This is by no means matching our expectations as one of our previous patch relies on the device_is_registered code.
>
> Please do not do that.
>
> >
> > -> the patch: 3e3b5dfcd16a ("NFC: reorder the logic in nfc_{un,}register_device")
> >
> <...>
> >
> > In another word, the device_del -> kobject_del -> __kobject_del is not protected by the device_lock.
>
> Nor should it be.
>
I may have presented my point badly. In fact, there is nothing wrong with the driver core, and the problem has nothing to do with the internals of device_del() or the device_is_registered() implementation. And, of course, we will not add any code to, or modify, the device/driver base code.
The point is that the combination of device_is_registered() + device_del(), as used in the NFC core, is not safe.
That is to say, device_is_registered() can still return true while device_del() is executing in another thread.
(Our debugging suggests this is the case; correct me if it is not.)
Hence we want to add additional state to the nfc_dev object to fix that, rather than add any state to the device/driver core.
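To make the interleaving concrete, here is a small userspace model of what we believe happens. It is only a sketch, not kernel code: every model_* name is made up for illustration, the lock is a plain flag, and the shutting_down field is a hypothetical name for the extra nfc_dev state, not something that exists in the tree today.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only. None of these are the kernel's types or functions. */
struct model_device {
	bool lock_held;      /* stands in for device_lock()/device_unlock() */
	bool state_in_sysfs; /* stands in for what device_is_registered() reads */
};

struct model_nfc_dev {
	struct model_device dev;
	bool shutting_down;  /* hypothetical extra state, written only under the lock */
};

void model_lock(struct model_device *d)
{
	assert(!d->lock_held);
	d->lock_held = true;
}

void model_unlock(struct model_device *d)
{
	assert(d->lock_held);
	d->lock_held = false;
}

/* Unregister path, step 1: publish the shutdown while holding the lock. */
void model_unregister_begin(struct model_nfc_dev *n)
{
	model_lock(&n->dev);
	n->shutting_down = true;
	model_unlock(&n->dev);
}

/*
 * Unregister path, step 2: models the end of device_del(), which clears
 * the registered state without taking the device lock.
 */
void model_device_del_finish(struct model_nfc_dev *n)
{
	n->dev.state_in_sysfs = false;
}

/* Netlink-side check in the current style: lock + "is registered". */
bool model_check_registered(struct model_nfc_dev *n)
{
	bool ok;

	model_lock(&n->dev);
	ok = n->dev.state_in_sysfs;
	model_unlock(&n->dev);
	return ok;
}

/* Netlink-side check using the extra nfc_dev state instead. */
bool model_check_not_shutting_down(struct model_nfc_dev *n)
{
	bool ok;

	model_lock(&n->dev);
	ok = !n->shutting_down;
	model_unlock(&n->dev);
	return ok;
}
```

If model_unregister_begin() has run but model_device_del_finish() has not, model_check_registered() still reports the device as usable even though it takes the lock, while model_check_not_shutting_down() already refuses. That window is the one we are worried about in the netlink handlers.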
> > This means the device_lock + device_is_registered is still prone to the data race. And this is not just the problem with firmware downloading. The all relevant netlink tasks that use the device_lock + device_is_registered is possible to be raced.
> >
> > To this end, we will come out with two patches, one for fixing this device_is_registered by using another status variable instead. The other is the patch that reorders the code in nci_unregister_device.
>
> Why is this somehow unique to these devices? Why do no other buses have
> this issue? Are you somehow allowing a code path that should not be
> happening?
>
> thanks,
>
> greg k-h
In fact, by searching for users of device_is_registered(), I found that most of them are in driver code rather than in the network stack. I have no idea whether they suffer from similar problems, and I will check that.
Thanks
Lin