[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <d58d43b2-2434-e9db-b1e6-1a1b7b85773f@gmail.com>
Date: Tue, 20 Aug 2019 07:59:22 -0700
From: Mark Balançian <mbalant3@...il.com>
To: sathya.prakash@...adcom.com
Cc: julian.calaby@...il.com, suganath-prabu.subramani@...adcom.com,
MPT-FusionLinux.pdl@...adcom.com, andrianov@...ras.ru,
linux-scsi@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v3] lsilogic mpt fusion: mptctl: Fixed race condition
around mptctl_id variable using mutexes
Hello Mister Prakash, Calaby, and Subramani,
I also please request your reply to my previous message before the end
of this Thursday the latest, as I am partaking in an evaluation period
from the organization I am working for with a deadline very close to
that time.
Thank you,
Mark
On 2019-08-20 7:46 a.m., Mark Balantzyan wrote:
> Hi all,
>
> The race condition in the mptctl driver I'm wishing to have confirmed
> is evidenced by the pair of call chains:
>
> compat_mpctl_ioctl -> compat_mpt_command -> mptctl_do_mpt_command
> which calls mpt_get_msg_frame(mptctl_id, ioc)
>
> and
>
> __mptctl_ioctl -> mpt_fw_download -> mptctl_do_fw_download which calls
> mpt_put_msg_frame(mptctl_id, iocp, mf) and calls
> mpt_get_msg_frame(mptctl_id, iocp)
>
> Since ioctl can be called at any time, accessing of mptctl_id occurs
> concurrently between threads causing a race.
>
> I realize in past messages I've tried to patch by surrounding all
> instances of mptctl_id with mutexes but I'm focusing this time on one
> clear instance of the race condition involving the variable mptctl_id,
> since Julian asks what the exact race condition is with respect to the
> case.
>
> Please let me know the confirmation or not confirmation of this race
> possibility.
>
> Thank you,
> Mark
>
> On Sun, 18 Aug 2019, Julian Calaby wrote:
>
>> Hi Mark,
>>
>> On Thu, Aug 15, 2019 at 8:02 PM Mark Balantzyan <mbalant3@...il.com>
>> wrote:
>>>
>>> Certain functions in the driver, such as mptctl_do_fw_download() and
>>> mptctl_do_mpt_command(), rely on the instance of mptctl_id, which
>>> does the
>>> id-ing. There is race condition possible when these functions
>>> operate in
>>> concurrency. Via, mutexes, the functions are mutually signalled to
>>> cooperate.
>>>
>>> Changelog v2
>>>
>>> Lacked a version number but added properly terminated else condition at
>>> (former) line 2300.
>>>
>>> Changelog v3
>>>
>>> Fixes "return -EAGAIN" lines which were erroneously tabbed as if to
>>> be guarded
>>> by "if" conditions lying above them.
>>>
>>> Signed-off-by: Mark Balantzyan <mbalant3@...il.com>
>>>
>>> ---
>>
>> Changelog should be down here after the "---"
>>
>>> drivers/message/fusion/mptctl.c | 43 +++++++++++++++++++++++++--------
>>> 1 file changed, 33 insertions(+), 10 deletions(-)
>>>
>>> diff --git a/drivers/message/fusion/mptctl.c
>>> b/drivers/message/fusion/mptctl.c
>>> index 4470630d..3270843c 100644
>>> --- a/drivers/message/fusion/mptctl.c
>>> +++ b/drivers/message/fusion/mptctl.c
>>> @@ -816,12 +816,15 @@ mptctl_do_fw_download(int ioc, char __user
>>> *ufwbuf, size_t fwlen)
>>>
>>> /* Valid device. Get a message frame and construct
>>> the FW download message.
>>> */
>>> + mutex_lock(&mpctl_mutex);
>>> if ((mf = mpt_get_msg_frame(mptctl_id, iocp)) == NULL)
>>> - return -EAGAIN;
>>> + mutex_unlock(&mpctl_mutex);
>>> + return -EAGAIN;
>>
>> Are you sure this is right?
>>
>> 1. We're now exiting early with -EAGAIN regardless of the result of
>> mpt_get_msg_frame()
>> 2. If the result of mpt_get_msg_frame() is not NULL, we don't unlock
>> the mutex
>>
>> Do you mean something like:
>>
>> - - - - - -
>>
>> mutex_lock(&mpctl_mutex);
>> mf = mpt_get_msg_frame(mptctl_id, iocp);
>> mutex_unlock(&mpctl_mutex);
>>
>> if (mf == NULL) {
>>
>> - - - - - -
>>
>>> @@ -1889,8 +1894,10 @@ mptctl_do_mpt_command (struct
>>> mpt_ioctl_command karg, void __user *mfPtr)
>>>
>>> /* Get a free request frame and save the message context.
>>> */
>>> + mutex_lock(&mpctl_mutex);
>>> if ((mf = mpt_get_msg_frame(mptctl_id, ioc)) == NULL)
>>> - return -EAGAIN;
>>> + mutex_unlock(&mpctl_mutex);
>>> + return -EAGAIN;
>>
>> Same comments here.
>>
>>> @@ -2563,7 +2576,9 @@ mptctl_hp_hostinfo(unsigned long arg, unsigned
>>> int data_size)
>>> /*
>>> * Gather ISTWI(Industry Standard Two Wire Interface) Data
>>> */
>>> + mutex_lock(&mpctl_mutex);
>>> if ((mf = mpt_get_msg_frame(mptctl_id, ioc)) == NULL) {
>>> + mutex_unlock(&mpctl_mutex);
>>
>> This line needs to be indented to match the line below, also we don't
>> unlock the mutex if mpt_get_msg_frame() is not NULL.
>>
>>> @@ -3010,9 +3027,11 @@ static int __init mptctl_init(void)
>>> * Install our handler
>>> */
>>> ++where;
>>> + mutex_lock(&mpctl_mutex);
>>> mptctl_id = mpt_register(mptctl_reply, MPTCTL_DRIVER,
>>> "mptctl_reply");
>>> if (!mptctl_id || mptctl_id >= MPT_MAX_PROTOCOL_DRIVERS) {
>>> + mutex_unlock(&mpctl_mutex);
>>
>> Why not use a local variable and only update the global variable if
>> it's valid?
>>
>>> printk(KERN_ERR MYNAM ": ERROR: Failed to register
>>> with Fusion MPT base driver\n");
>>> misc_deregister(&mptctl_miscdev);
>>> err = -EBUSY;
>>> @@ -3022,13 +3041,14 @@ static int __init mptctl_init(void)
>>> mptctl_taskmgmt_id = mpt_register(mptctl_taskmgmt_reply,
>>> MPTCTL_DRIVER,
>>> "mptctl_taskmgmt_reply");
>>> if (!mptctl_taskmgmt_id || mptctl_taskmgmt_id >=
>>> MPT_MAX_PROTOCOL_DRIVERS) {
>>> + mutex_unlock(&mpctl_mutex);
>>
>> Same comment here.
>>
>>> @@ -3044,13 +3064,14 @@ out_fail:
>>> /*=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=*/
>>>
>>> static void mptctl_exit(void)
>>> {
>>> + mutex_lock(&mpctl_mutex);
>>> misc_deregister(&mptctl_miscdev);
>>> printk(KERN_INFO MYNAM ": Deregistered /dev/%s @
>>> (major,minor=%d,%d)\n",
>>> mptctl_miscdev.name, MISC_MAJOR,
>>> mptctl_miscdev.minor);
>>>
>>> /* De-register event handler from base module */
>>> mpt_event_deregister(mptctl_id);
>>> -
>>> +
>>
>> Please don't add trailing whitespace.
>>
>> Did you test this on real hardware? I'd expect it to deadlock and
>> crash almost immediately.
>>
>> Also, it might be worthwhile creating accessor functions for the
>> mptctl variables or using atomics, that way the locking doesn't need
>> to be right every time they're used.
>>
>> Finally, what's the exact race condition here? Are the functions you
>> reference changing the values of the mptctl variables while other code
>> is using them? These functions appear to be setup functions, so why
>> are those variables being used before the device is fully set up?
>> Single usages of those variables shouldn't be subject to race
>> conditions, so you shouldn't need mutexes around those.
>>
>> Thanks,
>>
>> --
>> Julian Calaby
>>
>> Email: julian.calaby@...il.com
>> Profile: http://www.google.com/profiles/julian.calaby/
>>
Powered by blists - more mailing lists