Message-ID: <1b1110b2-1db6-9781-89cf-82b1403b1641@huawei.com>
Date: Wed, 12 Feb 2020 10:00:37 +0800
From: "sunke (E)" <sunke32@...wei.com>
To: Mike Christie <mchristi@...hat.com>, <josef@...icpanda.com>,
<axboe@...nel.dk>
CC: <linux-block@...r.kernel.org>, <nbd@...er.debian.org>,
<linux-kernel@...r.kernel.org>
Subject: Re: [v3] nbd: fix potential NULL pointer fault in nbd_genl_disconnect
On 2020/2/12 0:39, Mike Christie wrote:
> On 02/10/2020 10:12 PM, sunke (E) wrote:
>>
>>
>> On 2020/2/11 1:05, Mike Christie wrote:
>>> On 02/10/2020 01:32 AM, Sun Ke wrote:
>>>> Open /dev/nbdX first: config_refs will be 1 while the pointers
>>>> in nbd_device are still NULL. Then disconnect /dev/nbdX and a
>>>> NULL recv_workq is dereferenced. The config_refs check in
>>>> nbd_genl_disconnect offers no protection here.
>>>>
>>>> To fix it, just add a check for a non-NULL recv_workq in
>>>> nbd_genl_disconnect.
>>>>
>>>> Signed-off-by: Sun Ke <sunke32@...wei.com>
>>>> ---
>>>> v1 -> v2:
>>>> Add an omitted mutex_unlock.
>>>>
>>>> v2 -> v3:
>>>> Add nbd->config_lock, suggested by Josef.
>>>> ---
>>>> drivers/block/nbd.c | 8 ++++++++
>>>> 1 file changed, 8 insertions(+)
>>>>
>>>> diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
>>>> index b4607dd96185..870b3fd0c101 100644
>>>> --- a/drivers/block/nbd.c
>>>> +++ b/drivers/block/nbd.c
>>>> @@ -2008,12 +2008,20 @@ static int nbd_genl_disconnect(struct sk_buff *skb, struct genl_info *info)
>>>> index);
>>>> return -EINVAL;
>>>> }
>>>> + mutex_lock(&nbd->config_lock);
>>>> if (!refcount_inc_not_zero(&nbd->refs)) {
>>>> + mutex_unlock(&nbd->config_lock);
>>>> mutex_unlock(&nbd_index_mutex);
>>>> printk(KERN_ERR "nbd: device at index %d is going down\n",
>>>> index);
>>>> return -EINVAL;
>>>> }
>>>> + if (!nbd->recv_workq) {
>>>> + mutex_unlock(&nbd->config_lock);
>>>> + mutex_unlock(&nbd_index_mutex);
>>>> + return -EINVAL;
>>>> + }
>>>> + mutex_unlock(&nbd->config_lock);
>>>> mutex_unlock(&nbd_index_mutex);
>>>> if (!refcount_inc_not_zero(&nbd->config_refs)) {
>>>> nbd_put(nbd);
>>>>
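
For context: if I read drivers/block/nbd.c right, recv_workq is only
allocated in nbd_start_device(), so before any connect it is still NULL.
A rough sketch of the failing sequence (not an exact oops trace):

	/* userspace */
	fd = open("/dev/nbd0", O_RDONLY);	/* nbd_open(): config_refs -> 1,
						 * nbd->recv_workq still NULL */
	/* then send NBD_CMD_DISCONNECT over the nbd netlink interface */

	/* kernel */
	nbd_genl_disconnect()
	    -> nbd_disconnect_and_put()
	        -> flush_workqueue(nbd->recv_workq);	/* NULL pointer fault */
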
>>>
>>> With my other patch we will not need this, right? It handles your
>>> case by being integrated with the existing checks in:
>>>
>>> nbd_disconnect_and_put->nbd_clear_sock->sock_shutdown
>>>
>>> ...
>>>
>>> static void sock_shutdown(struct nbd_device *nbd)
>>> {
>>>
>>> ....
>>>
>>> if (config->num_connections == 0)
>>> return;
>>>
>>>
>>> num_connections is zero in your case since we never did an
>>> nbd_genl_connect, so we would return here.
>>>
>> Hi Mike,
>>
>> Your point is not entirely right.
>>
>> Yes, config->num_connections is 0, so sock_shutdown will return early.
>> But then control goes back to nbd_disconnect_and_put, which still does
>> flush_workqueue(nbd->recv_workq).
>>
>> nbd_disconnect_and_put
>> ->nbd_clear_sock
>> ->sock_shutdown
>> ->flush_workqueue
>>
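
To make that concrete, nbd_disconnect_and_put() currently looks roughly
like this (a trimmed sketch from my reading of the tree this thread is
against, not a verbatim quote):

	static void nbd_disconnect_and_put(struct nbd_device *nbd)
	{
		mutex_lock(&nbd->config_lock);
		nbd_disconnect(nbd);
		nbd_clear_sock(nbd);	/* -> sock_shutdown(), which may
					 * return early as you describe */
		mutex_unlock(&nbd->config_lock);
		/* but this runs unconditionally, so a NULL recv_workq
		 * is still dereferenced */
		flush_workqueue(nbd->recv_workq);
		...
	}
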
>
> My patch removed that extra flush_workqueue in nbd_disconnect_and_put.
>
> The idea of the patch was to move the flush calls to where we do
> sock_shutdown in the config (connect, disconnect, clear sock) code
> paths, because that is the point at which we know we need to kill the
> recv workers and wait for them to complete, so they are not still
> running when userspace does a new config operation.
>
Yes, I see.
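
For anyone following along, the direction described above would look
something like the sketch below. This is only my understanding of the
idea, not Mike's actual patch:

	static void sock_shutdown(struct nbd_device *nbd)
	{
		struct nbd_config *config = nbd->config;

		if (config->num_connections == 0)
			return;	/* never connected: no recv work queued,
				 * and recv_workq may still be NULL */

		/* ... shut down each config->socks[i] as before ... */

		/* flush at the one place sockets are torn down, so callers
		 * such as nbd_disconnect_and_put no longer need their own
		 * flush_workqueue(nbd->recv_workq) call */
		flush_workqueue(nbd->recv_workq);
	}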