linux-kernel - Re: [PATCH -next v5 6/6] md: protect md

Message-ID: <1f5c8b12-5eb7-4e67-abac-5bead8382c6d@huaweicloud.com>
Date:   Tue, 11 Apr 2023 09:08:14 +0800
From:   Yu Kuai <yukuai1@...weicloud.com>
To:     Logan Gunthorpe <logang@...tatee.com>,
        Yu Kuai <yukuai1@...weicloud.com>, song@...nel.org
Cc:     linux-kernel@...r.kernel.org, linux-raid@...r.kernel.org,
        yi.zhang@...wei.com, yangerkun@...wei.com,
        "yukuai (C)" <yukuai3@...wei.com>
Subject: Re: [PATCH -next v5 6/6] md: protect md_thread with rcu

Hi,

On 2023/04/10 23:42, Logan Gunthorpe wrote:
> 
> 
> On 2023-04-10 05:35, Yu Kuai wrote:
>> From: Yu Kuai <yukuai3@...wei.com>
>>
>> Our test reports a uaf for 'mddev->sync_thread':
>>
>> T1                      T2
>> md_start_sync
>>   md_register_thread
>>   // mddev->sync_thread is set
>> 			raid1d
>> 			 md_check_recovery
>> 			  md_reap_sync_thread
>> 			   md_unregister_thread
>> 			    kfree
>>
>>   md_wakeup_thread
>>    wake_up
>>    ->sync_thread was freed
>>
>> The root cause is a small window between registering the thread and
>> waking it up, during which the thread can be freed concurrently.
>>
>> Currently, a global spinlock 'pers_lock' is borrowed to protect
>> 'mddev->thread'. This problem could be fixed the same way; however, there
>> might be similar problems elsewhere, and using a global lock for all of
>> these cases is not good.
>>
>> This patch protects md_thread with rcu.
>>
>> Signed-off-by: Yu Kuai <yukuai3@...wei.com>
>> ---
>>   drivers/md/md-bitmap.c   | 29 ++++++++++++-----
>>   drivers/md/md.c          | 68 +++++++++++++++++++---------------------
>>   drivers/md/md.h          | 10 +++---
>>   drivers/md/raid1.c       |  4 +--
>>   drivers/md/raid1.h       |  2 +-
>>   drivers/md/raid10.c      | 10 ++++--
>>   drivers/md/raid10.h      |  2 +-
>>   drivers/md/raid5-cache.c | 15 +++++----
>>   drivers/md/raid5.c       |  4 +--
>>   drivers/md/raid5.h       |  2 +-
>>   10 files changed, 81 insertions(+), 65 deletions(-)
>>
>> diff --git a/drivers/md/md-bitmap.c b/drivers/md/md-bitmap.c
>> index 29fd41ef55a6..b9baeea5605e 100644
>> --- a/drivers/md/md-bitmap.c
>> +++ b/drivers/md/md-bitmap.c
>> @@ -1219,15 +1219,27 @@ static bitmap_counter_t *md_bitmap_get_counter(struct bitmap_counts *bitmap,
>>   					       int create);
>>   
>>   static void mddev_set_timeout(struct mddev *mddev, unsigned long timeout,
>> -			      bool force)
>> +			      bool force, bool protected)
>>   {
>> -	struct md_thread *thread = mddev->thread;
>> +	struct md_thread *thread;
>> +
>> +	if (!protected) {
>> +		rcu_read_lock();
>> +		thread = rcu_dereference(mddev->thread);
>> +	} else {
>> +		thread = rcu_dereference_protected(mddev->thread,
>> +				lockdep_is_held(&mddev->reconfig_mutex));
>> +	}
> 
> Why not just always use rcu_read_lock()? Even if it's safe with
> reconfig_mutex, it wouldn't harm much and would make the code a bit less
> ugly.
> 
Of course, I'll do that in the next version.
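
For illustration only, the helper with the 'protected' parameter dropped
might look roughly like the sketch below. It is written so that it compiles
in userspace: the rcu_*() macros and MAX_SCHEDULE_TIMEOUT are no-op stand-ins
for the kernel definitions, both structs are trimmed to hypothetical minimal
fields, and the timeout update in the body is an assumption, since the quoted
hunk does not show it. It is not the actual next version of the patch.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Userspace stand-ins so the sketch compiles outside the kernel.
 * In the kernel these come from <linux/rcupdate.h> and <linux/sched.h>.
 */
#define rcu_read_lock()      do { } while (0)
#define rcu_read_unlock()    do { } while (0)
#define rcu_dereference(p)   (p)
#define __rcu
#define MAX_SCHEDULE_TIMEOUT LONG_MAX

/* Trimmed, hypothetical versions of the md structures. */
struct md_thread {
	unsigned long timeout;
};

struct mddev {
	struct md_thread __rcu *thread;
};

/*
 * Always take the RCU read lock, whether or not the caller already holds
 * reconfig_mutex, so no 'protected' flag is needed.
 */
static void mddev_set_timeout(struct mddev *mddev, unsigned long timeout,
			      bool force)
{
	struct md_thread *thread;

	rcu_read_lock();
	thread = rcu_dereference(mddev->thread);
	/* Assumed body: only touch the thread if it is still registered. */
	if (thread && (force || thread->timeout < MAX_SCHEDULE_TIMEOUT))
		thread->timeout = timeout;
	rcu_read_unlock();
}
```

The cost is an extra rcu_read_lock()/rcu_read_unlock() pair on the paths that
already hold the mutex, in exchange for a single code path.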

> 
>> @@ -458,8 +454,10 @@ static void md_submit_bio(struct bio *bio)
>>    */
>>   void mddev_suspend(struct mddev *mddev)
>>   {
>> -	WARN_ON_ONCE(mddev->thread && current == mddev->thread->tsk);
>> -	lockdep_assert_held(&mddev->reconfig_mutex);
>> +	struct md_thread *thread = rcu_dereference_protected(mddev->thread,
>> +			lockdep_is_held(&mddev->reconfig_mutex));
> 
> Do we know that reconfig_mutex is always held when we call
> md_unregister_thread()? Seems plausible, but maybe it's worth adding a
> lockdep_assert_held() to md_unregister_thread().

Unfortunately this is not true for now: md_unregister_thread() can be
called without this mutex from action_store(), which is problematic.
I'm trying to revert this change in the other thread:

md: fix that MD_RECOVERY_RUNNING can be cleared while sync_thread is
still running.

I think it's not good to add lockdep_assert_held() for now...
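
To make the concern concrete, here is a toy userspace model, not kernel code.
The mutex is just a flag, lockdep_assert_held() is replaced by a macro that
bumps a counter instead of printing a warning, the md_unregister_thread()
signature is an assumption, and both callers are hypothetical simplifications
(unlocked_caller() only stands in for the action_store() path mentioned
above).

```c
#include <stdbool.h>
#include <stdlib.h>

/*
 * Toy stand-ins: a flag instead of a real mutex, and a counter instead of
 * the warning lockdep would print in the kernel.
 */
struct toy_mutex {
	bool held;
};

static int lockdep_warnings;

#define lockdep_assert_held(m) \
	do { if (!(m)->held) lockdep_warnings++; } while (0)

/* Trimmed, hypothetical versions of the md structures. */
struct md_thread {
	int unused;
};

struct mddev {
	struct toy_mutex reconfig_mutex;
	struct md_thread *sync_thread;
};

/* The suggested assertion, placed at the top of the unregister path. */
static void md_unregister_thread(struct mddev *mddev,
				 struct md_thread **threadp)
{
	struct md_thread *thread = *threadp;

	lockdep_assert_held(&mddev->reconfig_mutex);
	*threadp = NULL;
	free(thread);
}

/* Hypothetical caller that takes the mutex first: assertion is satisfied. */
static void locked_caller(struct mddev *mddev)
{
	mddev->reconfig_mutex.held = true;
	md_unregister_thread(mddev, &mddev->sync_thread);
	mddev->reconfig_mutex.held = false;
}

/* Models a caller that does not hold the mutex: the assertion fires. */
static void unlocked_caller(struct mddev *mddev)
{
	md_unregister_thread(mddev, &mddev->sync_thread);
}
```

In this model the locked caller leaves lockdep_warnings untouched while the
unlocked one increments it, which is why the assertion would be noisy until
the unlocked call site is dealt with.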

Thanks,
Kuai
> 
> Thanks,
> 
> Logan
> .
>