lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <aaf77015-3039-6b04-3417-d376e3467444@huawei.com>
Date:   Wed, 2 Dec 2020 11:18:31 +0000
From:   John Garry <john.garry@...wei.com>
To:     Ming Lei <ming.lei@...hat.com>
CC:     <axboe@...nel.dk>, <linux-block@...r.kernel.org>,
        <linux-kernel@...r.kernel.org>, <hch@....de>, <hare@...e.de>,
        <ppvk@...eaurora.org>, <bvanassche@....org>,
        <kashyap.desai@...adcom.com>
Subject: Re: [RFC PATCH] blk-mq: Clean up references when freeing rqs

On 02/12/2020 03:31, Ming Lei wrote:
> On Tue, Dec 01, 2020 at 09:02:18PM +0800, John Garry wrote:
>> It has been reported many times that a use-after-free can be intermittently
>> found when iterating busy requests:
>>
>> - https://lore.kernel.org/linux-block/8376443a-ec1b-0cef-8244-ed584b96fa96@huawei.com/
>> - https://lore.kernel.org/linux-block/5c3ac5af-ed81-11e4-fee3-f92175f14daf@acm.org/T/#m6c1ac11540522716f645d004e2a5a13c9f218908
>> - https://lore.kernel.org/linux-block/04e2f9e8-79fa-f1cb-ab23-4a15bf3f64cc@kernel.dk/
>>
>> The issue is that when we switch scheduler or change queue nr_requests,
>> the driver tagset may keep references to the stale requests.
>>
>> As a solution, clean up any references to those requests in the driver
>> tagset when freeing. This is done with a cmpxchg to make safe any race
>> with setting the driver tagset request from another queue.
>>
>> Signed-off-by: John Garry <john.garry@...wei.com>
>> --
>> Set as RFC as I need to test more. And not sure on solution method, as
>> Bart had another idea.
>>
>> diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c
>> index d1eafe2c045c..9b042c7036b3 100644
>> --- a/block/blk-mq-sched.c
>> +++ b/block/blk-mq-sched.c
>> @@ -621,7 +621,7 @@ void blk_mq_sched_free_requests(struct request_queue *q)
>>   
>>   	queue_for_each_hw_ctx(q, hctx, i) {
>>   		if (hctx->sched_tags)
>> -			blk_mq_free_rqs(q->tag_set, hctx->sched_tags, i);
>> +			blk_mq_free_rqs_ext(q->tag_set, hctx->sched_tags, i, hctx->tags);
>>   	}
>>   }
>>   
>> diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
>> index 9c92053e704d..562db72e7d79 100644
>> --- a/block/blk-mq-tag.c
>> +++ b/block/blk-mq-tag.c
>> @@ -576,7 +576,7 @@ int blk_mq_tag_update_depth(struct blk_mq_hw_ctx *hctx,
>>   			return -ENOMEM;
>>   		}
>>   
>> -		blk_mq_free_rqs(set, *tagsptr, hctx->queue_num);
>> +		blk_mq_free_rqs_ext(set, *tagsptr, hctx->queue_num, hctx->tags);
>>   		blk_mq_free_rq_map(*tagsptr, flags);
>>   		*tagsptr = new;
>>   	} else {
>> diff --git a/block/blk-mq.c b/block/blk-mq.c
>> index 55bcee5dc032..f3aad695cd25 100644
>> --- a/block/blk-mq.c
>> +++ b/block/blk-mq.c
>> @@ -2271,8 +2271,8 @@ blk_qc_t blk_mq_submit_bio(struct bio *bio)
>>   	return BLK_QC_T_NONE;
>>   }
>>   
>> -void blk_mq_free_rqs(struct blk_mq_tag_set *set, struct blk_mq_tags *tags,
>> -		     unsigned int hctx_idx)
>> +void blk_mq_free_rqs_ext(struct blk_mq_tag_set *set, struct blk_mq_tags *tags,
>> +		     unsigned int hctx_idx, struct blk_mq_tags *references)
>>   {
>>   	struct page *page;
>>   
>> @@ -2281,10 +2281,13 @@ void blk_mq_free_rqs(struct blk_mq_tag_set *set, struct blk_mq_tags *tags,
>>   
>>   		for (i = 0; i < tags->nr_tags; i++) {
>>   			struct request *rq = tags->static_rqs[i];
>> +			int j;
>>   
>>   			if (!rq)
>>   				continue;
>>   			set->ops->exit_request(set, rq, hctx_idx);
>> +			for (j = 0; references && j < references->nr_tags; j++)
>> +				cmpxchg(&references->rqs[j], rq, 0);
> 
> Seems you didn't address the comment in the following link:
> 
> 	https://lore.kernel.org/linux-block/10331543-9e45-ae63-8cdb-17e5a2a3b7ef@huawei.com/
> 
> The request to be freed may still be refered in another path, such as blk_mq_queue_tag_busy_iter
> or blk_mq_tagset_busy_iter(), and cmpxchg() doesn't drain/wait for other refers.
> 

Hi Ming,

Yeah, so I said that was another problem which you mentioned there, 
which I'm not addressing, but I don't think that I'm making thing worse 
here.

So AFAICS, the blk-mq/sched code doesn't wait for any "readers" to be 
finished, such as those running blk_mq_queue_tag_busy_iter or 
blk_mq_tagset_busy_iter() in another context.

So how about the idea of introducing some synchronization primitive, 
such as semaphore, which those "readers" must grab and release at start 
and end (of iter), to ensure the requests are not freed during the 
iteration?

Thanks,
John



Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ