Message-ID: <CACVXFVP1QvkcD81MvCpQLz993g=jaPgMNFAFa2hvnPjjGcVkzg@mail.gmail.com>
Date: Sat, 18 Mar 2017 02:23:25 +0800
From: Ming Lei <tom.leiming@...il.com>
To: Bart Van Assche <Bart.VanAssche@...disk.com>
Cc: "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"hch@...radead.org" <hch@...radead.org>,
"linux-block@...r.kernel.org" <linux-block@...r.kernel.org>,
"axboe@...com" <axboe@...com>,
"yizhan@...hat.com" <yizhan@...hat.com>
Subject: Re: [PATCH v1 2/3] blk-mq: comment on races related with timeout handler
On Sat, Mar 18, 2017 at 1:39 AM, Bart Van Assche
<Bart.VanAssche@...disk.com> wrote:
> On Fri, 2017-03-17 at 17:57 +0800, Ming Lei wrote:
>> +/*
>> + * When we reach here because queue is busy, REQ_ATOM_COMPLETE
>> + * flag isn't set yet, so there may be a race with the timeout handler,
>> + * but given rq->deadline is just set in .queue_rq() under
>> + * this situation, the race won't be possible in reality because
>> + * rq->timeout should be set as big enough to cover the window
>> + * between blk_mq_start_request() called from .queue_rq() and
>> + * clearing REQ_ATOM_STARTED here.
>> + */
>> static void __blk_mq_requeue_request(struct request *rq)
>> {
>> struct request_queue *q = rq->q;
>> @@ -700,6 +709,19 @@ static void blk_mq_check_expired(struct blk_mq_hw_ctx *hctx,
>> if (!test_bit(REQ_ATOM_STARTED, &rq->atomic_flags))
>> return;
>>
>> + /*
>> + * The rq being checked may have been freed and reallocated
>> + * out already here, we avoid this race by checking rq->deadline
>> + * and REQ_ATOM_COMPLETE flag together:
>> + *
>> + * - if rq->deadline is observed as new value because of
>> + * reusing, the rq won't be timed out because of timing.
>> + * - if rq->deadline is observed as previous value,
>> + * REQ_ATOM_COMPLETE flag won't be cleared in reuse path
>> + * because we put a barrier between setting rq->deadline
>> + * and clearing the flag in blk_mq_start_request(), so
>> + * this rq won't be timed out either.
>> + */
>> if (time_after_eq(jiffies, rq->deadline)) {
>> if (!blk_mark_rq_complete(rq))
>> blk_mq_rq_timed_out(rq, reserved);
>
> Since this explanation applies to the same race addressed by patch 1/3,
First, this comment explains how we deal with the reuse vs. timeout race,
while 1/3 fixes a different race, one that corrupts the rq. Did you see the
STARTED flag mentioned anywhere in the above comment?
In the case of 1/3, the rq being dispatched can be destroyed simply by the
blk_mq_end_request() called from the timeout path. Even if it survives, the
same rq can be allocated to another I/O path, and that situation is different
from reuse vs. timeout. I can't see how this comment helps explain the issue
in 1/3, can you? Maybe I need to mention rq corruption explicitly in 1/3.
Second, adding this comment to 1/3 just creates an unnecessary backporting
burden, since 1/3 has to go into -stable.
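The reuse vs. timeout reasoning in the quoted comment can be sketched in
userspace C11. This is only a single-threaded model under stated assumptions,
not the blk-mq code: struct model_rq and every model_* function are
hypothetical names, C11 atomics stand in for the kernel bit operations, and
atomic_thread_fence() stands in for the barrier in blk_mq_start_request().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the fields discussed above. */
struct model_rq {
	_Atomic unsigned long deadline;	/* models rq->deadline, in ticks */
	atomic_bool started;		/* models REQ_ATOM_STARTED */
	atomic_bool complete;		/* models REQ_ATOM_COMPLETE */
};

static void model_init(struct model_rq *rq)
{
	atomic_init(&rq->deadline, 0UL);
	atomic_init(&rq->started, false);
	atomic_init(&rq->complete, false);
}

/* Wrap-safe "a is at or after b", the same idea as time_after_eq(). */
static bool model_time_after_eq(unsigned long a, unsigned long b)
{
	return (long)(a - b) >= 0;
}

/* Start (or reuse) path: publish the deadline, then update the flags. */
static void model_start_request(struct model_rq *rq, unsigned long now,
				unsigned long timeout)
{
	atomic_store(&rq->deadline, now + timeout);
	/* Stands in for the barrier between deadline and flag updates. */
	atomic_thread_fence(memory_order_seq_cst);
	atomic_store(&rq->started, true);
	atomic_store(&rq->complete, false);
}

/* Normal completion: mark COMPLETE so a later timeout check backs off. */
static void model_complete_request(struct model_rq *rq)
{
	atomic_store(&rq->complete, true);
}

/* Timeout check: returns true only if this call times the request out. */
static bool model_check_expired(struct model_rq *rq, unsigned long now)
{
	if (!atomic_load(&rq->started))
		return false;
	if (!model_time_after_eq(now, atomic_load(&rq->deadline)))
		return false;
	/* Test-and-set COMPLETE; only the caller that flips it wins. */
	return !atomic_exchange(&rq->complete, true);
}

/* Case 1: the rq was reused and the checker sees the new deadline. */
static bool model_case_new_deadline(void)
{
	struct model_rq rq;

	model_init(&rq);
	model_start_request(&rq, 0, 100);	/* first life, deadline 100 */
	model_complete_request(&rq);
	model_start_request(&rq, 150, 100);	/* reuse, deadline 250 */
	return model_check_expired(&rq, 150);	/* 150 is before 250 */
}

/* Case 2: the checker still sees the old deadline, so COMPLETE is set. */
static bool model_case_old_deadline(void)
{
	struct model_rq rq;

	model_init(&rq);
	model_start_request(&rq, 0, 100);	/* deadline 100 */
	model_complete_request(&rq);
	/* Reuse has not reached the flag update yet. */
	return model_check_expired(&rq, 150);	/* overdue, but COMPLETE set */
}

/* Control: a request that really is in flight past its deadline. */
static bool model_case_genuine_timeout(void)
{
	struct model_rq rq;

	model_init(&rq);
	model_start_request(&rq, 0, 100);
	return model_check_expired(&rq, 150);
}

/* Runs all three scenarios and checks the expected outcome of each. */
static void model_run_all(void)
{
	assert(!model_case_new_deadline());
	assert(!model_case_old_deadline());
	assert(model_case_genuine_timeout());
}
```

Calling model_run_all() from a test program exercises the two cases from the
comment plus a control case in which a genuinely overdue request does get
timed out. It does not reproduce the concurrent interleavings themselves,
only the state that each case leaves for the checker to observe.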
> please consider squashing this patch into patch 1/3.
So please do not consider that.
Thanks,
Ming Lei