Date:	Thu, 12 Nov 2015 14:40:35 +0100
From:	Jan Kara <jack@...e.cz>
To:	Jens Axboe <axboe@...nel.dk>
Cc:	Jeff Moyer <jmoyer@...hat.com>, Jan Kara <jack@...e.cz>,
	LKML <linux-kernel@...r.kernel.org>
Subject: Re: Flush requests not going through IO scheduler

On Tue 03-11-15 10:24:12, Jens Axboe wrote:
> On 11/03/2015 10:18 AM, Jeff Moyer wrote:
> >Jens Axboe <axboe@...nel.dk> writes:
> >
> >>>>Certainly, the current behavior is undoubtedly broken. The least
> >>>>intrusive fix would be to kick off scheduling when we add it to the
> >>>>request, but the elevator should handle it. Are you going to be up
> >>>>for hacking up a fix?
> >>>
> >>>I have some trouble understanding what do you mean exactly. Do you think we
> >>>should just call __blk_run_queue() after we add the request to
> >>>q->queue_head?
> >>
> >>No, that won't be enough, as it won't always break out of the idle
> >>logic. We need to ensure that the new request is noticed, so that CFQ
> >>knows and can decide to kick off things.
> >
> >Hmm?  __blk_run_queue calls the request_fn, which will call
> >blk_peek_request, which calls __elv_next_request, which will find the
> >request on queue_head.  Right?
> >
> >         while (1) {
> >                 if (!list_empty(&q->queue_head)) {
> >                         rq = list_entry_rq(q->queue_head.next);
> >                         return rq;
> 
> I guess that will bypass the schedule. Ugh, but that's pretty ugly,
> since cfq is still effectively idling. These flush requests really
> should go to an internal scheduler list for dispatch.
> 
> But as a quick fix, it might be enough to just kick off the queue
> with blk_run_queue().

So I was looking more into this and in the end tracked it down to mostly
being a blktrace issue. The first thing is: blk_queue_bio() does actually
kick the queue after the flush request is queued, but at that moment only
the request for the initial flush is queued, and that one is invisible to
blktrace, so the disk appears idle although it is not. After this request
completes, we queue & dispatch the request with data, which is visible in
blktrace. So in this case requests are dispatched as they should be. The
only question I cannot really answer is why the initial flush is not
visible in the block trace - at least the trace_block_rq_issue() tracepoint
and the corresponding completion should trigger and be visible... Does
anyone have an idea?

Also, blk_insert_flush() can add a request directly to q->queue_head when
no flushing is required. I've sent a patch to fix that so the request goes
through the IO scheduler, but it is mostly a non-issue since
generic_make_request_checks() usually removes the FLUSH and FUA flags when
they are not needed.

								Honza
-- 
Jan Kara <jack@...e.com>
SUSE Labs, CR
