Message-ID: <20151102122009.GE13433@quack.suse.cz>
Date:	Mon, 2 Nov 2015 13:20:09 +0100
From:	Jan Kara <jack@...e.cz>
To:	Jens Axboe <axboe@...nel.dk>
Cc:	LKML <linux-kernel@...r.kernel.org>, Jeff Moyer <jmoyer@...hat.com>
Subject: Flush requests not going through IO scheduler

Hello,

while looking into a performance issue, I noticed one interesting thing in
the blktrace data:

  8,0    2        0     1.745149746     0  m   N cfq320SN / dispatch_insert
  8,0    2        0     1.745150258     0  m   N cfq320SN / dispatched a request
  8,0    2        0     1.745150524     0  m   N cfq320SN / activate rq, drv=10
  8,0    2     2893     1.745150644 30477  D  WS 495331192 + 192 [git]
  8,0    1     3678     1.746851310     0  C  WS 495331192 + 192 [0]

Here we wrote the data for the transaction commit.

  8,0    1        0     1.746863220     0  m   N cfq320SN / complete rqnoidle 1
  8,0    1        0     1.746863801     0  m   N cfq320SN / set_slice=27
  8,0    1        0     1.746864439     0  m   N cfq320SN / arm_idle: 8 group_idle: 0

At this point there is no IO queued from the jbd2 thread, so CFQ arms the
idle timer and idles...

  8,0    1     3679     1.746878424   320  A FWFS 495331384 + 8 <- (8,2) 478543928
  8,0    1     3680     1.746879028   320  Q FWFS 495331384 + 8 [jbd2/sda2-8]
  8,0    1     3681     1.746879673   320  G FWFS 495331384 + 8 [jbd2/sda2-8]
  8,0    1     3682     1.746880227   320  I FWFS 495331384 + 8 [jbd2/sda2-8]

The jbd2 thread now queues the commit block.

  8,0    1        0     1.754263523     0  m   N cfq idle timer fired
  8,0    1        0     1.754264733     0  m   N cfq320SN / slice expired t=0

But it was not dispatched and we just idled until the idle timer fired.
Then we started dispatching for another queue and got to dispatching the
commit block only much later.
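To put a number on the stall, the timestamps in the trace can simply be
subtracted. The helpers below are only a sketch for doing that arithmetic;
struct trace_ts and both function names are made up for this example and
are not part of blktrace or the kernel.

```c
#include <assert.h>
#include <stdint.h>

/* A blkparse timestamp split into its seconds and nanoseconds parts
 * (hypothetical type, used only for this illustration). */
struct trace_ts {
	uint64_t sec;
	uint32_t nsec;
};

/* Convert a timestamp to nanoseconds since the start of the trace. */
static uint64_t ts_to_ns(struct trace_ts t)
{
	assert(t.nsec < 1000000000u);
	return t.sec * 1000000000ull + t.nsec;
}

/* Elapsed nanoseconds between two events; 'to' must not precede 'from'. */
static uint64_t ts_delta_ns(struct trace_ts from, struct trace_ts to)
{
	uint64_t a = ts_to_ns(from);
	uint64_t b = ts_to_ns(to);

	assert(b >= a);
	return b - a;
}
```

Feeding in the insert event (I) of the commit block at 1.746880227 and the
"idle timer fired" message at 1.754263523 gives 7383296 ns, i.e. the
request sat on the dispatch list for roughly 7.4 ms before the timer fired.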

I've looked into the block layer code and the reason for this behavior
(idling when there is in fact IO to dispatch) is the special handling of
flush requests. When a flush request is submitted, we insert it with
ELEVATOR_INSERT_FLUSH and blk_insert_flush() then handles it. That
eventually just ends up doing something along the lines of:

	list_add_tail(&rq->queuelist, &q->queue_head);

So we add the request to the list of requests to dispatch but we don't
notify the IO scheduler in any way. Thus the IO scheduler won't properly
track the request and, if I'm right, won't properly account IO time for it
either, etc...

Ideally we should call q->elevator->type->ops.elevator_add_req_fn() to
handle the request, but I'm not sure that wouldn't break some assumptions
of the flush code. At a minimum, though, shouldn't we try to dispatch the
request?
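
To make the suspected problem concrete, here is a deliberately simplified
toy model of the situation. It is not kernel code: struct toy_queue and all
toy_* functions are hypothetical, and the "kick" variant only illustrates
the "at least try to dispatch" idea, not an actual patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model only; none of these names exist in the block layer. */
struct toy_queue {
	size_t dispatch_len;	/* requests sitting on the dispatch list */
	size_t sched_queued;	/* requests the IO scheduler was told about */
	bool idling;		/* scheduler is waiting for its idle timer */
};

/* Normal path: the scheduler sees the request and stops idling. */
static void toy_insert_via_scheduler(struct toy_queue *q)
{
	q->sched_queued++;
	q->idling = false;
}

/* Flush path as described above: the request goes straight onto the
 * dispatch list and the scheduler is not notified. */
static void toy_insert_flush(struct toy_queue *q)
{
	q->dispatch_len++;
}

/* Hypothetical minimal fix: still bypass the scheduler's accounting,
 * but stop idling so the dispatch list gets processed. */
static void toy_insert_flush_and_kick(struct toy_queue *q)
{
	q->dispatch_len++;
	q->idling = false;
}

/* A request is ready to go but nothing happens until the timer fires. */
static bool toy_is_stalled(const struct toy_queue *q)
{
	return q->idling && q->dispatch_len > 0;
}
```

Starting from an idling queue, toy_insert_flush() leaves toy_is_stalled()
true, which corresponds to the trace above, while both other insert
functions leave it false.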

								Honza
-- 
Jan Kara <jack@...e.com>
SUSE Labs, CR