linux-kernel - Re: Read I/O starvation with writeback RAID controller

Message-ID: <1361393295.3667.21.camel@haakon2.linux-iscsi.org>
Date:	Wed, 20 Feb 2013 12:48:15 -0800
From:	"Nicholas A. Bellinger" <nab@...ux-iscsi.org>
To:	Martin Svec <martin.svec@...er.cz>
Cc:	target-devel <target-devel@...r.kernel.org>,
	linux-kernel@...r.kernel.org,
	linux-scsi <linux-scsi@...r.kernel.org>
Subject: Re: Read I/O starvation with writeback RAID controller

Hi Martin,

CC'ing linux-scsi here, as aacraid doesn't have an official maintainer
atm.

--nab

On Wed, 2013-02-20 at 16:38 +0100, Martin Svec wrote:
> Hello,
> 
> I've noticed read I/O starvation problems with the LIO iSCSI target
> when it is used on top of a writeback-enabled HW RAID controller (PERC
> H700 with 1GB cache). For intensive mixed read-write workloads in
> virtualized environments, writes are able to consume over 95% of the
> IOPS throughput and cause starvation of reads.
> 
> After a number of tests, it seems to me that this is a general issue
> of block layer I/O scheduling when running on top of a writeback
> device. If there is a write-intensive task, all writes go to the
> writeback cache with near-zero latency. This allows a writer to
> quickly saturate the device with thousands of writes while using only
> a minimal fraction of the queue depth. However, non-cached reads
> depend on spinning drive latencies, which are orders of magnitude
> higher than writeback cache latencies, so readers cannot submit as
> many requests per second as writers. Consequently, I guess the
> controller has a totally wrong view of the incoming workload pattern
> and tries to satisfy the write flood first; the net result is
> unacceptable starvation of reads, with latencies up to hundreds of
> milliseconds.
> 
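
The latency argument above can be sketched with Little's law for a
closed loop (throughput = outstanding requests / latency). This is only
an idealized model, not a measurement: the 0.1 ms cached-write latency
and 8 ms disk-read latency below are assumed values, and the model
ignores cache flushing, controller limits and any scheduler behaviour.

```python
def closed_loop_iops(iodepth, latency_s):
    """Little's law for a closed loop: IOPS = outstanding requests / latency."""
    if iodepth <= 0 or latency_s <= 0:
        raise ValueError("iodepth and latency must be positive")
    return iodepth / latency_s


def iops_split(write_depth, read_depth, write_latency_s, read_latency_s):
    """Return (write_share, read_share) of total IOPS under the idealized model."""
    write_iops = closed_loop_iops(write_depth, write_latency_s)
    read_iops = closed_loop_iops(read_depth, read_latency_s)
    total = write_iops + read_iops
    return write_iops / total, read_iops / total


if __name__ == "__main__":
    # ASSUMED latencies, chosen only for illustration:
    # 0.1 ms for a write absorbed by the cache, 8 ms for a random disk read.
    w, r = iops_split(32, 32, write_latency_s=1e-4, read_latency_s=8e-3)
    print(f"write share {w:.1%}, read share {r:.1%}")
```

With equal iodepths the depth cancels out, so in this model the split is
set purely by the latency ratio rather than by the 50:50 depth ratio.
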
> A simple fio test on a 1TiB block device, where one thread does 4k
> random sync writes with iodepth=32 and one thread does 4k random reads
> with iodepth=32, shows that instead of the theoretical 50:50 IOPS
> ratio, the block device runs with a 95:5 ratio in favor of writes. In
> fact, the imbalance is so high that even write iodepth=2 is enough to
> achieve the same numbers.
> 
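
A test along these lines could be described by a fio job file such as
the one generated below. This is a sketch, not the job file actually
used: the device name /dev/sdX, the libaio engine, direct=1, the
runtime, and the reading of "sync writes" as sync=1 (O_SYNC) are all
assumptions. Running it against a real device overwrites its contents.

```python
def build_fio_job(device, write_depth=32, read_depth=32, runtime_s=60):
    """Build a fio job file with one random writer and one random reader.

    All settings other than bs=4k and the iodepths are assumptions made
    for illustration; the two job sections run concurrently by default.
    """
    lines = [
        "[global]",
        f"filename={device}",   # hypothetical target block device
        "ioengine=libaio",      # assumed async engine so iodepth > 1 applies
        "direct=1",             # assumed: bypass the page cache
        "bs=4k",
        "time_based=1",
        f"runtime={runtime_s}",
        "",
        "[writer]",
        "rw=randwrite",
        "sync=1",               # assumed meaning of "sync writes" (O_SYNC)
        f"iodepth={write_depth}",
        "",
        "[reader]",
        "rw=randread",
        f"iodepth={read_depth}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # WARNING: the writer job destroys data on the target device.
    print(build_fio_job("/dev/sdX"), end="")
```

Saving the output to a file and passing that file to fio runs both jobs
together; setting write_depth=2 gives the low-iodepth variant mentioned
above.
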
> Real workloads that tend to exhibit this problem are: initial zeroing
> of a virtual machine disk, virtual machine migration, virtual machine
> cloning, intensive swapping of one virtual machine etc.
> 
> I tried setting WCE=1 on the target iblock device, played with queue
> depths, and tested all three I/O schedulers and their parameters as
> well as the controller's parameters, but with no luck. To achieve
> reasonably good fairness, the only solution is to set nr_requests to 1
> or disable the controller's writeback cache altogether -- at the
> expense of degraded overall performance :-(
> 
> Regarding nr_requests, there's an obvious relation between iodepths
> and read starvation: if (nr_requests >= workload iodepth) then
> starvation surely occurs. Lowering nr_requests below this threshold
> slowly starts improving fairness, and for every rd+wr iodepth pair
> there exists a sufficiently low nr_requests value at which the IOPS
> ratio is finally balanced according to the rd:wr iodepth ratio.
> Unfortunately, it means there is no minimal nr_requests value suitable
> for all workloads. For iodepths around 2 to 8, only nr_requests=1
> provides fair load balancing.
> 
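
For anyone repeating the nr_requests experiment, the value lives in
sysfs under /sys/block/<dev>/queue/nr_requests. The helper below is a
minimal sketch: the device name sdX is a placeholder, the sysfs_root
parameter exists only so the code can be tried against a scratch
directory, and writing to the real file needs root. The value is read
back after writing because the kernel may adjust what was requested.

```python
from pathlib import Path


def nr_requests_path(device, sysfs_root="/sys/block"):
    """Path of the per-queue nr_requests attribute for a block device."""
    return Path(sysfs_root) / device / "queue" / "nr_requests"


def get_nr_requests(device, sysfs_root="/sys/block"):
    """Read the current nr_requests value as an integer."""
    return int(nr_requests_path(device, sysfs_root).read_text().strip())


def set_nr_requests(device, value, sysfs_root="/sys/block"):
    """Write a new nr_requests value and return what is in effect afterwards."""
    if value < 1:
        raise ValueError("nr_requests must be at least 1")
    nr_requests_path(device, sysfs_root).write_text(f"{value}\n")
    # Read back: the kernel may not accept the requested value unchanged.
    return get_nr_requests(device, sysfs_root)


if __name__ == "__main__":
    # "sdX" is a placeholder device name; requires root on a real system.
    print(set_nr_requests("sdX", 1))
```
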
> Is this a known problem? Has anybody found block layer parameters that
> eliminate this problem for iscsi-target storage in mixed random
> read-write environments like virtualization? Or should I start writing
> my own I/O scheduler? ;-)
> 
> Update: I've just found https://lkml.org/lkml/2012/12/10/550 (Read
> starvation by sync writes), where Jan Kara describes identical
> symptoms. But setting nr_requests=10000 doesn't help in my case.
> CC'ing LKML too (I'm not an LKML subscriber).
> 
> Thanks,
> 
> Martin
> 
> --
> To unsubscribe from this list: send the line "unsubscribe target-devel" in
> the body of a message to majordomo@...r.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

