linux-kernel - Re: [PATCH V3 00/11] block-throttle: add .high limit

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20161004172852.GB73678@anikkar-mbp.local.dhcp.thefacebook.com>
Date:   Tue, 4 Oct 2016 10:28:53 -0700
From:   Shaohua Li <shli@...com>
To:     Paolo Valente <paolo.valente@...more.it>
CC:     Tejun Heo <tj@...nel.org>, Vivek Goyal <vgoyal@...hat.com>,
        <linux-block@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
        Jens Axboe <axboe@...com>, <Kernel-team@...com>,
        <jmoyer@...hat.com>, Mark Brown <broonie@...nel.org>,
        Linus Walleij <linus.walleij@...aro.org>,
        Ulf Hansson <ulf.hansson@...aro.org>
Subject: Re: [PATCH V3 00/11] block-throttle: add .high limit

On Tue, Oct 04, 2016 at 07:01:39PM +0200, Paolo Valente wrote:
> 
> > Il giorno 04 ott 2016, alle ore 18:27, Tejun Heo <tj@...nel.org> ha scritto:
> > 
> > Hello,
> > 
> > On Tue, Oct 04, 2016 at 06:22:28PM +0200, Paolo Valente wrote:
> >> Could you please elaborate more on this point?  BFQ uses sectors
> >> served to measure service, and, on the all the fast devices on which
> >> we have tested it, it accurately distributes
> >> bandwidth as desired, redistributes excess bandwidth with any issue,
> >> and guarantees high responsiveness and low latency at application and
> >> system level (e.g., ~0 drop rate in video playback, with any background
> >> workload tested).
> > 
> > The same argument as before.  Bandwidth is a very bad measure of IO
> > resources spent.  For specific use cases (like desktop or whatever),
> > this can work but not generally.
> > 
> 
> Actually, we have already discussed this point, and IMHO the arguments
> that (apparently) convinced you that bandwidth is the most relevant
> service guarantee for I/O in desktops and the like, prove that
> bandwidth is the most important service guarantee in servers too.
> 
> Again, all the examples I can think of seem to confirm it:
> . file hosting: a good service must guarantee reasonable read/write,
> i.e., download/upload, speeds to users
> . file streaming: a good service must guarantee low drop rates, and
> this can be guaranteed only by guaranteeing bandwidth and latency
> . web hosting: high bandwidth and low latency needed here too
> . clouds: high bw and low latency needed to let, e.g., users of VMs
> enjoy high responsiveness and, for example, reasonable file-copy
> time
> ...
> 
> To put in yet another way, with packet I/O in, e.g., clouds, there are
> basically the same issues, and the main goal is again guaranteeing
> bandwidth and low latency among nodes.
> 
> Could you please provide a concrete server example (assuming we still
> agree about desktops), where I/O bandwidth does not matter while time
> does?

I don't think IO bandwidth does not matter. The problem is bandwidth can't
measure IO cost. For example, you can't say 8k IO costs 2x IO resource than 4k
IO.

> >> Could you please suggest me some test to show how sector-based
> >> guarantees fails?
> > 
> > Well, mix 4k random and sequential workloads and try to distribute the
> > acteual IO resources.
> > 
> 
> 
> If I'm not mistaken, we have already gone through this example too,
> and I thought we agreed on what service scheme worked best, again
> focusing only on desktops.  To make a long story short(er), here is a
> snippet from one of our last exchanges.
> 
> ----------
> 
> On Sat, Apr 16, 2016 at 12:08:44AM +0200, Paolo Valente wrote:
> > Maybe the source of confusion is the fact that a simple sector-based,
> > proportional share scheduler always distributes total bandwidth
> > according to weights. The catch is the additional BFQ rule: random
> > workloads get only time isolation, and are charged for full budgets,
> > so as to not affect the schedule of quasi-sequential workloads. So,
> > the correct claim for BFQ is that it distributes total bandwidth
> > according to weights (only) when all competing workloads are
> > quasi-sequential. If some workloads are random, then these workloads
> > are just time scheduled. This does break proportional-share bandwidth
> > distribution with mixed workloads, but, much more importantly, saves
> > both total throughput and individual bandwidths of quasi-sequential
> > workloads.
> > 
> > We could then check whether I did succeed in tuning timeouts and
> > budgets so as to achieve the best tradeoffs. But this is probably a
> > second-order problem as of now.

I don't see why random/sequential matters for SSD. what really matters is
request size and IO depth. Time scheduling is skeptical too, as workloads can
dispatch all IO within almost 0 time in high queue depth disks.

Thanks,
Shaohua