Message-ID: <20200609085002.GB270404@T590>
Date:   Tue, 9 Jun 2020 16:50:02 +0800
From:   Ming Lei <ming.lei@...hat.com>
To:     Josh Snyder <joshs@...flix.com>
Cc:     Jens Axboe <axboe@...nel.dk>,
        Mikulas Patocka <mpatocka@...hat.com>,
        Mike Snitzer <snitzer@...hat.com>, linux-block@...r.kernel.org,
        linux-kernel@...r.kernel.org, Josh Snyder <josh@...e406.com>
Subject: Re: [RFC 2/2] Track io_ticks at microsecond granularity.
On Mon, Jun 08, 2020 at 09:07:24PM -0700, Josh Snyder wrote:
> Previously, we performed truncation of I/O issue/completion times during
> calculation of io_ticks, counting only I/Os which cross a jiffy
> boundary. The effect is a sampling of I/Os: at every boundary between
> jiffies we ask "is there an outstanding I/O" and increment a counter if
> the answer is yes. This produces results that are accurate (they don't
> systematically over- or under-count), but not precise (there is high
> variance associated with only taking 100 samples per second).
> 
> This change modifies the sampling rate from 100Hz to 976562.5Hz (1
> sample per 1024 nanoseconds). I chose this sampling rate by simulating a
> workload in which I/Os are issued randomly (by a Poisson process), and
> processed in constant time: an M/D/∞ system (Kendall's notation). My
> goal was to produce a sampled utilization fraction which was correct to
> one part-per-thousand given one second of samples.
> 
> The tradeoff of the higher sampling rate is increased synchronization
> overhead caused by more frequent compare-and-swap operations. The
> technique of commit 5b18b5a73760 ("block: delete part_round_stats and
> switch to less precise counting") is to allow multiple I/Os to complete
> while performing only one synchronized operation. As we are increasing
> the sample rate by a factor of 10000, we will less frequently be able to
> exercise the synchronization-free code path.
I'm not sure we need such a precise %util. A ~1M/sec sampling rate may cause
cmpxchg() to run up to 1M times/sec for each partition, and that overhead
might be observable.
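To illustrate the concern, here is a minimal user-space sketch of the
stamp-and-compare-and-swap pattern under discussion. It is not the kernel
code: the struct, the function names, and the use of a right shift to model
the tick size are all assumptions made for illustration. A shift of 23
(about 8.4 ms) stands in for a 10 ms jiffy, and a shift of 10 gives the
1024 ns tick from the patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-partition state; not a kernel struct. */
struct part_sketch {
	_Atomic uint64_t stamp;	/* last tick in which an I/O was observed */
	uint64_t io_ticks;	/* ticks counted as busy */
	uint64_t cas_ops;	/* compare-and-swap attempts performed */
};

/*
 * Called on every I/O issue/completion. If the current tick equals the
 * stored stamp we take the synchronization-free path; otherwise we try to
 * advance the stamp with a compare-and-swap and count one busy tick.
 * The plain counters are only safe here because the sketch is
 * single-threaded.
 */
static void update_io_ticks_sketch(struct part_sketch *p, uint64_t now_ns,
				   unsigned int shift)
{
	uint64_t now, stamp;

	assert(shift < 64);
	now = now_ns >> shift;
	stamp = atomic_load(&p->stamp);

	if (stamp != now) {
		p->cas_ops++;
		if (atomic_compare_exchange_strong(&p->stamp, &stamp, now))
			p->io_ticks++;
	}
}

/*
 * Feed n_ios events spaced gap_ns apart through the update path and
 * return how many compare-and-swap operations were needed.
 */
static uint64_t count_cas_sketch(unsigned int shift, uint64_t n_ios,
				 uint64_t gap_ns)
{
	struct part_sketch p = { 0 };
	uint64_t i;

	for (i = 1; i <= n_ios; i++)
		update_io_ticks_sketch(&p, i * gap_ns, shift);

	return p.cas_ops;
}
```

With one event every 1000 ns for one second, count_cas_sketch(23, 1000000,
1000) needs 119 compare-and-swap operations, while count_cas_sketch(10,
1000000, 1000) needs 976562, i.e. nearly one per event.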
Thanks,
Ming