lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <86f0af00-8765-4481-9245-1819fb2c6379@acm.org>
Date: Mon, 4 Mar 2024 16:20:03 -0800
From: Bart Van Assche <bvanassche@....org>
To: Christian Loehle <christian.loehle@....com>, linux-kernel@...r.kernel.org
Cc: peterz@...radead.org, juri.lelli@...hat.com, mingo@...hat.com,
 rafael@...nel.org, dietmar.eggemann@....com, vschneid@...hat.com,
 vincent.guittot@...aro.org, Johannes.Thumshirn@....com,
 adrian.hunter@...el.com, ulf.hansson@...aro.org, andres@...razel.de,
 asml.silence@...il.com, linux-pm@...r.kernel.org,
 linux-block@...r.kernel.org, io-uring@...r.kernel.org,
 Qais Yousef <qyousef@...alina.io>
Subject: Re: [RFC PATCH 0/2] Introduce per-task io utilization boost

On 3/4/24 12:16, Christian Loehle wrote:
> Pixel 6 ufs Android 14 (7 runs for because device showed some variance)
> [6605, 6622, 6633, 6652, 6690, 6697, 6754] sugov mainline
> [7141, 7173, 7198, 7220, 7280, 7427, 7452] per-task tracking
> [2390, 2392, 2406, 2437, 2464, 2487, 2813] sugov no iowait boost
> [7812, 7837, 7837, 7851, 7900, 7959, 7980] performance governor

Variance of performance results for Pixel devices can be reduced greatly
by disabling devfreq scaling, e.g. as follows (this may cause thermal
issues if the system load is high enough):

      for d in $(adb shell echo /sys/class/devfreq/*); do
	adb shell "cat $d/available_frequencies |
		tr ' ' '\n' |
		sort -n |
		case $devfreq in
			min) head -n1;;
			max) tail -n1;;
		esac > $d/min_freq"
     done

> Showcasing some different IO scenarios, again all random read,
> median out of 5 runs, all on rk3399 with NVMe.
> e.g. io_uring6x4 means 6 threads with 4 iodepth each, results can be
> obtained using:
> fio --minimal --time_based --name=test --filename=/dev/nvme0n1 --runtime=30 --rw=randread --bs=4k --ioengine=io_uring --iodepth=4 --numjobs=6 --group_reporting | cut -d \; -f 8

So buffered I/O was used during this test? Shouldn't direct I/O be used
for this kind of tests (--buffered=0)? Additionally, which I/O scheduler
was configured? I recommend --ioscheduler=none for this kind of tests.

> - Higher cap is not always beneficial, we might place the task away
> from the CPU where the interrupt handler is running, making it run
> on an unboosted CPU which may have a bigger impact than the difference
> between the CPU's capacity the task moved to. (Of course the boost will
> then be reverted again, but a ping-pong every interval is possible).

In the above I see "the interrupt handler". Does this mean that the NVMe
controller in the test setup only supports one completion interrupt for
all completion queues instead of one completion interrupt per completion
queue? There are already Android phones and developer boards available
that support the latter, namely the boards equipped with a UFSHCI 4.0 
controller.

Thanks,

Bart.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ