Message-ID: <Z4efKYwbf2QYBx40@ryzen>
Date: Wed, 15 Jan 2025 12:42:33 +0100
From: Niklas Cassel <cassel@...nel.org>
To: Oliver Sang <oliver.sang@...el.com>
Cc: Christoph Hellwig <hch@....de>, oe-lkp@...ts.linux.dev, lkp@...el.com,
	linux-kernel@...r.kernel.org, Jens Axboe <axboe@...nel.dk>,
	linux-block@...r.kernel.org, virtualization@...ts.linux.dev,
	linux-nvme@...ts.infradead.org, Damien Le Moal <dlemoal@...nel.org>,
	linux-btrfs@...r.kernel.org, linux-aio@...ck.org
Subject: Re: [linus:master] [block]  e70c301fae: stress-ng.aiol.ops_per_sec
 49.6% regression

Hello Oliver,

On Fri, Jan 10, 2025 at 02:53:08PM +0800, Oliver Sang wrote:
> On Wed, Jan 08, 2025 at 11:39:28AM +0100, Niklas Cassel wrote:
> > > > Oliver, which I/O scheduler are you using?
> > > > $ cat /sys/block/sdb/queue/scheduler 
> > > > none mq-deadline kyber [bfq]
> > > 
> > > while our test running:
> > > 
> > > # cat /sys/block/sdb/queue/scheduler
> > > none [mq-deadline] kyber bfq
> > 
> > The stddev numbers you showed are all over the place, so are we certain
> > that this is a regression caused by commit e70c301faece ("block:
> > don't reorder requests in blk_add_rq_to_plug")?
> > 
> > Do you know if the stddev has such big variation for this test even before
> > the commit?
> 
> in order to address your concern, we rebuilt kernels for e70c301fae and its
> parent a3396b9999, as well as for v6.12-rc4. the config is still the same as
> shared in our original report:
> https://download.01.org/0day-ci/archive/20241212/202412122112.ca47bcec-lkp@intel.com/config-6.12.0-rc4-00120-ge70c301faece
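
(For anyone wanting to reproduce this locally, rebuilding the two kernels
side by side should look roughly like the sketch below; the commit ids and
the config name come from this thread, everything else is just a generic
build recipe and not the exact lkp-tests setup:

  $ git checkout e70c301faece
  $ cp config-6.12.0-rc4-00120-ge70c301faece .config
  $ make olddefconfig && make -j$(nproc)

  $ git checkout a3396b9999
  $ make olddefconfig && make -j$(nproc)

The first build is the commit under test, the second is its parent used as
the baseline, both with the same config.)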

Thank you for putting in the work to do some extra tests.

(Doing performance regression testing is really important IMO,
as without it you are essentially flying blind.
Thank you for taking on this important work!)


Looking at the extended number of iterations that you've provided in this
email, it is quite clear that e70c301faece, at least with the stress-ng +
mq-deadline workload, introduced a regression:

       v6.12-rc4 a3396b99990d8b4e5797e7b16fd e70c301faece15b618e54b613b1
---------------- --------------------------- ---------------------------
         %stddev     %change         %stddev     %change         %stddev
             \          |                \          |                \
    187.64 ±  5%      -0.6%     186.48 ±  7%     -47.6%      98.29 ± 17%  stress-ng.aiol.ops_per_sec
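
(If someone wants to try to reproduce this outside of lkp-tests, I assume
something along these lines should show a similar pattern. The device name,
the number of aiol workers and the runtime are only placeholders, not the
exact lkp-tests job parameters:

  # echo mq-deadline > /sys/block/sdb/queue/scheduler
  # stress-ng --aiol 8 --timeout 60 --metrics-brief

With --metrics-brief, stress-ng prints bogo ops/s, which should correspond
to the ops_per_sec number above.)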




Looking at your results from stress-ng + none scheduler:

         %stddev     %change         %stddev     %change         %stddev
             \          |                \          |                \
    114.62 ± 19%      -1.9%     112.49 ± 17%     -32.4%      77.47 ± 21%  stress-ng.aiol.ops_per_sec


This shows a smaller change, -32% rather than -47%, but it also seems to
suggest a regression for the stress-ng workload.




Looking closer at the raw numbers for stress-ng + none scheduler in your
other email, it seems clear that the raw values from the stress-ng workload
can vary quite a lot. In the long run, I wonder if we can find a workload
that has less variation, e.g. a fio test for IOPS and a fio test for
throughput. But perhaps such workloads are already part of lkp-tests?
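
(Something like the following is what I have in mind; the device, block
sizes, queue depths and runtimes are of course only examples:

  $ fio --name=iops --filename=/dev/sdb --direct=1 --ioengine=libaio \
        --rw=randread --bs=4k --iodepth=32 --runtime=60 --time_based \
        --group_reporting
  $ fio --name=bw --filename=/dev/sdb --direct=1 --ioengine=libaio \
        --rw=read --bs=1M --iodepth=8 --runtime=60 --time_based \
        --group_reporting

The first job measures IOPS with small random reads, the second measures
throughput with large sequential reads; both should show less run-to-run
variation than the stress-ng aiol workload.)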


Kind regards,
Niklas
