Message-ID: <aG98fj7phkM1PojW@xsang-OptiPlex-9020>
Date: Thu, 10 Jul 2025 16:40:30 +0800
From: Oliver Sang <oliver.sang@...el.com>
To: Nilay Shroff <nilay@...ux.ibm.com>
CC: <oe-lkp@...ts.linux.dev>, <lkp@...el.com>, <linux-kernel@...r.kernel.org>,
	Jens Axboe <axboe@...nel.dk>, Christoph Hellwig <hch@....de>, Hannes Reinecke
	<hare@...e.de>, Ming Lei <ming.lei@...hat.com>, <cgroups@...r.kernel.org>,
	<linux-block@...r.kernel.org>, <oliver.sang@...el.com>
Subject: Re: [linus:master] [block] 245618f8e4: stress-ng.fpunch.fail

hi, Nilay,

really sorry for the long delay. we were blocked by other issues for a long time.

for this report, the test machine has since been redeployed for other usages, so I tried
the same stress-ng fpunch test on another Ice Lake server, but cannot reproduce the
issue there, either on 245618f8e4 or on the latest mainline.

it seems the previous test machine had some problem. sorry for the trouble caused by our env.
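for reference, the rerun used the same stress-ng command line suggested earlier in this thread; the sketch below just wraps it with log capture (the log filenames are illustrative, and the script skips gracefully on machines without stress-ng installed):

```shell
#!/bin/sh
# Rerun the fpunch test from this thread and keep the output for
# later comparison. The stress-ng command line is the one from the
# thread; the log filenames here are illustrative only.

if command -v stress-ng >/dev/null 2>&1; then
    stress-ng --timeout 60 --times --verify --metrics --no-rand-seed \
              --fpunch 128 --iostat 1 2>&1 | tee stress-ng-fpunch.log
    # Capture the kernel log as well (may need root; ignore failure).
    dmesg > dmesg-fpunch.log 2>/dev/null || true
else
    echo "stress-ng not installed; skipping rerun"
fi
```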


On Fri, May 30, 2025 at 07:31:28PM +0530, Nilay Shroff wrote:
> 
> 
> On 5/29/25 7:22 AM, Oliver Sang wrote:
> > hi, Nilay,
> > 
> > sorry for late.
> No worries... 
> 
> [...]
> >>>
> >>> The kernel config and materials to reproduce are available at:
> >>> https://download.01.org/0day-ci/archive/20250522/202505221030.760980df-lkp@intel.com
> >>>
> >>
> >> I tried reproducing this issue but couldn't recreate it. Is it possible
> >> for you to run this test on your setup with the stress-ng option "--iostat 1",
> >> as shown below?
> >>
> >> # stress-ng --timeout 60 --times --verify --metrics --no-rand-seed --fpunch 128 --iostat 1
> >>
> >> If you can run the test with the above option, please collect the logs and
> >> share them. That might help to debug this further.
> > 
> > the log is attached as stress-ng-245618f8e4.
> > also attached the dmesg-245618f8e4.xz.
> > 
> > another log from parent is attached as stress-ng-3efe7571c3.
> > 
> Thanks for trying out the --iostat option and sharing the logs. I looked through the logs,
> and it seems (my guess) that in the failing cases (i.e. where bogo ops are reported as 0)
> the disk read operations are either blocked or never complete. It would be useful to debug
> this further; unfortunately, I tried hard but failed to recreate the issue on my setup, so
> I need your help.
> 
> I have a few follow-up questions:
> 1. Are you able to recreate this issue even on a recent upstream kernel?
> 2. Did you try formatting the disk using ext4 instead of xfs?
> 
> Anyway, is it possible to rerun the test with the following options to analyze it further?
> # stress-ng --timeout 60 --times --metrics --verify --no-rand-seed --fpunch 128 --verbose --klog-check --stressor-time --status 1
> 
> The above options should help generate verbose output and also log why the stressors are not
> exiting after the 60-second timeout. Moreover, it would be helpful if you could also repeat
> the test with "--fpunch 1", just to see whether limiting it to a single stressor still
> recreates the issue.
> 
> Thanks,
> --Nilay
> 
> 
