Message-ID: <aG98fj7phkM1PojW@xsang-OptiPlex-9020>
Date: Thu, 10 Jul 2025 16:40:30 +0800
From: Oliver Sang <oliver.sang@...el.com>
To: Nilay Shroff <nilay@...ux.ibm.com>
CC: <oe-lkp@...ts.linux.dev>, <lkp@...el.com>, <linux-kernel@...r.kernel.org>,
Jens Axboe <axboe@...nel.dk>, Christoph Hellwig <hch@....de>, Hannes Reinecke
<hare@...e.de>, Ming Lei <ming.lei@...hat.com>, <cgroups@...r.kernel.org>,
<linux-block@...r.kernel.org>, <oliver.sang@...el.com>
Subject: Re: [linus:master] [block] 245618f8e4: stress-ng.fpunch.fail

Hi Nilay,

Really sorry for the long delay; we were blocked by other issues for a long time.

For this report, the original test machine has been redeployed for other uses. I ran
the same stress-ng fpunch test on another Ice Lake server and could not reproduce
the issue, either on 245618f8e4 or on the latest mainline.

It seems the previous test machine had some problem. Sorry for the noise caused by
our environment.

On Fri, May 30, 2025 at 07:31:28PM +0530, Nilay Shroff wrote:
>
>
> On 5/29/25 7:22 AM, Oliver Sang wrote:
> > hi, Nilay,
> >
> > sorry for late.
> No worries...
>
> [...]
> >>>
> >>> The kernel config and materials to reproduce are available at:
> >>> https://download.01.org/0day-ci/archive/20250522/202505221030.760980df-lkp@intel.com
> >>>
> >>
> >> I tried reproducing this issue but I couldn't recreate it. Is it possible
> >> for you to run this test on your setup using stress-ng option "--iostat 1"
> >> as shown below ?
> >>
> >> # stress-ng --timeout 60 --times --verify --metrics --no-rand-seed --fpunch 128 --iostat 1
> >>
> >> If you can run test with above option then please collect logs and share it.
> >> That might help to further debug this.
> >
> > the log is attached as stress-ng-245618f8e4.
> > also attached the dmesg-245618f8e4.xz.
> >
> > another log from parent is attached as stress-ng-3efe7571c3.
> >
> Thanks for trying out --iostat option and sharing logs. I looked through logs and it seems
> that (my guess) in case of failures (i.e. bogo ops reported as 0) disk read operations are
> either blocked or never completed. However it might be useful to further debug this.
> Unfortunately, I tried hard but failed to recreate on my setup, so need your help.
>
> I have few follow up questions:
> 1. Are you able to recreate this issue even on the recent upstream kernel?
> 2. Did you try formatting the disk using ext4 instead of xfs?
>
> Anyways, is it possible to rerun test with following options to further analyze it?
> # stress-ng --timeout 60 --times --metrics --verify --no-rand-seed --fpunch 128 --verbose --klog-check --stressor-time --status 1
>
> Above options shall help generate verbose output as well as log why stressors are not exiting
> after timeout of 60 seconds. Moreover, it'd be helpful if you can also repeat the test specifying
> "--fpunch 1". Just wanted to see whether limiting stressors to only 1 recreate the issue.
>
> Thanks,
> --Nilay
>
>