Message-ID: <f33af0f8-6d7b-479c-9d57-e5fd485d0f6e@linux.ibm.com>
Date: Fri, 30 May 2025 19:31:28 +0530
From: Nilay Shroff <nilay@...ux.ibm.com>
To: Oliver Sang <oliver.sang@...el.com>
Cc: oe-lkp@...ts.linux.dev, lkp@...el.com, linux-kernel@...r.kernel.org,
Jens Axboe <axboe@...nel.dk>, Christoph Hellwig <hch@....de>,
Hannes Reinecke <hare@...e.de>, Ming Lei <ming.lei@...hat.com>,
cgroups@...r.kernel.org, linux-block@...r.kernel.org
Subject: Re: [linus:master] [block] 245618f8e4: stress-ng.fpunch.fail
On 5/29/25 7:22 AM, Oliver Sang wrote:
> hi, Nilay,
>
> sorry for late.
No worries...
[...]
>>>
>>> The kernel config and materials to reproduce are available at:
>>> https://download.01.org/0day-ci/archive/20250522/202505221030.760980df-lkp@intel.com
>>>
>>
>> I tried reproducing this issue but I couldn't recreate it. Is it possible
>> for you to run this test on your setup using stress-ng option "--iostat 1"
>> as shown below ?
>>
>> # stress-ng --timeout 60 --times --verify --metrics --no-rand-seed --fpunch 128 --iostat 1
>>
>> If you can run test with above option then please collect logs and share it.
>> That might help to further debug this.
>
> the log is attached as stress-ng-245618f8e4.
> also attached the dmesg-245618f8e4.xz.
>
> another log from parent is attached as stress-ng-3efe7571c3.
>
Thanks for trying out the --iostat option and sharing the logs. I looked through them, and
my guess is that in the failing cases (i.e. bogo ops reported as 0) disk read operations are
either blocked or never complete. However, this needs further debugging to confirm.
Unfortunately, I tried hard but failed to recreate the issue on my setup, so I need your help.
I have a few follow-up questions:
1. Are you able to recreate this issue even on the recent upstream kernel?
2. Did you try formatting the disk using ext4 instead of xfs?
In any case, is it possible to rerun the test with the following options so we can analyze it further?
# stress-ng --timeout 60 --times --metrics --verify --no-rand-seed --fpunch 128 --verbose --klog-check --stressor-time --status 1
The above options should generate verbose output and also log why the stressors are not exiting
after the 60-second timeout. Moreover, it'd be helpful if you could also repeat the test specifying
"--fpunch 1". I'd like to see whether the issue still recreates when limited to a single stressor.
Thanks,
--Nilay
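
The two runs requested above (128 stressors, then a single stressor) could be scripted
roughly as follows. This is only a sketch: the helper names fpunch_cmd and run_fpunch_tests,
the DRY_RUN variable, and the log file names are made up for illustration; the stress-ng
options are exactly the ones from the command above.

```shell
#!/bin/sh
# Hypothetical helper: print the stress-ng command line for N fpunch stressors.
# The options are copied from the command quoted in the mail; only N varies.
fpunch_cmd() {
    printf 'stress-ng --timeout 60 --times --metrics --verify --no-rand-seed --fpunch %s --verbose --klog-check --stressor-time --status 1' "$1"
}

# Hypothetical driver: run once with 128 stressors and once with 1,
# writing each run's output to its own log file (assumed names).
# With DRY_RUN=1 the commands are only printed, not executed.
run_fpunch_tests() {
    for n in 128 1; do
        cmd=$(fpunch_cmd "$n")
        if [ "${DRY_RUN:-0}" = 1 ]; then
            echo "$cmd"
        else
            # Word splitting of $cmd is intentional: it holds plain options only.
            $cmd > "stress-ng-fpunch-$n.log" 2>&1
        fi
    done
}
```

After sourcing the file, "DRY_RUN=1 run_fpunch_tests" prints the two command lines for review,
and a plain "run_fpunch_tests" runs them and leaves one log per stressor count.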