Message-ID: <95753732-9714-42e0-8097-e2b4c3dd5820@linux.ibm.com>
Date: Thu, 22 May 2025 18:07:21 +0530
From: Nilay Shroff <nilay@...ux.ibm.com>
To: kernel test robot <oliver.sang@...el.com>
Cc: oe-lkp@...ts.linux.dev, lkp@...el.com, linux-kernel@...r.kernel.org,
        Jens Axboe <axboe@...nel.dk>, Christoph Hellwig <hch@....de>,
        Hannes Reinecke <hare@...e.de>, Ming Lei <ming.lei@...hat.com>,
        cgroups@...r.kernel.org, linux-block@...r.kernel.org
Subject: Re: [linus:master] [block] 245618f8e4: stress-ng.fpunch.fail



On 5/22/25 7:59 AM, kernel test robot wrote:
> 
> 
> Hello,
> 
> 
> we don't have enough knowledge if this is a kernel issue or test case issue.
> 
> =========================================================================================
> tbox_group/testcase/rootfs/kconfig/compiler/nr_threads/disk/testtime/fs/test/cpufreq_governor:
>   lkp-icl-2sp4/stress-ng/debian-12-x86_64-20240206.cgz/x86_64-rhel-9.4/gcc-12/100%/1HDD/60s/xfs/fpunch/performance
> 
> 3efe7571c3ae2b64 245618f8e45ff4f79327627b474
> ---------------- ---------------------------
>        fail:runs  %reproduction    fail:runs
>            |             |             |
>            :6          100%           6:6     stress-ng.fpunch.fail
> 
> since the failure is persistent, just report what we observed in our tests FYI.
> 
> 
> kernel test robot noticed "stress-ng.fpunch.fail" on:
> 
> commit: 245618f8e45ff4f79327627b474b563da71c2c75 ("block: protect wbt_lat_usec using q->elevator_lock")
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
> 
> [test failed on linus/master      b36ddb9210e6812eb1c86ad46b66cc46aa193487]
> [test failed on linux-next/master 8566fc3b96539e3235909d6bdda198e1282beaed]
> [test failed on fix commit        9730763f4756e32520cb86778331465e8d063a8f]
> 
> in testcase: stress-ng
> version: stress-ng-x86_64-1c71921fd-1_20250212
> with following parameters:
> 
> 	nr_threads: 100%
> 	disk: 1HDD
> 	testtime: 60s
> 	fs: xfs
> 	test: fpunch
> 	cpufreq_governor: performance
> 
> 
> 
> config: x86_64-rhel-9.4
> compiler: gcc-12
> test machine: 128 threads 2 sockets Intel(R) Xeon(R) Platinum 8358 CPU @ 2.60GHz (Ice Lake) with 128G memory
> 
> (please refer to attached dmesg/kmsg for entire log/backtrace)
> 
> 
> If you fix the issue in a separate patch/commit (i.e. not just a new version of
> the same patch/commit), kindly add following tags
> | Reported-by: kernel test robot <oliver.sang@...el.com>
> | Closes: https://lore.kernel.org/oe-lkp/202505221030.760980df-lkp@intel.com
> 
> 2025-03-20 08:33:52 mkdir -p /mnt/stress-ng
> 2025-03-20 08:33:52 mount /dev/sdc1 /mnt/stress-ng
> 2025-03-20 08:33:52 cd /mnt/stress-ng
>   File: "/mnt/stress-ng"
>     ID: 82100000000 Namelen: 255     Type: xfs
> Block size: 4096       Fundamental block size: 4096
> Blocks: Total: 78604800   Free: 78518242   Available: 78518242
> Inodes: Total: 157286400  Free: 157286397
> 2025-03-20 08:33:52 stress-ng --timeout 60 --times --verify --metrics --no-rand-seed --fpunch 128
> stress-ng: info:  [4680] setting to a 1 min run per stressor
> stress-ng: info:  [4680] dispatching hogs: 128 fpunch
> stress-ng: info:  [4680] note: /proc/sys/kernel/sched_autogroup_enabled is 1 and this can impact scheduling throughput for processes not attached to a tty. Setting this to 0 may improve performance metrics
> stress-ng: warn:  [4680] metrics-check: all bogo-op counters are zero, data may be incorrect
> stress-ng: metrc: [4680] stressor       bogo ops real time  usr time  sys time   bogo ops/s     bogo ops/s CPU used per       RSS Max
> stress-ng: metrc: [4680]                           (secs)    (secs)    (secs)   (real time) (usr+sys time) instance (%)          (KB)
> stress-ng: metrc: [4680] fpunch                0    557.92      0.40     19.56         0.00           0.00         0.03          3180
> stress-ng: metrc: [4680] miscellaneous metrics:
> stress-ng: metrc: [4680] fpunch              2049.12 extents per file (geometric mean of 128 instances)
> stress-ng: info:  [4680] for a 620.45s run time:
> stress-ng: info:  [4680]   79418.05s available CPU time
> stress-ng: info:  [4680]       0.40s user time   (  0.00%)
> stress-ng: info:  [4680]      19.59s system time (  0.02%)
> stress-ng: info:  [4680]      19.99s total time  (  0.03%)
> stress-ng: info:  [4680] load average: 250.69 349.62 213.80
> stress-ng: info:  [4680] skipped: 0
> stress-ng: info:  [4680] passed: 128: fpunch (128)
> stress-ng: info:  [4680] failed: 0
> stress-ng: info:  [4680] metrics untrustworthy: 0
> stress-ng: info:  [4680] successful run completed in 10 mins, 20.45 secs
> 
> 
> we don't observe any abnormal output in dmesg. below is an example from parent
> commit.
> 
> 2025-03-20 09:12:39 mkdir -p /mnt/stress-ng
> 2025-03-20 09:12:39 mount /dev/sdc1 /mnt/stress-ng
> 2025-03-20 09:12:39 cd /mnt/stress-ng
>   File: "/mnt/stress-ng"
>     ID: 82100000000 Namelen: 255     Type: xfs
> Block size: 4096       Fundamental block size: 4096
> Blocks: Total: 78604800   Free: 78518242   Available: 78518242
> Inodes: Total: 157286400  Free: 157286397
> 2025-03-20 09:12:39 stress-ng --timeout 60 --times --verify --metrics --no-rand-seed --fpunch 128
> stress-ng: info:  [4689] setting to a 1 min run per stressor
> stress-ng: info:  [4689] dispatching hogs: 128 fpunch
> stress-ng: info:  [4689] note: /proc/sys/kernel/sched_autogroup_enabled is 1 and this can impact scheduling throughput for processes not attached to a tty. Setting this to 0 may improve performance metrics
> stress-ng: metrc: [4689] stressor       bogo ops real time  usr time  sys time   bogo ops/s     bogo ops/s CPU used per       RSS Max
> stress-ng: metrc: [4689]                           (secs)    (secs)    (secs)   (real time) (usr+sys time) instance (%)          (KB)
> stress-ng: metrc: [4689] fpunch             1166     60.31      0.11     34.66        19.33          33.54         0.45          3164
> stress-ng: metrc: [4689] miscellaneous metrics:
> stress-ng: metrc: [4689] fpunch              2051.97 extents per file (geometric mean of 128 instances)
> stress-ng: info:  [4689] for a 60.91s run time:
> stress-ng: info:  [4689]    7796.93s available CPU time
> stress-ng: info:  [4689]       0.11s user time   (  0.00%)
> stress-ng: info:  [4689]      34.68s system time (  0.44%)
> stress-ng: info:  [4689]      34.79s total time  (  0.45%)
> stress-ng: info:  [4689] load average: 325.78 93.83 32.28
> stress-ng: info:  [4689] skipped: 0
> stress-ng: info:  [4689] passed: 128: fpunch (128)
> stress-ng: info:  [4689] failed: 0
> stress-ng: info:  [4689] metrics untrustworthy: 0
> stress-ng: info:  [4689] successful run completed in 1 min
> 
> 
> from above, parent can finish run in 1 min, then has "bogo ops" and "bogo ops/s"
> 
> for 245618f8e4, the test seems run much longer, and the results for "bogo ops"
> and "bogo ops/s" are all 0.
> 
> 
> 
> The kernel config and materials to reproduce are available at:
> https://download.01.org/0day-ci/archive/20250522/202505221030.760980df-lkp@intel.com
> 

I tried to reproduce this issue but could not. Could you run this test
on your setup with the stress-ng option "--iostat 1", as shown below?

# stress-ng --timeout 60 --times --verify --metrics --no-rand-seed --fpunch 128 --iostat 1

If you can run the test with the above option, please collect the logs and
share them. That might help debug this further.
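
When comparing logs from the two kernels, the "metrc" summary lines could
also be checked mechanically. The following is only a minimal Python sketch,
not part of stress-ng or LKP. It assumes the column layout seen in the logs
quoted above (stressor, bogo ops, real time, usr time, sys time, two
bogo ops/s columns, CPU used %, RSS Max). The names parse_metrics and
looks_stalled are hypothetical, and the "more than twice the timeout"
threshold is an arbitrary assumption.

```python
import re

# Matches the payload of a stress-ng metrics line, with or without a
# leading "> " quote prefix, e.g.
#   stress-ng: metrc: [4680] fpunch  0  557.92  0.40 ...
METRC = re.compile(r"stress-ng: metrc: \[\d+\]\s+(.*)$")


def parse_metrics(log_text):
    """Return {stressor: {...}} for each per-stressor summary row.

    Assumes nine whitespace-separated fields per row, as in the quoted
    logs. Header rows and "miscellaneous metrics" rows are skipped
    because they do not have exactly eight numeric columns after the name.
    """
    results = {}
    for line in log_text.splitlines():
        match = METRC.search(line)
        if not match:
            continue
        fields = match.group(1).split()
        if len(fields) != 9:
            continue
        try:
            values = [float(f) for f in fields[1:]]
        except ValueError:
            continue  # header or free-form text, not a data row
        results[fields[0]] = {
            "bogo_ops": int(values[0]),
            "real_secs": values[1],
            "sys_secs": values[3],
            "bogo_ops_per_sec": values[4],
        }
    return results


def looks_stalled(entry, timeout_secs):
    """Heuristic (an assumption): no bogo ops, or run far past the timeout."""
    return entry["bogo_ops"] == 0 or entry["real_secs"] > 2 * timeout_secs


# Rows copied from the report quoted above.
BAD_RUN = """\
> stress-ng: metrc: [4680] stressor       bogo ops real time  usr time  sys time   bogo ops/s     bogo ops/s CPU used per       RSS Max
> stress-ng: metrc: [4680] fpunch                0    557.92      0.40     19.56         0.00           0.00         0.03          3180
> stress-ng: metrc: [4680] fpunch              2049.12 extents per file (geometric mean of 128 instances)
"""
GOOD_RUN = """\
> stress-ng: metrc: [4689] fpunch             1166     60.31      0.11     34.66        19.33          33.54         0.45          3164
"""

if __name__ == "__main__":
    for name, text in (("245618f8e4", BAD_RUN), ("parent", GOOD_RUN)):
        for stressor, entry in parse_metrics(text).items():
            print(name, stressor, entry, "stalled:", looks_stalled(entry, 60))
```

Feeding it the full output of each run (with or without "--iostat 1") should
make it easy to see whether the fpunch stressor made any progress at all.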

Thanks,
--Nilay

