Message-ID: <20230201144314.ukbaxy7vbgftaebr@revolver>
Date:   Wed, 1 Feb 2023 09:43:14 -0500
From:   "Liam R. Howlett" <Liam.Howlett@...cle.com>
To:     "Paul E. McKenney" <paulmck@...nel.org>
Cc:     Yujie Liu <yujie.liu@...el.com>, oe-lkp@...ts.linux.dev,
        lkp@...el.com, linux-kernel@...r.kernel.org,
        Andrew Morton <akpm@...ux-foundation.org>, linux-mm@...ck.org,
        linux-fsdevel@...r.kernel.org
Subject: Re: [linus:master] [maple_tree] 120b116208:
 INFO:task_blocked_for_more_than#seconds

...

> > > > > > 
> > > > > > FYI, we noticed INFO:task_blocked_for_more_than#seconds due to commit (built with clang-14):
> > > > > > 
> > > > > > commit: 120b116208a0877227fc82e3f0df81e7a3ed4ab1 ("maple_tree: reorganize testing to restore module testing")
> > > > > > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
> > > > > > 
> > > > > > in testcase: boot
> > > > > > 
> > > > > > on test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G
> > > > > > 
...

> > > > > > If you fix the issue, kindly add the following tags
> > > > > > | Reported-by: kernel test robot <yujie.liu@...el.com>
> > > > > > | Link: https://lore.kernel.org/oe-lkp/202301310940.4a37c7af-yujie.liu@intel.com
> > > > > 
> > > > > Liam brought this to my attention on IRC, and it looks like the root
> > > > > cause is that the rcuscale code does not deal gracefully with grace
> > > > > > periods that are well in excess of a second in duration.
> > > > > 
> > > > > Now, it might well be worth looking into why the grace periods were taking
> > > > > that long, but if you were running Maple Tree stress tests concurrently
> > > > > with rcuscale, this might well be expected behavior.
> > > > > 
> > > > 
> > > > This could simply be CPU starvation causing no forward progress in your
> > > > tests, given the number of concurrently running tests and "-smp 2".
> > > > 
> > > > It's also worth noting that building in the rcu test module makes the
> > > > machine turn off once the test is complete.  This can be seen in your
> > > > console message:
> > > > [   13.254240][    T1] rcu-scale:--- Start of test: nreaders=2 nwriters=2 verbose=1 shutdown=1
> > > > 
> > > > so your machine may not have finished running through the array of tests
> > > > you have specified to build in - which is a lot.  I'm not sure if this
> > > > is the best approach considering the load that produces on the system
> > > > and how difficult it is (was) to figure out which test is causing a
> > > > stall, or other issue.
> > > 
> > > Agreed, both rcuscale and refscale when built in turn the machine off at
> > > the end of the test.  For providing background stress for some other test
> > > (in this case Maple Tree tests), rcutorture, locktorture, or scftorture
> > > might be better choices.
> > 
> > Thanks for looking into this. This is a boot test on a randconfig
> > kernel, and yes, it happened to select CONFIG_RCU_SCALE_TEST=y together
> > with CONFIG_TEST_MAPLE_TREE=y, leading to the situation in this case.

Ah, I see.  Thanks for that information, this makes more sense now.

> > 
> > I've tested the patch on the same config, it does clear up the "task
> > blocked" log, though it still waits a long time at this step. The test
> > result is as follows:
> > 
> > [   18.397784][    T1] calling  maple_tree_seed+0x0/0x15d0 @ 1
> > [   18.398646][    T1]
> > [   18.398646][    T1] TEST STARTING
> > [   18.398646][    T1]
> > [ 1266.450656][    T1] maple_tree: 12610686 of 12610686 tests passed
> > [ 1266.451749][    T1] initcall maple_tree_seed+0x0/0x15d0 returned 0 after 1248053116 usecs
> > ...

Thanks.  Yes, I have a lot of tests in there that add up to taking a
while.  This is the expected output.
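
As a rough sanity check on the numbers in the quoted log, the initcall
duration can be compared against the console timestamps and converted to
minutes.  This is only a minimal sketch: the variable names are chosen for
illustration, and the values are copied from the log lines above.

```python
# Values copied from the quoted console log; names are illustrative only.
start_s = 18.398646           # timestamp on the "TEST STARTING" line
end_s = 1266.450656           # timestamp on the "tests passed" line
initcall_usecs = 1248053116   # "returned 0 after ... usecs"

# Elapsed time according to the console timestamps.
elapsed_s = end_s - start_s

# Elapsed time according to the initcall report, in seconds.
initcall_s = initcall_usecs / 1_000_000

print(f"elapsed by timestamps: {elapsed_s:.1f} s")
print(f"initcall duration:     {initcall_s:.1f} s (~{initcall_s / 60:.1f} min)")
```

The two figures agree to within a few milliseconds, so the roughly 20.8
minutes is spent entirely inside maple_tree_seed().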

...

Thanks,
Liam
