Message-ID: <79c6ad70-47d9-47fe-4bb4-33fcf356dd37@amd.com>
Date:   Mon, 4 Jul 2022 13:30:50 +0200
From:   Christian König <christian.koenig@....com>
To:     Uladzislau Rezki <urezki@...il.com>,
        "Alex Xu (Hello71)" <alex_y_xu@...oo.ca>
Cc:     wireguard@...ts.zx2c4.com, "Jason A. Donenfeld" <Jason@...c4.com>,
        Joel Fernandes <joel@...lfernandes.org>, paulmck@...nel.org,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        Xinhui.Pan@....com, linux-kernel@...r.kernel.org,
        amd-gfx@...ts.freedesktop.org,
        Suren Baghdasaryan <surenb@...gle.com>, rcu@...r.kernel.org,
        Hridya Valsaraju <hridya@...gle.com>,
        Arve Hjønnevåg <arve@...roid.com>,
        Theodore Ts'o <tytso@....edu>, alexander.deucher@....com,
        Todd Kjos <tkjos@...roid.com>, uladzislau.rezki@...y.com,
        Martijn Coenen <maco@...roid.com>, christian.koenig@....com,
        Christian Brauner <christian@...uner.io>
Subject: Re: CONFIG_ANDROID (was: rcu_sched detected expedited stalls in
 amdgpu after suspend)

Hi guys,

Am 28.06.22 um 22:11 schrieb Uladzislau Rezki:
>> Excerpts from Paul E. McKenney's message of June 28, 2022 2:54 pm:
>>> All you need to do to get the previous behavior is to add something like
>>> this to your defconfig file:
>>>
>>> CONFIG_RCU_EXP_CPU_STALL_TIMEOUT=21000
>>>
>>> Any reason why this will not work for you?

Sorry for jumping in so late; I was on vacation for a week.

Well, when any RCU period is longer than 20ms and amdgpu is in the
backtrace, my educated guess is that we messed up some timeout while
waiting for the hw.

We usually wait only a few us, but it could be that somebody is waiting
for ms instead.

So there are some todos here as far as I can see, and it would be helpful
to get a cleaner backtrace if possible.

Regards,
Christian.
