Message-ID: <4ac1b046-b3bc-6090-f03a-eb6352f52a5a@amd.com>
Date: Thu, 24 Nov 2022 16:11:22 +0530
From: "Lazar, Lijo" <lijo.lazar@....com>
To: "Quan, Evan" <Evan.Quan@....com>, 李真能 <lizhenneng@...inos.cn>,
	Michel Dänzer <michel.daenzer@...lbox.org>,
	"Koenig, Christian" <Christian.Koenig@....com>,
	"Deucher, Alexander" <Alexander.Deucher@....com>
Cc: "dri-devel@...ts.freedesktop.org" <dri-devel@...ts.freedesktop.org>,
	"Pan, Xinhui" <Xinhui.Pan@....com>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"amd-gfx@...ts.freedesktop.org" <amd-gfx@...ts.freedesktop.org>
Subject: Re: [PATCH] drm/amdgpu: add mb for si

On 11/24/2022 3:34 PM, Quan, Evan wrote:
> [AMD Official Use Only - General]
>
> Could the attached patch help?
>
> Evan
>> -----Original Message-----
>> From: amd-gfx <amd-gfx-bounces@...ts.freedesktop.org> On Behalf Of ???
>> Sent: Friday, November 18, 2022 5:25 PM
>> To: Michel Dänzer <michel.daenzer@...lbox.org>; Koenig, Christian
>> <Christian.Koenig@....com>; Deucher, Alexander
>> <Alexander.Deucher@....com>
>> Cc: amd-gfx@...ts.freedesktop.org; Pan, Xinhui <Xinhui.Pan@....com>;
>> linux-kernel@...r.kernel.org; dri-devel@...ts.freedesktop.org
>> Subject: Re: [PATCH] drm/amdgpu: add mb for si
>>
>>
>> On 2022/11/18 17:18, Michel Dänzer wrote:
>>> On 11/18/22 09:01, Christian König wrote:
>>>> On 18.11.22 at 08:48, Zhenneng Li wrote:
>>>>> During reboot tests on an arm64 platform, boot may fail, so add
>>>>> this mb in smc.
>>>>>
>>>>> The error messages are as follows:
>>>>> [ 6.996395][ 7] [ T295] [drm:amdgpu_device_ip_late_init
>>>>> [amdgpu]] *ERROR*
>>>>> late_init of IP block <si_dpm> failed -22 [
>>>>> 7.006919][ 7] [ T295] amdgpu 0000:04:00.0:

The issue is happening in late_init(), which eventually does

	ret = si_thermal_enable_alert(adev, false);

Just before this, si_thermal_start_thermal_controller is called in
hw_init, and that enables the thermal alert.

Maybe the issue is with enabling and disabling thermal alerts in quick
succession. Adding a delay inside si_thermal_start_thermal_controller
might help.

Thanks,
Lijo

>>>>> amdgpu_device_ip_late_init failed [ 7.014224][ 7] [ T295] amdgpu
>>>>> 0000:04:00.0: Fatal error during GPU init
>>>> Memory barriers are not supposed to be sprinkled around like this, you
>>>> need to give a detailed explanation why this is necessary.
>>>>
>>>> Regards,
>>>> Christian.
>>>>
>>>>> Signed-off-by: Zhenneng Li <lizhenneng@...inos.cn>
>>>>> ---
>>>>>  drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c | 2 ++
>>>>>  1 file changed, 2 insertions(+)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c
>>>>> b/drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c
>>>>> index 8f994ffa9cd1..c7656f22278d 100644
>>>>> --- a/drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c
>>>>> +++ b/drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c
>>>>> @@ -155,6 +155,8 @@ bool amdgpu_si_is_smc_running(struct
>>>>> amdgpu_device *adev)
>>>>>  	u32 rst = RREG32_SMC(SMC_SYSCON_RESET_CNTL);
>>>>>  	u32 clk = RREG32_SMC(SMC_SYSCON_CLOCK_CNTL_0);
>>>>> +	mb();
>>>>> +
>>>>>  	if (!(rst & RST_REG) && !(clk & CK_DISABLE))
>>>>>  		return true;
>>> In particular, it makes no sense in this specific place, since it cannot
>>> directly affect the values of rst & clk.
>>
>> I think so too.
>>
>> But when I run the reboot test on nine desktop machines, one or two of
>> them may report this error after hundreds or thousands of reboots. At
>> the beginning I used msleep() instead of mb(); both methods work, but I
>> don't know what the root cause is.
>>
>> When I use this method on another vendor's Oland card, the error message
>> is reported again.
>>
>> What could be the root cause?
>>
>> test environment:
>>
>> graphics card: OLAND 0x1002:0x6611 0x1642:0x1869 0x87
>>
>> driver: amdgpu
>>
>> os: ubuntu 2004
>>
>> platform: arm64
>>
>> kernel: 5.4.18
>>
>>>
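
To make the "quick succession" hypothesis above concrete, here is a
minimal userspace sketch in C. It is not amdgpu code and does not model
real SI hardware: struct fake_thermal, FAKE_SETTLE_NS and every fake_*
function are hypothetical names invented for illustration, and the
50 ms settle window is an arbitrary assumption. The only details taken
from the thread are the ordering (alert enabled during hw_init, disabled
during late_init) and the -22 return code, which is -EINVAL on Linux.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

/* ASSUMPTION: made-up settle window; the real hardware may have none. */
#define FAKE_SETTLE_NS (50LL * 1000000LL)

/* Hypothetical stand-in for the thermal controller state. */
struct fake_thermal {
	bool alert_enabled;
	bool changed_once;      /* false until the first enable/disable */
	int64_t last_change_ns; /* monotonic time of the last change */
};

static int64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

static void sleep_ns(int64_t ns)
{
	struct timespec req;

	assert(ns >= 0);
	if (ns == 0)
		return;
	req.tv_sec = ns / 1000000000LL;
	req.tv_nsec = ns % 1000000000LL;
	/* Resume with the remaining time if a signal interrupts the sleep. */
	while (nanosleep(&req, &req) == -1 && errno == EINTR)
		;
}

/* Rejects a change that arrives inside the assumed settle window. */
static int fake_enable_alert(struct fake_thermal *t, bool enable)
{
	if (t->changed_once && now_ns() - t->last_change_ns < FAKE_SETTLE_NS)
		return -EINVAL;
	t->alert_enabled = enable;
	t->changed_once = true;
	t->last_change_ns = now_ns();
	return 0;
}

/* Models hw_init: enable the alert, then optionally wait delay_ns. */
static int fake_start_thermal_controller(struct fake_thermal *t,
					 int64_t delay_ns)
{
	int ret = fake_enable_alert(t, true);

	if (ret)
		return ret;
	sleep_ns(delay_ns); /* the delay suggested in the reply above */
	return 0;
}

/* Models late_init: disable the alert again. */
static int fake_late_init(struct fake_thermal *t)
{
	return fake_enable_alert(t, false);
}

/* Runs hw_init followed immediately by late_init on a fresh device. */
int run_init_sequence(int64_t delay_ns)
{
	struct fake_thermal t = { 0 };
	int ret = fake_start_thermal_controller(&t, delay_ns);

	return ret ? ret : fake_late_init(&t);
}
```

In this toy model, run_init_sequence(0) fails with -EINVAL because the
disable lands inside the settle window, while a delay longer than the
window lets it succeed. Whether the real hardware behaves this way is
exactly the open question in the thread; in the driver itself the wait
would be a kernel delay such as msleep() rather than nanosleep().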