Message-ID: <8acb0860-1c9d-4fb1-8ec5-ab2104dcb7b7@amd.com>
Date: Mon, 15 Jan 2024 11:20:23 +0100
From: Christian König <christian.koenig@....com>
To: Thomas Perrot <thomas.perrot@...tlin.com>, alexander.deucher@....com,
 Xinhui.Pan@....com, lijo.lazar@....com, kenneth.feng@....com,
 guchun.chen@....com, evan.quan@....com, srinivasan.shanmugam@....com
Cc: amd-gfx@...ts.freedesktop.org, dri-devel@...ts.freedesktop.org,
 linux-kernel@...r.kernel.org
Subject: Re: Failed to create a rescuer kthread for the amdgpu-reset-dev
 workqueue

On 15.01.24 at 11:17, Thomas Perrot wrote:
> Hello Christian,
>
> On Fri, 2024-01-12 at 09:17 +0100, Christian König wrote:
>> Well the driver load is interrupted for some reason.
>>
>> Have you set any timeout for modprobe?
>>
> We don't set a modprobe timeout.

Well you somehow abort probing the driver.

This seems to be an external event and not something the driver can 
influence.

Regards,
Christian.

>
> Kind regards,
> Thomas
>
>> Regards,
>> Christian.
>>
>> On 12.01.24 at 09:11, Thomas Perrot wrote:
>>> Hello,
>>>
>>> We are updating the kernel from 6.1 to 6.6 and we observe an
>>> amdgpu regression with Radeon RX580 8GB and SiFive Unmatched:
>>> “workqueue: Failed to create a rescuer kthread for wq 'amdgpu-reset-dev': -EINTR
>>> [drm:amdgpu_reset_create_reset_domain [amdgpu]] *ERROR* Failed to allocate wq for amdgpu_reset_domain!
>>> amdgpu 0000:07:00.0: amdgpu: Fatal error during GPU init
>>> amdgpu 0000:07:00.0: amdgpu: amdgpu: finishing device.
>>> amdgpu: probe of 0000:07:00.0 failed with error -12”
>>>
>>> We have tried to figure it out, so far without success. Do you
>>> have any advice on how to identify the root cause and fix it?
>>>
>>> Kind regards,
>>> Thomas Perrot
>>>

