Message-ID: <3cc0d360-8f51-4cdd-90fd-1fa0a199c2ba@amd.com>
Date: Mon, 17 Jun 2024 17:53:21 +0200
From: Christian König <christian.koenig@....com>
To: Xi Ruoyao <xry111@...111.site>, Icenowy Zheng <uwu@...nowy.me>,
Alex Deucher <alexander.deucher@....com>, Pan Xinhui <Xinhui.Pan@....com>,
David Airlie <airlied@...il.com>, Daniel Vetter <daniel@...ll.ch>,
Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@....com>,
Huacai Chen <chenhuacai@...nel.org>, WANG Xuerui <kernel@...0n.name>
Cc: amd-gfx@...ts.freedesktop.org, dri-devel@...ts.freedesktop.org,
linux-kernel@...r.kernel.org, loongarch@...ts.linux.dev
Subject: Re: [PATCH 1/2] drm/amdgpu: make duplicated EOP packet for GFX7/8
have real content
Am 17.06.24 um 17:35 schrieb Xi Ruoyao:
> On Mon, 2024-06-17 at 22:30 +0800, Icenowy Zheng wrote:
>>> Two consecutive writes to the same bus address are perfectly legal from
>>> the PCIe specification and can happen all the time, even without this
>>> specific hw workaround.
>> Yes I know it, and I am not from Loongson, just some user trying to
>> mess around it.
> There are some proposed "workarounds" like reducing the link speed (from
> x16 to x8), tweaking the power management setting, etc. Someone even
> claims improving the heat sink of the LS7A chip can help to work around
> this issue but I'm really skeptical...
Well, when it's an ordering problem between writes and interrupts, then
nothing other than getting the order right will fix this. Otherwise it
can always happen that the CPU doesn't see coherent results from PCIe
devices.
In other words, if the CPU gets an interrupt but doesn't see the fence
value written, it will assume the work is not done. But since the
hardware won't trigger a second interrupt, the CPU will then keep waiting
forever for the operation to finish.
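
To make the failure mode concrete, here is a toy model of the sequence
described above. It is only a sketch and not amdgpu code: the event
names, the run() helper and the sequence numbers are all made up for
illustration, and the "bus" is just a list of events delivered in order.

```python
def run(events, wanted_seq):
    """Deliver bus events in the given order.

    Returns True if the waiter gets woken up, False if it is left waiting.
    Hypothetical model: "write" is the device's fence write becoming
    visible to the CPU, "irq" is the interrupt being handled by the CPU.
    """
    fence_memory = 0   # fence value as currently visible to the CPU
    signaled = False   # has the waiter been woken up?
    for kind, value in events:
        if kind == "write":
            fence_memory = value
        elif kind == "irq":
            # Interrupt handler: only signal when the fence value arrived.
            if fence_memory >= wanted_seq:
                signaled = True
    # No further interrupt is coming, so this result is final.
    return signaled


# Correct ordering: the fence write is visible before the interrupt.
in_order = [("write", 1), ("irq", None)]
# Broken ordering: the interrupt overtakes the fence write.
reordered = [("irq", None), ("write", 1)]

print(run(in_order, 1))
print(run(reordered, 1))
```

In the reordered case the fence value does land in memory eventually,
but nothing ever looks at it again because the only interrupt has
already been handled.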
This is not limited to GPUs, but will potentially happen with network or
disk I/O as well.
Regards,
Christian.