Date:   Wed, 5 Jul 2023 14:27:03 +0200
From:   Christian König <christian.koenig@....com>
To:     Arnd Bergmann <arnd@...db.de>, Arnd Bergmann <arnd@...nel.org>,
        Alex Deucher <alexander.deucher@....com>,
        "Pan, Xinhui" <Xinhui.Pan@....com>,
        Dave Airlie <airlied@...il.com>,
        Daniel Vetter <daniel@...ll.ch>
Cc:     Hawking Zhang <Hawking.Zhang@....com>,
        Lijo Lazar <lijo.lazar@....com>,
        Mario Limonciello <mario.limonciello@....com>,
        YiPeng Chai <YiPeng.Chai@....com>, Le Ma <le.ma@....com>,
        Bokun Zhang <Bokun.Zhang@....com>,
        Srinivasan Shanmugam <srinivasan.shanmugam@....com>,
        Shiwu Zhang <shiwu.zhang@....com>,
        amd-gfx@...ts.freedesktop.org, dri-devel@...ts.freedesktop.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH] drm/amdgpu: avoid integer overflow warning in
 amdgpu_device_resize_fb_bar()

On 04.07.23 at 17:24, Arnd Bergmann wrote:
> On Tue, Jul 4, 2023, at 16:51, Christian König wrote:
>> On 04.07.23 at 16:31, Arnd Bergmann wrote:
>>> On Tue, Jul 4, 2023, at 14:33, Christian König wrote:
>>>> Modern AMD GPUs have 16GiB of local memory (VRAM). Making that
>>>> accessible to a CPU which can only handle 32bit addresses by resizing
>>>> the BAR is impossible to begin with.
>>>>
>>>> But going a step further even without resizing it is pretty hard to get
>>>> that hardware working on such an architecture.
>>> I'd still like to understand this part better, as we have a lot of
>>> arm64 chips with somewhat flawed PCIe implementations, often with
>>> a tiny 64-bit memory space, but otherwise probably capable of
>>> using a GPU.
>> Yeah, those are unfortunately very well known to us :(
>>
>>> What exactly do you expect to happen here?
>>>
>>> a) Use only part of the VRAM but otherwise work as expected
>>> b) Access all of the VRAM, but at a performance cost for
>>>      bank switching?
>> We have tons of x86 systems where we can't resize the BAR (because of
>> lack of BIOS setup of the root PCIe windows). So bank switching is still
>> perfectly supported.
> Ok, good.
>
>> After investigating (which sometimes even includes involving engineers
>> from ARM) we usually find that those boards don't even remotely comply
>> with the PCIe specification, both regarding power as well as functional
>> things like DMA coherency.
> Makes sense, the power usage is clearly going to make this
> impossible on a lot of boards. I would have expected noncoherent
> DMA to be a solvable problem, since that generally works with
> all drivers that use the dma-mapping interfaces correctly,
> but I understand that drivers/gpu/* often does its own thing
> here, which may make that harder.

Yeah, I've heard that before. The problem is simply that the dma-mapping 
interface can't handle those cases.

User space APIs like Vulkan and some OpenGL extensions make a coherent 
memory model between GPU and CPU mandatory.

In other words you have things like ring buffers between code running on 
the GPU and code running on the CPU and the kernel is not even involved 
in that communication.
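
To make that concrete, here is a minimal user-space sketch in C11 of such a 
ring buffer. It is only an analogy: the names (struct ring, ring_push, 
ring_pop, RING_SIZE) are made up for illustration and this is not amdgpu 
code. C11 atomics only model CPU threads, so think of the producer as the 
CPU and the consumer as standing in for the GPU.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define RING_SIZE 8u /* hypothetical size, must be a power of two */

static_assert((RING_SIZE & (RING_SIZE - 1)) == 0,
              "RING_SIZE must be a power of two");

/* Lives in memory that both sides map; no syscall is made per entry. */
struct ring {
    _Atomic uint32_t head;      /* advanced only by the producer */
    _Atomic uint32_t tail;      /* advanced only by the consumer */
    uint32_t slots[RING_SIZE];
};

/* Producer side: returns false if the ring is full. */
static bool ring_push(struct ring *r, uint32_t value)
{
    uint32_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
    uint32_t tail = atomic_load_explicit(&r->tail, memory_order_acquire);

    if (head - tail == RING_SIZE)
        return false;

    r->slots[head % RING_SIZE] = value;
    /* Publish the slot: this is the only "notification" the consumer gets. */
    atomic_store_explicit(&r->head, head + 1, memory_order_release);
    return true;
}

/* Consumer side: returns false if the ring is empty. */
static bool ring_pop(struct ring *r, uint32_t *out)
{
    uint32_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    uint32_t head = atomic_load_explicit(&r->head, memory_order_acquire);

    if (head == tail)
        return false;

    *out = r->slots[tail % RING_SIZE];
    /* Hand the slot back to the producer. */
    atomic_store_explicit(&r->tail, tail + 1, memory_order_release);
    return true;
}
```

Note that nothing in ring_push() or ring_pop() enters the kernel, so there 
is no point where a dma-mapping sync could be inserted. The pattern only 
works if a store on one side becomes visible to the other side through the 
hardware itself.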

This is all based on the PCIe specification which makes it quite clear 
that things like cache snooping are mandatory for a compliant root complex.

Making everything uncached has worked to some degree, but then the 
performance just sucks so badly that you can practically forget it as 
well.

Regards,
Christian.

>
>       Arnd
