Message-ID: <CADnq5_NvPsxmm8j0URD_B8a5gg9NQNX8VY0d93AqUDis46cdXA@mail.gmail.com>
Date: Fri, 18 Jul 2025 19:00:39 -0400
From: Alex Deucher <alexdeucher@...il.com>
To: Leo Li <sunpeng.li@....com>
Cc: Brian Geffon <bgeffon@...gle.com>, "Wentland, Harry" <Harry.Wentland@....com>, 
	Alex Deucher <alexander.deucher@....com>, christian.koenig@....com, 
	David Airlie <airlied@...il.com>, Simona Vetter <simona@...ll.ch>, 
	Tvrtko Ursulin <tvrtko.ursulin@...lia.com>, Yunxiang Li <Yunxiang.Li@....com>, 
	Lijo Lazar <lijo.lazar@....com>, Prike Liang <Prike.Liang@....com>, 
	Pratap Nirujogi <pratap.nirujogi@....com>, Luben Tuikov <luben.tuikov@....com>, 
	amd-gfx@...ts.freedesktop.org, dri-devel@...ts.freedesktop.org, 
	linux-kernel@...r.kernel.org, Garrick Evans <garrick@...gle.com>, 
	Thadeu Lima de Souza Cascardo <cascardo@...lia.com>, stable@...r.kernel.org
Subject: Re: [PATCH] drm/amdgpu: Raven: don't allow mixing GTT and VRAM

On Fri, Jul 18, 2025 at 6:01 PM Leo Li <sunpeng.li@....com> wrote:
>
>
>
> On 2025-07-18 17:33, Alex Deucher wrote:
> > On Fri, Jul 18, 2025 at 5:02 PM Leo Li <sunpeng.li@....com> wrote:
> >>
> >>
> >>
> >> On 2025-07-18 16:07, Alex Deucher wrote:
> >>> On Fri, Jul 18, 2025 at 1:57 PM Brian Geffon <bgeffon@...gle.com> wrote:
> >>>>
> >>>> On Thu, Jul 17, 2025 at 10:59 AM Alex Deucher <alexdeucher@...il.com> wrote:
> >>>>>
> >>>>> On Wed, Jul 16, 2025 at 8:13 PM Brian Geffon <bgeffon@...gle.com> wrote:
> >>>>>>
> >>>>>> On Wed, Jul 16, 2025 at 5:03 PM Alex Deucher <alexdeucher@...il.com> wrote:
> >>>>>>>
> >>>>>>> On Wed, Jul 16, 2025 at 12:40 PM Brian Geffon <bgeffon@...gle.com> wrote:
> >>>>>>>>
> >>>>>>>> On Wed, Jul 16, 2025 at 12:33 PM Alex Deucher <alexdeucher@...il.com> wrote:
> >>>>>>>>>
> >>>>>>>>> On Wed, Jul 16, 2025 at 12:18 PM Brian Geffon <bgeffon@...gle.com> wrote:
> >>>>>>>>>>
> >>>>>>>>>> Commit 81d0bcf99009 ("drm/amdgpu: make display pinning more flexible (v2)")
> >>>>>>>>>> allowed newer ASICs to mix GTT and VRAM. That change also noted that
> >>>>>>>>>> some older boards, such as Stoney and Carrizo, do not support this.
> >>>>>>>>>> It appears that at least one additional ASIC, Raven, does not support
> >>>>>>>>>> this either.
> >>>>>>>>>>
> >>>>>>>>>> We observed this issue when migrating a device from a 5.4 to 6.6 kernel
> >>>>>>>>>> and have confirmed that Raven also needs to be excluded from mixing GTT
> >>>>>>>>>> and VRAM.
> >>>>>>>>>
> >>>>>>>>> Can you elaborate a bit on what the problem is?  For carrizo and
> >>>>>>>>> stoney this is a hardware limitation (all display buffers need to be
> >>>>>>>>> in GTT or VRAM, but not both).  Raven and newer don't have this
> >>>>>>>>> limitation and we tested raven pretty extensively at the time.
> >>>>>>>>
> >>>>>>>> Thanks for taking the time to look. We have automated testing and a
> >>>>>>>> few igt gpu tools tests failed and after debugging we found that
> >>>>>>>> commit 81d0bcf99009 is what introduced the failures on this hardware
> >>>>>>>> on 6.1+ kernels. The specific tests that fail are kms_async_flips and
> >>>>>>>> kms_plane_alpha_blend. Excluding Raven from this mixing of GTT and
> >>>>>>>> VRAM buffers resolves the issue.
> >>>>>>>
> >>>>>>> + Harry and Leo
> >>>>>>>
> >>>>>>> This sounds like the memory placement issue we discussed last week.
> >>>>>>> In that case, the issue is related to where the buffer ends up when we
> >>>>>>> try to do an async flip.  In that case, we can't do an async flip
> >>>>>>> without a full modeset if the buffer locations are different than the
> >>>>>>> last modeset because we need to update more than just the buffer base
> >>>>>>> addresses.  This change works around that limitation by always forcing
> >>>>>>> display buffers into VRAM or GTT.  Adding raven to this case may fix
> >>>>>>> those tests but will make the overall experience worse because we'll
> >>>>>>> effectively end up unable to fully utilize both GTT and VRAM for
> >>>>>>> display, which would reintroduce all of the problems fixed by
> >>>>>>> 81d0bcf99009 ("drm/amdgpu: make display pinning more flexible (v2)").
> >>>>>>
> >>>>>> Thanks Alex. The thing is, we only observe this on Raven boards. Why
> >>>>>> would only Raven be impacted by this? It would seem that all devices
> >>>>>> would have this issue, no? Also, I'm not familiar with how
> >>>>>
> >>>>> It depends on memory pressure and available memory in each pool.
> >>>>> E.g., initially the display buffer is in VRAM when the initial mode
> >>>>> set happens.  The watermarks, etc. are set for that scenario.  One of
> >>>>> the next frames ends up in a pool different than the original.  Now
> >>>>> the buffer is in GTT.  The async flip interface does a fast validation
> >>>>> to try and flip as soon as possible, but that validation fails because
> >>>>> the watermarks need to be updated which requires a full modeset.
> >>
> >> Huh, I'm not sure if this actually is an issue for APUs. The fix that introduced
> >> a check for same memory placement on async flips was on a system with a DGPU,
> >> for which VRAM placement does matter:
> >> https://github.com/torvalds/linux/commit/a7c0cad0dc060bb77e9c9d235d68441b0fc69507
> >>
> >> Looking around in DM/DML, for APUs, I don't see any logic that changes DCN
> >> bandwidth validation depending on memory placement. There's a gpuvm_enable flag
> >> for SG, but it's statically set to 1 on APU DCN versions. It sounds like for
> >> APUs specifically, we *should* be able to ignore the mem placement check. I can
> >> spin up a patch to test this out.
> >
> > Is the gpu_vm_support flag ever set for dGPUs?  The allowed domains
> > for display buffers are determined by
> > amdgpu_display_supported_domains() and we only allow GTT as a domain
> > if gpu_vm_support is set, which I think is just for APUs.  In that
> > case, we probably only need the checks specifically for
> > CHIP_CARRIZO and CHIP_STONEY since IIRC, they don't support mixed VRAM
> > and GTT (only one or the other?).  dGPUs and really old APUs will
> > always get VRAM, and newer APUs will get VRAM | GTT.
>
> It doesn't look like gpu_vm_support is set for DGPUs
> https://elixir.bootlin.com/linux/v6.15.6/source/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c#L1866
>
> Though interestingly, further up at #L1858, Raven has gpu_vm_support = 0. Maybe it had stability issues?
> https://github.com/torvalds/linux/commit/098c13079c6fdd44f10586b69132c392ebf87450

We need to be a little careful here: asic_type == CHIP_RAVEN covers
several variants:
apu_flags & AMD_APU_IS_RAVEN - raven1 (gpu_vm_support = false)
apu_flags & AMD_APU_IS_RAVEN2 - raven2 (gpu_vm_support = true)
apu_flags & AMD_APU_IS_PICASSO - picasso (gpu_vm_support = true)

amdgpu_display_supported_domains() only sets AMDGPU_GEM_DOMAIN_GTT if
gpu_vm_support is true, so raven1 only ever gets VRAM and we'd never
get into the VRAM | GTT check in amdgpu_bo_get_preferred_domain() for it.
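
To make the variant split concrete, here is a rough standalone sketch
of the mapping described above. It is not the driver code: every
sketch_* / SKETCH_* name is made up for illustration, the domain bit
values are placeholders, and the only behaviour modelled is "GTT is
offered only when gpu_vm_support is true".

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder domain bits for illustration; not taken from the kernel headers. */
#define SKETCH_DOMAIN_GTT   0x2u
#define SKETCH_DOMAIN_VRAM  0x4u

/* Hypothetical stand-in for the three variants behind asic_type == CHIP_RAVEN. */
enum sketch_raven_variant {
	SKETCH_RAVEN1,
	SKETCH_RAVEN2,
	SKETCH_PICASSO,
};

/* Per the list above: raven1 has gpu_vm_support = false, the others true. */
static bool sketch_gpu_vm_support(enum sketch_raven_variant v)
{
	return v != SKETCH_RAVEN1;
}

/*
 * Models only the rule stated above: VRAM is always allowed for display,
 * GTT is added when gpu_vm_support is set.
 */
static uint32_t sketch_display_supported_domains(enum sketch_raven_variant v)
{
	uint32_t domain = SKETCH_DOMAIN_VRAM;

	if (sketch_gpu_vm_support(v))
		domain |= SKETCH_DOMAIN_GTT;

	return domain;
}
```

With this model, raven1 never produces VRAM | GTT, so a check keyed on
that exact mask cannot trigger for it, while raven2 and picasso can.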

Anyway, back to your suggestion, I think we can probably drop the
checks as you should always get a compatible memory buffer due to
amdgpu_bo_get_preferred_domain(). Pinning should fail if we can't pin
in the required domain.  amdgpu_display_supported_domains() will
ensure you always get VRAM or GTT or VRAM | GTT depending on what the
chip supports.  Then amdgpu_bo_get_preferred_domain() will either
leave that as is, or force VRAM or GTT for the STONEY/CARRIZO case.
On the off chance we do get incompatible memory, something like the
attached patch should do the trick.
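
For reference, the flow described above can be sketched roughly as
follows. This is a simplified standalone model, not the attached patch
and not the driver source: the sketch_* names, the domain bit values,
the threshold value standing in for AMDGPU_SG_THRESHOLD, and the -1
error return are all assumptions made for illustration. The
preferred-domain logic follows the hunk quoted in Brian's original
patch (before his Raven addition).

```c
#include <stdint.h>

/* Placeholder domain bits for illustration; not taken from the kernel headers. */
#define SKETCH_DOMAIN_GTT   0x2u
#define SKETCH_DOMAIN_VRAM  0x4u

/* Assumed value; stands in for AMDGPU_SG_THRESHOLD. */
#define SKETCH_SG_THRESHOLD (256ull << 20)

/* Hypothetical stand-in for asic_type; only the two special cases matter here. */
enum sketch_asic {
	SKETCH_CARRIZO,
	SKETCH_STONEY,
	SKETCH_OTHER,
};

/*
 * Leave the domain as is, except on CARRIZO/STONEY, where VRAM | GTT is
 * narrowed to a single pool: GTT when VRAM is small, VRAM otherwise.
 */
static uint32_t sketch_preferred_domain(enum sketch_asic asic,
					uint64_t real_vram_size,
					uint32_t domain)
{
	if (domain == (SKETCH_DOMAIN_VRAM | SKETCH_DOMAIN_GTT) &&
	    (asic == SKETCH_CARRIZO || asic == SKETCH_STONEY)) {
		domain = SKETCH_DOMAIN_VRAM;
		if (real_vram_size <= SKETCH_SG_THRESHOLD)
			domain = SKETCH_DOMAIN_GTT;
	}

	return domain;
}

/*
 * Fallback check for the "off chance" case: reject a buffer whose current
 * placement is not one of the domains allowed for display.
 * Returns 0 on success, -1 on incompatible placement.
 */
static int sketch_check_placement(uint32_t current_placement, uint32_t allowed)
{
	return (current_placement & allowed) ? 0 : -1;
}
```

In this model a caller would compute the allowed mask once, pin into
it, and only hit the -1 path if the buffer somehow ended up outside
that mask.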

Alex


>
> - Leo
>
> >
> > Alex
> >
> >>
> >> Thanks,
> >> Leo
> >>
> >>>>>
> >>>>> It's tricky to fix because you don't want to use the worst case
> >>>>> watermarks all the time because that will limit the number of available
> >>>>> display options and you don't want to force everything to a particular
> >>>>> memory pool because that will limit the amount of memory that can be
> >>>>> used for display (which is what the patch in question fixed).  Ideally
> >>>>> the caller would do a test commit before the page flip to determine
> >>>>> whether or not it would succeed before issuing it and then we'd have
> >>>>> some feedback mechanism to tell the caller that the commit would fail
> >>>>> due to buffer placement so it would do a full modeset instead.  We
> >>>>> discussed this feedback mechanism last week at the display hackfest.
> >>>>>
> >>>>>
> >>>>>> kms_plane_alpha_blend works, but does this also support that test
> >>>>>> failing as the cause?
> >>>>>
> >>>>> That may be related.  I'm not too familiar with that test either, but
> >>>>> Leo or Harry can provide some guidance.
> >>>>>
> >>>>> Alex
> >>>>
> >>>> Thanks everyone for the input so far. I have a question for the
> >>>> maintainers, given that it seems that this is functionally broken for
> >>>> ASICs which are iGPUs, and there does not seem to be an easy fix, does
> >>>> it make sense to extend this proposed patch to all iGPUs until a more
> >>>> permanent fix can be identified? At the end of the day I'll take
> >>>> functional correctness over performance.
> >>>
> >>> It's not functional correctness, it's usability.  All that is
> >>> potentially broken is async flips (which depend on memory pressure and
> >>> buffer placement), while if you effectively revert the patch, you end
> >>> up limiting all display buffers to either VRAM or GTT which may end
> >>> up causing the inability to display anything because there is not
> >>> enough memory in that pool for the next modeset.  We'll start getting
> >>> bug reports about blank screens and failure to set modes because of
> >>> memory pressure.  I think if we want a short term fix, it would be to
> >>> always set the worst case watermarks.  The downside to that is that it
> >>> would possibly cause some working display setups to stop working if
> >>> they were on the margins to begin with.
> >>>
> >>> Alex
> >>>
> >>>>
> >>>> Brian
> >>>>
> >>>>>
> >>>>>>
> >>>>>> Thanks again,
> >>>>>> Brian
> >>>>>>
> >>>>>>>
> >>>>>>> Alex
> >>>>>>>
> >>>>>>>>
> >>>>>>>> Brian
> >>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> Alex
> >>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> Fixes: 81d0bcf99009 ("drm/amdgpu: make display pinning more flexible (v2)")
> >>>>>>>>>> Cc: Luben Tuikov <luben.tuikov@....com>
> >>>>>>>>>> Cc: Christian König <christian.koenig@....com>
> >>>>>>>>>> Cc: Alex Deucher <alexander.deucher@....com>
> >>>>>>>>>> Cc: stable@...r.kernel.org # 6.1+
> >>>>>>>>>> Tested-by: Thadeu Lima de Souza Cascardo <cascardo@...lia.com>
> >>>>>>>>>> Signed-off-by: Brian Geffon <bgeffon@...gle.com>
> >>>>>>>>>> ---
> >>>>>>>>>>  drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 3 ++-
> >>>>>>>>>>  1 file changed, 2 insertions(+), 1 deletion(-)
> >>>>>>>>>>
> >>>>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> >>>>>>>>>> index 73403744331a..5d7f13e25b7c 100644
> >>>>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> >>>>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> >>>>>>>>>> @@ -1545,7 +1545,8 @@ uint32_t amdgpu_bo_get_preferred_domain(struct amdgpu_device *adev,
> >>>>>>>>>>                                             uint32_t domain)
> >>>>>>>>>>  {
> >>>>>>>>>>         if ((domain == (AMDGPU_GEM_DOMAIN_VRAM | AMDGPU_GEM_DOMAIN_GTT)) &&
> >>>>>>>>>> -           ((adev->asic_type == CHIP_CARRIZO) || (adev->asic_type == CHIP_STONEY))) {
> >>>>>>>>>> +           ((adev->asic_type == CHIP_CARRIZO) || (adev->asic_type == CHIP_STONEY) ||
> >>>>>>>>>> +            (adev->asic_type == CHIP_RAVEN))) {
> >>>>>>>>>>                 domain = AMDGPU_GEM_DOMAIN_VRAM;
> >>>>>>>>>>                 if (adev->gmc.real_vram_size <= AMDGPU_SG_THRESHOLD)
> >>>>>>>>>>                         domain = AMDGPU_GEM_DOMAIN_GTT;
> >>>>>>>>>> --
> >>>>>>>>>> 2.50.0.727.gbf7dc18ff4-goog
> >>>>>>>>>>
> >>
>

View attachment "0001-drm-amd-display-refine-framebuffer-placement-checks.patch" of type "text/x-patch" (3966 bytes)
