[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <879b127e-2180-bc59-f522-252416a7ac01@amd.com>
Date: Tue, 12 May 2020 17:56:43 +0200
From: Christian König <christian.koenig@....com>
To: Daniel Vetter <daniel.vetter@...ll.ch>,
DRI Development <dri-devel@...ts.freedesktop.org>
Cc: LKML <linux-kernel@...r.kernel.org>, linux-media@...r.kernel.org,
linaro-mm-sig@...ts.linaro.org, linux-rdma@...r.kernel.org,
amd-gfx@...ts.freedesktop.org, intel-gfx@...ts.freedesktop.org,
Chris Wilson <chris@...is-wilson.co.uk>,
Maarten Lankhorst <maarten.lankhorst@...ux.intel.com>,
Daniel Vetter <daniel.vetter@...el.com>
Subject: Re: [RFC 10/17] drm/amdgpu: s/GFP_KERNEL/GFP_ATOMIC in scheduler code
Hui what? Of hand that doesn't looks correct to me.
Why the heck should this be an atomic context? If that's correct
allocating memory is the least of the problems we have.
Regards,
Christian.
Am 12.05.20 um 10:59 schrieb Daniel Vetter:
> My dma-fence lockdep annotations caught an inversion because we
> allocate memory where we really shouldn't:
>
> kmem_cache_alloc+0x2b/0x6d0
> amdgpu_fence_emit+0x30/0x330 [amdgpu]
> amdgpu_ib_schedule+0x306/0x550 [amdgpu]
> amdgpu_job_run+0x10f/0x260 [amdgpu]
> drm_sched_main+0x1b9/0x490 [gpu_sched]
> kthread+0x12e/0x150
>
> Trouble right now is that lockdep only validates against GFP_FS, which
> would be good enough for shrinkers. But for mmu_notifiers we actually
> need !GFP_ATOMIC, since they can be called from any page laundering,
> even if GFP_NOFS or GFP_NOIO are set.
>
> I guess we should improve the lockdep annotations for
> fs_reclaim_acquire/release.
>
> Ofc real fix is to properly preallocate this fence and stuff it into
> the amdgpu job structure. But GFP_ATOMIC gets the lockdep splat out of
> the way.
>
> v2: Two more allocations in scheduler paths.
>
> Frist one:
>
> __kmalloc+0x58/0x720
> amdgpu_vmid_grab+0x100/0xca0 [amdgpu]
> amdgpu_job_dependency+0xf9/0x120 [amdgpu]
> drm_sched_entity_pop_job+0x3f/0x440 [gpu_sched]
> drm_sched_main+0xf9/0x490 [gpu_sched]
>
> Second one:
>
> kmem_cache_alloc+0x2b/0x6d0
> amdgpu_sync_fence+0x7e/0x110 [amdgpu]
> amdgpu_vmid_grab+0x86b/0xca0 [amdgpu]
> amdgpu_job_dependency+0xf9/0x120 [amdgpu]
> drm_sched_entity_pop_job+0x3f/0x440 [gpu_sched]
> drm_sched_main+0xf9/0x490 [gpu_sched]
>
> Cc: linux-media@...r.kernel.org
> Cc: linaro-mm-sig@...ts.linaro.org
> Cc: linux-rdma@...r.kernel.org
> Cc: amd-gfx@...ts.freedesktop.org
> Cc: intel-gfx@...ts.freedesktop.org
> Cc: Chris Wilson <chris@...is-wilson.co.uk>
> Cc: Maarten Lankhorst <maarten.lankhorst@...ux.intel.com>
> Cc: Christian König <christian.koenig@....com>
> Signed-off-by: Daniel Vetter <daniel.vetter@...el.com>
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 2 +-
> drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c | 2 +-
> drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c | 2 +-
> 3 files changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
> index d878fe7fee51..055b47241bb1 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
> @@ -143,7 +143,7 @@ int amdgpu_fence_emit(struct amdgpu_ring *ring, struct dma_fence **f,
> uint32_t seq;
> int r;
>
> - fence = kmem_cache_alloc(amdgpu_fence_slab, GFP_KERNEL);
> + fence = kmem_cache_alloc(amdgpu_fence_slab, GFP_ATOMIC);
> if (fence == NULL)
> return -ENOMEM;
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c
> index fe92dcd94d4a..fdcd6659f5ad 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c
> @@ -208,7 +208,7 @@ static int amdgpu_vmid_grab_idle(struct amdgpu_vm *vm,
> if (ring->vmid_wait && !dma_fence_is_signaled(ring->vmid_wait))
> return amdgpu_sync_fence(sync, ring->vmid_wait, false);
>
> - fences = kmalloc_array(sizeof(void *), id_mgr->num_ids, GFP_KERNEL);
> + fences = kmalloc_array(sizeof(void *), id_mgr->num_ids, GFP_ATOMIC);
> if (!fences)
> return -ENOMEM;
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c
> index b87ca171986a..330476cc0c86 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c
> @@ -168,7 +168,7 @@ int amdgpu_sync_fence(struct amdgpu_sync *sync, struct dma_fence *f,
> if (amdgpu_sync_add_later(sync, f, explicit))
> return 0;
>
> - e = kmem_cache_alloc(amdgpu_sync_slab, GFP_KERNEL);
> + e = kmem_cache_alloc(amdgpu_sync_slab, GFP_ATOMIC);
> if (!e)
> return -ENOMEM;
>
Powered by blists - more mailing lists