linux-kernel - Re: [PATCH] drm/amdgpu: cache in more vm fault information

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <6974b239-5f67-4eca-8371-5d28464072f9@amd.com>
Date: Wed, 6 Mar 2024 21:29:56 +0530
From: "Khatri, Sunil" <sukhatri@....com>
To: Alex Deucher <alexdeucher@...il.com>
Cc: Christian König <ckoenig.leichtzumerken@...il.com>,
 Christian König <christian.koenig@....com>,
 Sunil Khatri <sunil.khatri@....com>, Alex Deucher
 <alexander.deucher@....com>, Shashank Sharma <shashank.sharma@....com>,
 amd-gfx@...ts.freedesktop.org, Pan@...-sunil-navi33.amd.com,
 Xinhui <Xinhui.Pan@....com>, dri-devel@...ts.freedesktop.org,
 linux-kernel@...r.kernel.org, Mukul Joshi <mukul.joshi@....com>,
 Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@....com>
Subject: Re: [PATCH] drm/amdgpu: cache in more vm fault information


On 3/6/2024 9:19 PM, Alex Deucher wrote:
> On Wed, Mar 6, 2024 at 10:32 AM Alex Deucher <alexdeucher@...il.com> wrote:
>> On Wed, Mar 6, 2024 at 10:13 AM Khatri, Sunil <sukhatri@....com> wrote:
>>>
>>> On 3/6/2024 8:34 PM, Christian König wrote:
>>>> Am 06.03.24 um 15:29 schrieb Alex Deucher:
>>>>> On Wed, Mar 6, 2024 at 8:04 AM Khatri, Sunil <sukhatri@....com> wrote:
>>>>>> On 3/6/2024 6:12 PM, Christian König wrote:
>>>>>>> Am 06.03.24 um 11:40 schrieb Khatri, Sunil:
>>>>>>>> On 3/6/2024 3:37 PM, Christian König wrote:
>>>>>>>>> Am 06.03.24 um 10:04 schrieb Sunil Khatri:
>>>>>>>>>> When an  page fault interrupt is raised there
>>>>>>>>>> is a lot more information that is useful for
>>>>>>>>>> developers to analyse the pagefault.
>>>>>>>>> Well actually those information are not that interesting because
>>>>>>>>> they are hw generation specific.
>>>>>>>>>
>>>>>>>>> You should probably rather use the decoded strings here, e.g. hub,
>>>>>>>>> client, xcc_id, node_id etc...
>>>>>>>>>
>>>>>>>>> See gmc_v9_0_process_interrupt() an example.
>>>>>>>>> I saw this v9 does provide more information than what v10 and v11
>>>>>>>>> provide like node_id and fault from which die but thats again very
>>>>>>>>> specific to IP_VERSION(9, 4, 3)) i dont know why thats information
>>>>>>>>> is not there in v10 and v11.
>>>>>>>> I agree to your point but, as of now during a pagefault we are
>>>>>>>> dumping this information which is useful like which client
>>>>>>>> has generated an interrupt and for which src and other information
>>>>>>>> like address. So i think to provide the similar information in the
>>>>>>>> devcoredump.
>>>>>>>>
>>>>>>>> Currently we do not have all this information from either job or vm
>>>>>>>> being derived from the job during a reset. We surely could add more
>>>>>>>> relevant information later on as per request but this information is
>>>>>>>> useful as
>>>>>>>> eventually its developers only who would use the dump file provided
>>>>>>>> by customer to debug.
>>>>>>>>
>>>>>>>> Below is the information that i dump in devcore and i feel that is
>>>>>>>> good information but new information could be added which could be
>>>>>>>> picked later.
>>>>>>>>
>>>>>>>>> Page fault information
>>>>>>>>> [gfxhub] page fault (src_id:0 ring:24 vmid:3 pasid:32773)
>>>>>>>>> in page starting at address 0x0000000000000000 from client 0x1b
>>>>>>>>> (UTCL2)
>>>>>>> This is a perfect example what I mean. You record in the patch is the
>>>>>>> client_id, but this is is basically meaningless unless you have access
>>>>>>> to the AMD internal hw documentation.
>>>>>>>
>>>>>>> What you really need is the client in decoded form, in this case
>>>>>>> UTCL2. You can keep the client_id additionally, but the decoded client
>>>>>>> string is mandatory to have I think.
>>>>>>>
>>>>>>> Sure i am capturing that information as i am trying to minimise the
>>>>>>> memory interaction to minimum as we are still in interrupt context
>>>>>>> here that why i recorded the integer information compared to decoding
>>>>>> and writing strings there itself but to postpone till we dump.
>>>>>>
>>>>>> Like decoding to the gfxhub/mmhub based on vmhub/vmid_src and client
>>>>>> string from client id. So are we good to go with the information with
>>>>>> the above information of sharing details in devcoredump using the
>>>>>> additional information from pagefault cached.
>>>>> I think amdgpu_vm_fault_info() has everything you need already (vmhub,
>>>>> status, and addr).  client_id and src_id are just tokens in the
>>>>> interrupt cookie so we know which IP to route the interrupt to. We
>>>>> know what they will be because otherwise we'd be in the interrupt
>>>>> handler for a different IP.  I don't think ring_id has any useful
>>>>> information in this context and vmid and pasid are probably not too
>>>>> useful because they are just tokens to associate the fault with a
>>>>> process.  It would be better to have the process name.
>>> Just to share context here Alex, i am preparing this for devcoredump, my
>>> intention was to replicate the information which in KMD we are sharing
>>> in Dmesg for page faults. If assuming we do not add client id specially
>>> we would not be able to share enough information in devcoredump.
>>> It would be just address and hub(gfxhub/mmhub) and i think that is
>>> partial information as src id and client id and ip block shares good
>>> information.
>> We also need to include the status register value.  That contains the
>> important information (type of access, fault type, client, etc.).
>> Client_id and src_id are only used to route the interrupt to the right
>> software code.  E.g., a different client_id and src_id would be a
>> completely different interrupt (e.g., vblank or fence, etc.).  For GPU
>> page faults the client_id and src_id will always be the same.
>>
>> The devcoredump should also include information about the GPU itself
>> as well (e.g., PCI DID/VID, maybe some of the relevant IP versions).
> We already have "status" which is register "GCVM_L2_PROTECTION_FAULT_STATUS". But the problem here is this all needs to be captured in interrupt context which i want to avoid and this is family specific calls.
>
> chip family would also be good.  And also vram size.
>
> If we have a way to identify the chip and we have the vm status
> register and vm fault address, we can decode all of the fault
> information.
>
> In this patch i am focusing on page fault specific information only[taking one at a time]. But eventually will be adding more information as per the devcoredump JIRA plan. will keep this in todo too for other information that you suggested.

Regards

Sunil

> Alex
>
>> Alex
>>
>>> For process related information we are capturing that information part
>>> of dump from existing functionality.
>>> **** AMDGPU Device Coredump ****
>>> version: 1
>>> kernel: 6.7.0-amd-staging-drm-next
>>> module: amdgpu
>>> time: 45.084775181
>>> process_name: soft_recovery_p PID: 1780
>>>
>>> Ring timed out details
>>> IP Type: 0 Ring Name: gfx_0.0.0
>>>
>>> Page fault information
>>> [gfxhub] page fault (src_id:0 ring:24 vmid:3 pasid:32773)
>>> in page starting at address 0x0000000000000000 from client 0x1b (UTCL2)
>>> VRAM is lost due to GPU reset!
>>>
>>> Regards
>>> Sunil
>>>
>>>> The decoded client name would be really useful I think since the fault
>>>> handled is a catch all and handles a whole bunch of different clients.
>>>>
>>>> But that should be ideally passed in as const string instead of the hw
>>>> generation specific client_id.
>>>>
>>>> As long as it's only a pointer we also don't run into the trouble that
>>>> we need to allocate memory for it.
>>> I agree but i prefer adding the client id and decoding it in devcorecump
>>> using soc15_ih_clientid_name[fault_info->client_id]) is better else we
>>> have to do an sprintf this string to fault_info in irq context which is
>>> writing more bytes to memory i guess compared to an integer:)
>>>
>>> We can argue on values like pasid and vmid and ring id to be taken off
>>> if they are totally not useful.
>>>
>>> Regards
>>> Sunil
>>>
>>>> Christian.
>>>>
>>>>> Alex
>>>>>
>>>>>> regards
>>>>>> sunil
>>>>>>
>>>>>>> Regards,
>>>>>>> Christian.
>>>>>>>
>>>>>>>> Regards
>>>>>>>> Sunil Khatri
>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>> Christian.
>>>>>>>>>
>>>>>>>>>> Add all such information in the last cached
>>>>>>>>>> pagefault from an interrupt handler.
>>>>>>>>>>
>>>>>>>>>> Signed-off-by: Sunil Khatri <sunil.khatri@....com>
>>>>>>>>>> ---
>>>>>>>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 9 +++++++--
>>>>>>>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h | 7 ++++++-
>>>>>>>>>>     drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c | 2 +-
>>>>>>>>>>     drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c | 2 +-
>>>>>>>>>>     drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c  | 2 +-
>>>>>>>>>>     drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c  | 2 +-
>>>>>>>>>>     drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c  | 2 +-
>>>>>>>>>>     7 files changed, 18 insertions(+), 8 deletions(-)
>>>>>>>>>>
>>>>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>>>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>>>>>>>> index 4299ce386322..b77e8e28769d 100644
>>>>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>>>>>>>> @@ -2905,7 +2905,7 @@ void amdgpu_debugfs_vm_bo_info(struct
>>>>>>>>>> amdgpu_vm *vm, struct seq_file *m)
>>>>>>>>>>      * Cache the fault info for later use by userspace in debugging.
>>>>>>>>>>      */
>>>>>>>>>>     void amdgpu_vm_update_fault_cache(struct amdgpu_device *adev,
>>>>>>>>>> -                  unsigned int pasid,
>>>>>>>>>> +                  struct amdgpu_iv_entry *entry,
>>>>>>>>>>                       uint64_t addr,
>>>>>>>>>>                       uint32_t status,
>>>>>>>>>>                       unsigned int vmhub)
>>>>>>>>>> @@ -2915,7 +2915,7 @@ void amdgpu_vm_update_fault_cache(struct
>>>>>>>>>> amdgpu_device *adev,
>>>>>>>>>> xa_lock_irqsave(&adev->vm_manager.pasids, flags);
>>>>>>>>>>     -    vm = xa_load(&adev->vm_manager.pasids, pasid);
>>>>>>>>>> +    vm = xa_load(&adev->vm_manager.pasids, entry->pasid);
>>>>>>>>>>         /* Don't update the fault cache if status is 0.  In the
>>>>>>>>>> multiple
>>>>>>>>>>          * fault case, subsequent faults will return a 0 status
>>>>>>>>>> which is
>>>>>>>>>>          * useless for userspace and replaces the useful fault
>>>>>>>>>> status, so
>>>>>>>>>> @@ -2924,6 +2924,11 @@ void amdgpu_vm_update_fault_cache(struct
>>>>>>>>>> amdgpu_device *adev,
>>>>>>>>>>         if (vm && status) {
>>>>>>>>>>             vm->fault_info.addr = addr;
>>>>>>>>>>             vm->fault_info.status = status;
>>>>>>>>>> +        vm->fault_info.client_id = entry->client_id;
>>>>>>>>>> +        vm->fault_info.src_id = entry->src_id;
>>>>>>>>>> +        vm->fault_info.vmid = entry->vmid;
>>>>>>>>>> +        vm->fault_info.pasid = entry->pasid;
>>>>>>>>>> +        vm->fault_info.ring_id = entry->ring_id;
>>>>>>>>>>             if (AMDGPU_IS_GFXHUB(vmhub)) {
>>>>>>>>>>                 vm->fault_info.vmhub = AMDGPU_VMHUB_TYPE_GFX;
>>>>>>>>>>                 vm->fault_info.vmhub |=
>>>>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
>>>>>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
>>>>>>>>>> index 047ec1930d12..c7782a89bdb5 100644
>>>>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
>>>>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
>>>>>>>>>> @@ -286,6 +286,11 @@ struct amdgpu_vm_fault_info {
>>>>>>>>>>         uint32_t    status;
>>>>>>>>>>         /* which vmhub? gfxhub, mmhub, etc. */
>>>>>>>>>>         unsigned int    vmhub;
>>>>>>>>>> +    unsigned int    client_id;
>>>>>>>>>> +    unsigned int    src_id;
>>>>>>>>>> +    unsigned int    ring_id;
>>>>>>>>>> +    unsigned int    pasid;
>>>>>>>>>> +    unsigned int    vmid;
>>>>>>>>>>     };
>>>>>>>>>>       struct amdgpu_vm {
>>>>>>>>>> @@ -605,7 +610,7 @@ static inline void
>>>>>>>>>> amdgpu_vm_eviction_unlock(struct amdgpu_vm *vm)
>>>>>>>>>>     }
>>>>>>>>>>       void amdgpu_vm_update_fault_cache(struct amdgpu_device *adev,
>>>>>>>>>> -                  unsigned int pasid,
>>>>>>>>>> +                  struct amdgpu_iv_entry *entry,
>>>>>>>>>>                       uint64_t addr,
>>>>>>>>>>                       uint32_t status,
>>>>>>>>>>                       unsigned int vmhub);
>>>>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
>>>>>>>>>> b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
>>>>>>>>>> index d933e19e0cf5..6b177ce8db0e 100644
>>>>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
>>>>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
>>>>>>>>>> @@ -150,7 +150,7 @@ static int gmc_v10_0_process_interrupt(struct
>>>>>>>>>> amdgpu_device *adev,
>>>>>>>>>>             status = RREG32(hub->vm_l2_pro_fault_status);
>>>>>>>>>>             WREG32_P(hub->vm_l2_pro_fault_cntl, 1, ~1);
>>>>>>>>>>     -        amdgpu_vm_update_fault_cache(adev, entry->pasid, addr,
>>>>>>>>>> status,
>>>>>>>>>> +        amdgpu_vm_update_fault_cache(adev, entry, addr, status,
>>>>>>>>>>                              entry->vmid_src ? AMDGPU_MMHUB0(0) :
>>>>>>>>>> AMDGPU_GFXHUB(0));
>>>>>>>>>>         }
>>>>>>>>>>     diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c
>>>>>>>>>> b/drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c
>>>>>>>>>> index 527dc917e049..bcf254856a3e 100644
>>>>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c
>>>>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c
>>>>>>>>>> @@ -121,7 +121,7 @@ static int gmc_v11_0_process_interrupt(struct
>>>>>>>>>> amdgpu_device *adev,
>>>>>>>>>>             status = RREG32(hub->vm_l2_pro_fault_status);
>>>>>>>>>>             WREG32_P(hub->vm_l2_pro_fault_cntl, 1, ~1);
>>>>>>>>>>     -        amdgpu_vm_update_fault_cache(adev, entry->pasid, addr,
>>>>>>>>>> status,
>>>>>>>>>> +        amdgpu_vm_update_fault_cache(adev, entry, addr, status,
>>>>>>>>>>                              entry->vmid_src ? AMDGPU_MMHUB0(0) :
>>>>>>>>>> AMDGPU_GFXHUB(0));
>>>>>>>>>>         }
>>>>>>>>>>     diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
>>>>>>>>>> b/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
>>>>>>>>>> index 3da7b6a2b00d..e9517ebbe1fd 100644
>>>>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
>>>>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
>>>>>>>>>> @@ -1270,7 +1270,7 @@ static int gmc_v7_0_process_interrupt(struct
>>>>>>>>>> amdgpu_device *adev,
>>>>>>>>>>         if (!addr && !status)
>>>>>>>>>>             return 0;
>>>>>>>>>>     -    amdgpu_vm_update_fault_cache(adev, entry->pasid,
>>>>>>>>>> +    amdgpu_vm_update_fault_cache(adev, entry,
>>>>>>>>>>                          ((u64)addr) << AMDGPU_GPU_PAGE_SHIFT,
>>>>>>>>>> status, AMDGPU_GFXHUB(0));
>>>>>>>>>>           if (amdgpu_vm_fault_stop == AMDGPU_VM_FAULT_STOP_FIRST)
>>>>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
>>>>>>>>>> b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
>>>>>>>>>> index d20e5f20ee31..a271bf832312 100644
>>>>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
>>>>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
>>>>>>>>>> @@ -1438,7 +1438,7 @@ static int gmc_v8_0_process_interrupt(struct
>>>>>>>>>> amdgpu_device *adev,
>>>>>>>>>>         if (!addr && !status)
>>>>>>>>>>             return 0;
>>>>>>>>>>     -    amdgpu_vm_update_fault_cache(adev, entry->pasid,
>>>>>>>>>> +    amdgpu_vm_update_fault_cache(adev, entry,
>>>>>>>>>>                          ((u64)addr) << AMDGPU_GPU_PAGE_SHIFT,
>>>>>>>>>> status, AMDGPU_GFXHUB(0));
>>>>>>>>>>           if (amdgpu_vm_fault_stop == AMDGPU_VM_FAULT_STOP_FIRST)
>>>>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
>>>>>>>>>> b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
>>>>>>>>>> index 47b63a4ce68b..dc9fb1fb9540 100644
>>>>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
>>>>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
>>>>>>>>>> @@ -666,7 +666,7 @@ static int gmc_v9_0_process_interrupt(struct
>>>>>>>>>> amdgpu_device *adev,
>>>>>>>>>>         rw = REG_GET_FIELD(status, VM_L2_PROTECTION_FAULT_STATUS,
>>>>>>>>>> RW);
>>>>>>>>>>         WREG32_P(hub->vm_l2_pro_fault_cntl, 1, ~1);
>>>>>>>>>>     -    amdgpu_vm_update_fault_cache(adev, entry->pasid, addr,
>>>>>>>>>> status, vmhub);
>>>>>>>>>> +    amdgpu_vm_update_fault_cache(adev, entry, addr, status,
>>>>>>>>>> vmhub);
>>>>>>>>>>           dev_err(adev->dev,
>>>>>>>>>>             "VM_L2_PROTECTION_FAULT_STATUS:0x%08X\n",