lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <3297a59b-a788-43aa-945d-e89592c9ba8d@intel.com>
Date: Fri, 16 Jan 2026 11:26:36 +0530
From: Riana Tauro <riana.tauro@...el.com>
To: Zack McKevitt <zachary.mckevitt@....qualcomm.com>, Rodrigo Vivi
	<rodrigo.vivi@...el.com>
CC: Jakub Kicinski <kuba@...nel.org>, <intel-xe@...ts.freedesktop.org>,
	<dri-devel@...ts.freedesktop.org>, <aravind.iddamsetty@...ux.intel.com>,
	<anshuman.gupta@...el.com>, <joonas.lahtinen@...ux.intel.com>,
	<lukas@...ner.de>, <simona.vetter@...ll.ch>, <airlied@...il.com>,
	<pratik.bari@...el.com>, <joshua.santosh.ranjan@...el.com>,
	<ashwin.kumar.kulkarni@...el.com>, <shubham.kumar@...el.com>, Lijo Lazar
	<lijo.lazar@....com>, Hawking Zhang <Hawking.Zhang@....com>, "David S.
 Miller" <davem@...emloft.net>, Paolo Abeni <pabeni@...hat.com>, Eric Dumazet
	<edumazet@...gle.com>, <netdev@...r.kernel.org>, Jeff Hugo
	<jeff.hugo@....qualcomm.com>
Subject: Re: [PATCH v3 1/4] drm/ras: Introduce the DRM RAS infrastructure over
 generic netlink



On 1/16/2026 5:09 AM, Zack McKevitt wrote:
> 
> 
> On 1/13/2026 1:20 AM, Riana Tauro wrote:
>>>>>> diff --git a/Documentation/netlink/specs/drm_ras.yaml b/ 
>>>>>> Documentation/netlink/specs/drm_ras.yaml
>>>>>> new file mode 100644
>>>>>> index 000000000000..be0e379c5bc9
>>>>>> --- /dev/null
>>>>>> +++ b/Documentation/netlink/specs/drm_ras.yaml
>>>>>> @@ -0,0 +1,130 @@
>>>>>> +# SPDX-License-Identifier: ((GPL-2.0 WITH Linux-syscall-note) OR 
>>>>>> BSD-3-Clause)
>>>>>> +---
>>>>>> +name: drm-ras
>>>>>> +protocol: genetlink
>>>>>> +uapi-header: drm/drm_ras.h
>>>>>> +
>>>>>> +doc: >-
>>>>>> +  DRM RAS (Reliability, Availability, Serviceability) over 
>>>>>> Generic Netlink.
>>>>>> +  Provides a standardized mechanism for DRM drivers to register 
>>>>>> "nodes"
>>>>>> +  representing hardware/software components capable of reporting 
>>>>>> error counters.
>>>>>> +  Userspace tools can query the list of nodes or individual error 
>>>>>> counters
>>>>>> +  via the Generic Netlink interface.
>>>>>> +
>>>>>> +definitions:
>>>>>> +  -
>>>>>> +    type: enum
>>>>>> +    name: node-type
>>>>>> +    value-start: 1
>>>>>> +    entries: [error-counter]
>>>>>> +    doc: >-
>>>>>> +         Type of the node. Currently, only error-counter nodes are
>>>>>> +         supported, which expose reliability counters for a 
>>>>>> hardware/software
>>>>>> +         component.
>>>>>> +
>>>>>> +attribute-sets:
>>>>>> +  -
>>>>>> +    name: node-attrs
>>>>>> +    attributes:
>>>>>> +      -
>>>>>> +        name: node-id
>>>>>> +        type: u32
>>>>>> +        doc: >-
>>>>>> +             Unique identifier for the node.
>>>>>> +             Assigned dynamically by the DRM RAS core upon 
>>>>>> registration.
>>>>>> +      -
>>>>>> +        name: device-name
>>>>>> +        type: string
>>>>>> +        doc: >-
>>>>>> +             Device name chosen by the driver at registration.
>>>>>> +             Can be a PCI BDF, UUID, or module name if unique.
>>>>>> +      -
>>>>>> +        name: node-name
>>>>>> +        type: string
>>>>>> +        doc: >-
>>>>>> +             Node name chosen by the driver at registration.
>>>>>> +             Can be an IP block name, or any name that identifies 
>>>>>> the
>>>>>> +             RAS node inside the device.
>>>>>> +      -
>>>>>> +        name: node-type
>>>>>> +        type: u32
>>>>>> +        doc: Type of this node, identifying its function.
>>>>>> +        enum: node-type
>>>>>> +  -
>>>>>> +    name: error-counter-attrs
>>>>>> +    attributes:
>>>>>> +      -
>>>>>> +        name: node-id
>>>>>> +        type: u32
>>>>>> +        doc:  Node ID targeted by this error counter operation.
>>>>>> +      -
>>>>>> +        name: error-id
>>>>>> +        type: u32
>>>>>> +        doc: Unique identifier for a specific error counter 
>>>>>> within an node.
>>>>>> +      -
>>>>>> +        name: error-name
>>>>>> +        type: string
>>>>>> +        doc: Name of the error.
>>>>>> +      -
>>>>>> +        name: error-value
>>>>>> +        type: u32
>>>>>> +        doc: Current value of the requested error counter.
>>>>>> +
>>>>>> +operations:
>>>>>> +  list:
>>>>>> +    -
>>>>>> +      name: list-nodes
>>>>>> +      doc: >-
>>>>>> +           Retrieve the full list of currently registered DRM RAS 
>>>>>> nodes.
>>>>>> +           Each node includes its dynamically assigned ID, name, 
>>>>>> and type.
>>>>>> +           **Important:** User space must call this operation 
>>>>>> first to obtain
>>>>>> +           the node IDs. These IDs are required for all subsequent
>>>>>> +           operations on nodes, such as querying error counters.
>>>>
>>>> I am curious about security implications of this design.
>>>
>>> hmm... very good point you are raising here.
>>>
>>> This current design relies entirely in the CAP_NET_ADMIN.
>>> No driver would have the flexibility to choose anything differently.
>>> Please notice that the flag admin-perm is hardcoded in this yaml file.
>>>
>>>> If the complete
>>>> list of RAS nodes is visible for any process on the system (and one 
>>>> wants to
>>>> avoid requiring CAP_NET_ADMIN), there should be some way to enforce
>>>> permission checks when performing these operations if desired.
>>>
>>> Right now, there's no way that the driver would choose not avoid 
>>> requiring
>>> CAP_NET_ADMIN...
>>>
>>> Only way would be the admin to give the cap_net_admin to the tool with:
>>>
>>> $ sudo setcap cap_net_admin+ep /bin/drm_ras_tool
>>>
>>> but not ideal and not granular anyway...
>>>
>>>>
>>>> For example, this might be implemented in the driver's definition of
>>>> callback functions like query_error_counter; some drivers may want 
>>>> to ensure
>>>> that the process can in fact open the file descriptor corresponding 
>>>> to the
>>>> queried device before serving a netlink request. Is it enough for a 
>>>> driver
>>>> to simply return -EPERM in this case? Any driver that doesnt wish to 
>>>> protect
>>>> its RAS nodes need not implement checks in their callbacks.
>>>
>>> Fair enough. If we want to give the option to the drivers, then we need:
>>>
>>> 1. to first remove all the admin-perm flags below and leave the 
>>> driver to
>>> pick up their policy on when to return something or -EPERM.
>>> 2. Document this security responsibility and list a few possibilities.
>>> 3. In our Xe case here I believe the easiest option is to use 
>>> something like:
>>>
>>> struct scm_creds *creds = NETLINK_CREDS(cb->skb);
>>> if (!gid_eq(creds->gid, GLOBAL_ROOT_GID))
>>>      return -EPERM
>>
>> The driver currently does not have access to the callback or the 
>> skbuffer. Sending these details as param to driver won't be right as
>> drm_ras needs to handle all the netlink buffers.
>>
>> How about using pre_doit & start calls? If driver has a pre callback , 
>> it's the responsibility of the driver to check permissions/any-pre 
>> conditions, else the CAP_NET_ADMIN permission will be checked.
>>
>> @Zack / @Rodrigo thoughts?
>> @Zack Will this work for your usecase?
>>
>> yaml
>> +    dump:
>> +        pre: drm-ras-nl-pre-list-nodes
>>
>>
>> drm_ras.c :
>>
>> +       if (node->pre_list_nodes)
>> +                return node->pre_list_nodes(node);
>> +
>> +        return check_permissions(cb->skb);  //Checks creds
>>
>> Thanks
>> Riana
>>
> 
> I agree that a pre_doit is probably the best solution for this.
> 
> Not entirely sure what a driver specific implementation would look like 
> yet, but I think that as long as the driver callback has a way to access 
> the 'current' task_struct pointer corresponding to the user process then 
> this approach seems like the best option from the netlink side.
> 
> Since this is mostly a concern for our specific use case, perhaps we can 
> incorporate this functionality in our change down the road when we 
> expand the interface for telemetry?


Yeah using pre_doit we can allow driver to make decisions based on
the private data or driver specific permissions. However we will need to 
check the CAP_NET_ADMIN when no driver callback is implemented in the 
netlink layer as a default .

Thank you, you can incorporate the changes when you add telemetry nodes.

For now, I will retain the admin-perm in flags.

I will address the rest of the review comments and send out a new 
revision shortly.

Thank you
Riana


> 
> Let me know what you think.
> 
> Zack
> 
>>>
>>> or something like that?!
>>>
>>> perhaps drivers could implement some form of cookie or pre- 
>>> authorization with
>>> ioctls or sysfs, and then store in the priv?
>>>
>>> Thoughts?
>>> Other options?
>>>
>>>>
>>>> I dont see any such permissions checks in your driver implementation 
>>>> which
>>>> is understandable given that it may not be necessary for your use 
>>>> cases.
>>>> However, this would be a concern for our driver if we were to adopt 
>>>> this
>>>> interface.
>>>
>>> yeap, this case was entirely with admin-perm, so not needed at all...
>>> But I see your point and this is really not giving any flexibility to
>>> other drivers.
>>>
> 
>>>>>>
>>>>
>>>> Thanks,
>>>>
>>>> Zack
>>>>
>>
> 


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ