lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID:
 <PH0PR12MB54811346FB97E174AD134F88DC3C2@PH0PR12MB5481.namprd12.prod.outlook.com>
Date: Thu, 4 Apr 2024 18:55:09 +0000
From: Parav Pandit <parav@...dia.com>
To: David Wei <dw@...idwei.uk>, "netdev@...r.kernel.org"
	<netdev@...r.kernel.org>, "davem@...emloft.net" <davem@...emloft.net>,
	"edumazet@...gle.com" <edumazet@...gle.com>, "kuba@...nel.org"
	<kuba@...nel.org>, "pabeni@...hat.com" <pabeni@...hat.com>, "corbet@....net"
	<corbet@....net>, "kalesh-anakkur.purayil@...adcom.com"
	<kalesh-anakkur.purayil@...adcom.com>
CC: Saeed Mahameed <saeedm@...dia.com>, "leon@...nel.org" <leon@...nel.org>,
	"jiri@...nulli.us" <jiri@...nulli.us>, Shay Drori <shayd@...dia.com>, Dan
 Jurgens <danielj@...dia.com>, Dima Chumak <dchumak@...dia.com>,
	"linux-doc@...r.kernel.org" <linux-doc@...r.kernel.org>,
	"linux-rdma@...r.kernel.org" <linux-rdma@...r.kernel.org>, Jiri Pirko
	<jiri@...dia.com>
Subject: RE: [net-next v3 1/2] devlink: Support setting max_io_eqs



> From: David Wei <dw@...idwei.uk>
> Sent: Friday, April 5, 2024 12:23 AM
> 
> On 2024-04-04 11:50, Parav Pandit wrote:
> >
> >
> >> From: David Wei <dw@...idwei.uk>
> >> Sent: Friday, April 5, 2024 12:10 AM
> >>
> >> On 2024-04-04 10:58, Parav Pandit wrote:
> >>> Hi David,
> >>>
> >>>> From: David Wei <dw@...idwei.uk>
> >>>> Sent: Thursday, April 4, 2024 10:29 PM
> >>>>
> >>>> On 2024-04-03 10:41, Parav Pandit wrote:
> >>>>> Many devices send event notifications for the IO queues, such as
> >>>>> tx and rx queues, through event queues.
> >>>>>
> >>>>> Enable a privileged owner, such as a hypervisor PF, to set the
> >>>>> number of IO event queues for the VF and SF during the
> >>>>> provisioning
> >> stage.
> >>>>>
> >>>>> example:
> >>>>> Get maximum IO event queues of the VF device::
> >>>>>
> >>>>>   $ devlink port show pci/0000:06:00.0/2
> >>>>>   pci/0000:06:00.0/2: type eth netdev enp6s0pf0vf1 flavour pcivf
> >>>>> pfnum 0
> >>>> vfnum 1
> >>>>>       function:
> >>>>>           hw_addr 00:00:00:00:00:00 ipsec_packet disabled
> >>>>> max_io_eqs
> >>>>> 10
> >>>>>
> >>>>> Set maximum IO event queues of the VF device::
> >>>>>
> >>>>>   $ devlink port function set pci/0000:06:00.0/2 max_io_eqs 32
> >>>>>
> >>>>>   $ devlink port show pci/0000:06:00.0/2
> >>>>>   pci/0000:06:00.0/2: type eth netdev enp6s0pf0vf1 flavour pcivf
> >>>>> pfnum 0
> >>>> vfnum 1
> >>>>>       function:
> >>>>>           hw_addr 00:00:00:00:00:00 ipsec_packet disabled
> >>>>> max_io_eqs
> >>>>> 32
> >>>>>
> >>>>> Reviewed-by: Jiri Pirko <jiri@...dia.com>
> >>>>> Reviewed-by: Shay Drory <shayd@...dia.com>
> >>>>> Signed-off-by: Parav Pandit <parav@...dia.com>
> >>>>> ---
> >>>>> changelog:
> >>>>> v2->v3:
> >>>>> - limited 80 chars per line
> >>>>> v1->v2:
> >>>>> - limited comment to 80 chars per line in header file
> >>>>> ---
> >>>>>  .../networking/devlink/devlink-port.rst       | 25 +++++++++
> >>>>>  include/net/devlink.h                         | 14 +++++
> >>>>>  include/uapi/linux/devlink.h                  |  1 +
> >>>>>  net/devlink/port.c                            | 53 +++++++++++++++++++
> >>>>>  4 files changed, 93 insertions(+)
> >>>>>
> >>>>> diff --git a/Documentation/networking/devlink/devlink-port.rst
> >>>>> b/Documentation/networking/devlink/devlink-port.rst
> >>>>> index 562f46b41274..451f57393f11 100644
> >>>>> --- a/Documentation/networking/devlink/devlink-port.rst
> >>>>> +++ b/Documentation/networking/devlink/devlink-port.rst
> >>>>> @@ -134,6 +134,9 @@ Users may also set the IPsec crypto capability
> >>>>> of the function using  Users may also set the IPsec packet
> >>>>> capability of the function using  `devlink port function set
> >>>>> ipsec_packet`
> >> command.
> >>>>>
> >>>>> +Users may also set the maximum IO event queues of the function
> >>>>> +using `devlink port function set max_io_eqs` command.
> >>>>> +
> >>>>>  Function attributes
> >>>>>  ===================
> >>>>>
> >>>>> @@ -295,6 +298,28 @@ policy is processed in software by the kernel.
> >>>>>          function:
> >>>>>              hw_addr 00:00:00:00:00:00 ipsec_packet enabled
> >>>>>
> >>>>> +Maximum IO events queues setup
> >>>>> +------------------------------
> >>>>> +When user sets maximum number of IO event queues for a SF or a
> >>>>> +VF, such function driver is limited to consume only enforced
> >>>>> +number of IO event queues.
> >>>>> +
> >>>>> +- Get maximum IO event queues of the VF device::
> >>>>> +
> >>>>> +    $ devlink port show pci/0000:06:00.0/2
> >>>>> +    pci/0000:06:00.0/2: type eth netdev enp6s0pf0vf1 flavour
> >>>>> + pcivf pfnum
> >>>> 0 vfnum 1
> >>>>> +        function:
> >>>>> +            hw_addr 00:00:00:00:00:00 ipsec_packet disabled
> >>>>> + max_io_eqs 10
> >>>>> +
> >>>>> +- Set maximum IO event queues of the VF device::
> >>>>> +
> >>>>> +    $ devlink port function set pci/0000:06:00.0/2 max_io_eqs 32
> >>>>> +
> >>>>> +    $ devlink port show pci/0000:06:00.0/2
> >>>>> +    pci/0000:06:00.0/2: type eth netdev enp6s0pf0vf1 flavour
> >>>>> + pcivf pfnum
> >>>> 0 vfnum 1
> >>>>> +        function:
> >>>>> +            hw_addr 00:00:00:00:00:00 ipsec_packet disabled
> >>>>> + max_io_eqs 32
> >>>>> +
> >>>>>  Subfunction
> >>>>>  ============
> >>>>>
> >>>>> diff --git a/include/net/devlink.h b/include/net/devlink.h index
> >>>>> 9ac394bdfbe4..bb1af599d101 100644
> >>>>> --- a/include/net/devlink.h
> >>>>> +++ b/include/net/devlink.h
> >>>>> @@ -1602,6 +1602,14 @@ void devlink_free(struct devlink *devlink);
> >>>>>   *			      capability. Should be used by device
> drivers to
> >>>>>   *			      enable/disable ipsec_packet capability of
> a
> >>>>>   *			      function managed by the devlink port.
> >>>>> + * @port_fn_max_io_eqs_get: Callback used to get port function's
> >>>> maximum number
> >>>>> + *			    of event queues. Should be used by device
> drivers
> >>>> to
> >>>>> + *			    report the maximum event queues of a
> function
> >>>>> + *			    managed by the devlink port.
> >>>>> + * @port_fn_max_io_eqs_set: Callback used to set port function's
> >>>> maximum number
> >>>>> + *			    of event queues. Should be used by device
> drivers
> >>>> to
> >>>>> + *			    configure maximum number of event
> queues
> >>>>> + *			    of a function managed by the devlink port.
> >>>>>   *
> >>>>>   * Note: Driver should return -EOPNOTSUPP if it doesn't support
> >>>>>   * port function (@port_fn_*) handling for a particular port.
> >>>>> @@ -1651,6 +1659,12 @@ struct devlink_port_ops {
> >>>>>  	int (*port_fn_ipsec_packet_set)(struct devlink_port *devlink_port,
> >>>>>  					bool enable,
> >>>>>  					struct netlink_ext_ack *extack);
> >>>>> +	int (*port_fn_max_io_eqs_get)(struct devlink_port
> *devlink_port,
> >>>>> +				      u32 *max_eqs,
> >>>>> +				      struct netlink_ext_ack *extack);
> >>>>> +	int (*port_fn_max_io_eqs_set)(struct devlink_port
> *devlink_port,
> >>>>> +				      u32 max_eqs,
> >>>>> +				      struct netlink_ext_ack *extack);
> >>>>>  };
> >>>>>
> >>>>>  void devlink_port_init(struct devlink *devlink, diff --git
> >>>>> a/include/uapi/linux/devlink.h b/include/uapi/linux/devlink.h
> >>>>> index
> >>>>> 2da0c7eb6710..9401aa343673 100644
> >>>>> --- a/include/uapi/linux/devlink.h
> >>>>> +++ b/include/uapi/linux/devlink.h
> >>>>> @@ -686,6 +686,7 @@ enum devlink_port_function_attr {
> >>>>>  	DEVLINK_PORT_FN_ATTR_OPSTATE,	/* u8 */
> >>>>>  	DEVLINK_PORT_FN_ATTR_CAPS,	/* bitfield32 */
> >>>>>  	DEVLINK_PORT_FN_ATTR_DEVLINK,	/* nested */
> >>>>> +	DEVLINK_PORT_FN_ATTR_MAX_IO_EQS,	/* u32 */
> >>>>>
> >>>>>  	__DEVLINK_PORT_FUNCTION_ATTR_MAX,
> >>>>>  	DEVLINK_PORT_FUNCTION_ATTR_MAX =
> >>>> __DEVLINK_PORT_FUNCTION_ATTR_MAX -
> >>>>> 1 diff --git a/net/devlink/port.c b/net/devlink/port.c index
> >>>>> 118d130d2afd..be9158b4453c 100644
> >>>>> --- a/net/devlink/port.c
> >>>>> +++ b/net/devlink/port.c
> >>>>> @@ -16,6 +16,7 @@ static const struct nla_policy
> >>>> devlink_function_nl_policy[DEVLINK_PORT_FUNCTION_
> >>>>>  				 DEVLINK_PORT_FN_STATE_ACTIVE),
> >>>>>  	[DEVLINK_PORT_FN_ATTR_CAPS] =
> >>>>>
> >>>> 	NLA_POLICY_BITFIELD32(DEVLINK_PORT_FN_CAPS_VALID_MASK),
> >>>>> +	[DEVLINK_PORT_FN_ATTR_MAX_IO_EQS] = { .type =
> NLA_U32 },
> >>>>>  };
> >>>>>
> >>>>>  #define ASSERT_DEVLINK_PORT_REGISTERED(devlink_port)
> >>>> 		\
> >>>>> @@ -182,6 +183,30 @@ static int devlink_port_fn_caps_fill(struct
> >>>> devlink_port *devlink_port,
> >>>>>  	return 0;
> >>>>>  }
> >>>>>
> >>>>> +static int devlink_port_fn_max_io_eqs_fill(struct devlink_port *port,
> >>>>> +					   struct sk_buff *msg,
> >>>>> +					   struct netlink_ext_ack
> *extack,
> >>>>> +					   bool *msg_updated)
> >>>>> +{
> >>>>> +	u32 max_io_eqs;
> >>>>> +	int err;
> >>>>> +
> >>>>> +	if (!port->ops->port_fn_max_io_eqs_get)
> >>>>> +		return 0;
> >>>>> +
> >>>>> +	err = port->ops->port_fn_max_io_eqs_get(port,
> &max_io_eqs,
> >>>> extack);
> >>>>> +	if (err) {
> >>>>> +		if (err == -EOPNOTSUPP)
> >>>>> +			return 0;
> >>>>
> >>>> Docs above says:
> >>>>    * Note: Driver should return -EOPNOTSUPP if it doesn't support
> >>>>    * port function (@port_fn_*) handling for a particular port.
> >>>>
> >>>> But here you're returning 0 in both cases of no
> >>>> port_fn_max_io_eqs_get or
> >>>> port_fn_max_io_eqs_get() returns EOPNOTSUPP.
> >>>>
> >>> When the port does not support this op, the function pointer is null
> >>> and, 0
> >> is returned as expected.
> >>>
> >>> When the port for some reason has the ops function pointer set for a
> >>> port,
> >> but if the port does not support the ops, it will return ENOPNOTSUPP.
> >>> This may be possible when the driver has chosen to use same ops
> >>> callback
> >> structure for multiple port flavors.
> >>>
> >>> This code pattern is likely left over code of relatively recent work
> >>> that
> >> moved ops from devlink instance to per port ops.
> >>> I propose to keep the current check as done in this patch, and run a
> >>> full audit of all the drivers, if all drivers have moved to per port
> >>> ops, then
> >> simplify the code to drop the check for EOPNOTSUPP in a new series
> >> that may touch more drivers.
> >>> Otherwise, we may end up failing the port show operation when it
> >>> returns
> >> - ENOPNOTSUPP.
> >>
> >> Thanks for the explanation. So ideally each port flavour has its own
> >> unique set of struct ops, and if something is not supported then
> >> don't set the func ptr in the struct ops.
> >>
> >> Yes, I see that 0 has to be returned for devlink_port_fn_caps_fill() to
> succeed.
> >>
> >> Had a brief look and there's only a handful of drivers (mlx, nfp,
> >> ice) that use devlink_port_ops.
> >>
> > And netdevsim too. :)
> > Can we please the cleanup in a separate series?
> > Post this may be later in the month, I may have cycles to audit and simplify.
> > Would it be ok with you?
> 
> Yeah of course, thanks. I'm making some changes in netdevsim and can do a
> drive-by cleanup there too.
>
Ok. yes, that will be helpful. Thanks.
 
> >
> >>>
> >>>>> +		return err;
> >>>>> +	}
> >>>>> +	err = nla_put_u32(msg,
> DEVLINK_PORT_FN_ATTR_MAX_IO_EQS,
> >>>> max_io_eqs);
> >>>>> +	if (err)
> >>>>> +		return err;
> >>>>> +	*msg_updated = true;
> >>>>> +	return 0;
> >>>>> +}
> >>>>> +
> >>>>>  int devlink_nl_port_handle_fill(struct sk_buff *msg, struct
> >>>>> devlink_port *devlink_port)  {
> >>>>>  	if (devlink_nl_put_handle(msg, devlink_port->devlink)) @@ -409,6
> >>>>> +434,18 @@ static int devlink_port_fn_caps_set(struct devlink_port
> >>>> *devlink_port,
> >>>>>  	return 0;
> >>>>>  }
> >>>>>
> >>>>> +static int
> >>>>> +devlink_port_fn_max_io_eqs_set(struct devlink_port *devlink_port,
> >>>>> +			       const struct nlattr *attr,
> >>>>> +			       struct netlink_ext_ack *extack) {
> >>>>> +	u32 max_io_eqs;
> >>>>> +
> >>>>> +	max_io_eqs = nla_get_u32(attr);
> >>>>> +	return devlink_port->ops-
> >port_fn_max_io_eqs_set(devlink_port,
> >>>>> +							 max_io_eqs,
> extack);
> >>>>> +}
> >>>>> +
> >>>>>  static int
> >>>>>  devlink_nl_port_function_attrs_put(struct sk_buff *msg, struct
> >>>> devlink_port *port,
> >>>>>  				   struct netlink_ext_ack *extack) @@ -428,6
> >>>> +465,9 @@
> >>>>> devlink_nl_port_function_attrs_put(struct sk_buff *msg, struct
> >>>> devlink_port *por
> >>>>>  	if (err)
> >>>>>  		goto out;
> >>>>>  	err = devlink_port_fn_state_fill(port, msg, extack,
> >>>>> &msg_updated);
> >>>>> +	if (err)
> >>>>> +		goto out;
> >>>>> +	err = devlink_port_fn_max_io_eqs_fill(port, msg, extack,
> >>>>> +&msg_updated);
> >>>>>  	if (err)
> >>>>>  		goto out;
> >>>>>  	err = devlink_rel_devlink_handle_put(msg, port->devlink, @@
> >>>>> -726,6
> >>>>> +766,12 @@ static int devlink_port_function_validate(struct
> >>>>> +devlink_port
> >>>> *devlink_port,
> >>>>>  			}
> >>>>>  		}
> >>>>>  	}
> >>>>> +	if (tb[DEVLINK_PORT_FN_ATTR_MAX_IO_EQS] &&
> >>>>> +	    !ops->port_fn_max_io_eqs_set) {
> >>>>> +		NL_SET_ERR_MSG_ATTR(extack,
> >>>> tb[DEVLINK_PORT_FN_ATTR_MAX_IO_EQS],
> >>>>> +				    "Function does not support
> max_io_eqs
> >>>> setting");
> >>>>> +		return -EOPNOTSUPP;
> >>>>> +	}
> >>>>>  	return 0;
> >>>>>  }
> >>>>>
> >>>>> @@ -761,6 +807,13 @@ static int devlink_port_function_set(struct
> >>>> devlink_port *port,
> >>>>>  			return err;
> >>>>>  	}
> >>>>>
> >>>>> +	attr = tb[DEVLINK_PORT_FN_ATTR_MAX_IO_EQS];
> >>>>> +	if (attr) {
> >>>>> +		err = devlink_port_fn_max_io_eqs_set(port, attr,
> extack);
> >>>>> +		if (err)
> >>>>> +			return err;
> >>>>> +	}
> >>>>> +
> >>>>>  	/* Keep this as the last function attribute set, so that when
> >>>>>  	 * multiple port function attributes are set along with state,
> >>>>>  	 * Those can be applied first before activating the state.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ