lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20240422113943.736861fc@kernel.org>
Date: Mon, 22 Apr 2024 11:39:43 -0700
From: Jakub Kicinski <kuba@...nel.org>
To: Heng Qi <hengqi@...ux.alibaba.com>
Cc: netdev@...r.kernel.org, virtualization@...ts.linux.dev, "David S .
 Miller" <davem@...emloft.net>, Eric Dumazet <edumazet@...gle.com>, Paolo
 Abeni <pabeni@...hat.com>, Jason Wang <jasowang@...hat.com>, "Michael S .
 Tsirkin" <mst@...hat.com>, Brett Creeley <bcreeley@....com>, Ratheesh
 Kannoth <rkannoth@...vell.com>, Alexander Lobakin
 <aleksander.lobakin@...el.com>, Xuan Zhuo <xuanzhuo@...ux.alibaba.com>, Tal
 Gilboa <talgi@...dia.com>, Jonathan Corbet <corbet@....net>,
 linux-doc@...r.kernel.org, Maxime Chevallier
 <maxime.chevallier@...tlin.com>, Jiri Pirko <jiri@...nulli.us>, Paul
 Greenwalt <paul.greenwalt@...el.com>, Ahmed Zaki <ahmed.zaki@...el.com>,
 Vladimir Oltean <vladimir.oltean@....com>, Kory Maincent
 <kory.maincent@...tlin.com>, Andrew Lunn <andrew@...n.ch>,
 "justinstitt@...gle.com" <justinstitt@...gle.com>
Subject: Re: [PATCH net-next v9 2/4] ethtool: provide customized dim profile
 management

On Mon, 22 Apr 2024 17:00:25 +0800 Heng Qi wrote:
> 在 2024/4/19 上午8:48, Jakub Kicinski 写道:
> > On Wed, 17 Apr 2024 23:55:44 +0800 Heng Qi wrote:  
> >> $ ethtool -c ethx
> >> ...
> >> rx-eqe-profile:
> >> {.usec =   1, .pkts = 256, .comps =   0,},
> >> {.usec =   8, .pkts = 256, .comps =   0,},
> >> {.usec =  64, .pkts = 256, .comps =   0,},
> >> {.usec = 128, .pkts = 256, .comps =   0,},
> >> {.usec = 256, .pkts = 256, .comps =   0,}
> >> rx-cqe-profile:   n/a
> >> tx-eqe-profile:   n/a
> >> tx-cqe-profile:   n/a  
> > I don't think that exposing CQE vs EQE mode makes much sense here.
> > We enable the right mode via:
> >
> > struct kernel_ethtool_coalesce {
> > 	u8 use_cqe_mode_tx;
> > 	u8 use_cqe_mode_rx;  
> 
> I think it is reasonable to use 4 types:
> 
> dim lib itself is designed with 4 independent arrays, which are queried 
> by dim->mode and
> dim->profile_ix. This way allows users to manually modify the cqe mode 
> of tx or rx, and the
> dim interface can automatically match the profile of the corresponding 
> mode when obtaining
> the results.

I'm guessing that DIM has 4 profiles because it was easier to implement
for profiles hardcoded by DIM itself(!) Even for driver-declared
profiles the distinction is a hack.

> Merely exposing rx_profile and tx_profile does not seem to make the 
> interface and code clearer.
> 
> > the user needs to set the packets and usecs in an educated way
> > (matching the mode).  
> 
> I think this is acceptable. In addition to the above reasons, users can 
> already set the cqe
> mode of tx and rx in user mode, which means that they have been educated.
> 
> > The kernel never changes the mode.  
> 
> Sorry, I don't seem to understand what this means.

Kernel never changes the mode for its own reasons.
Only user can change the mode.
So we don't have to always have both CQE and EQE mode tables ready.
If the kernel changed the mode for some reason we'd have to get both
from the user, in case kernel wanted to switch.

> >>   /**
> >>    * register_netdevice() - register a network device
> >>    * @dev: device to register
> >> @@ -10258,6 +10310,10 @@ int register_netdevice(struct net_device *dev)
> >>   	if (ret)
> >>   		return ret;
> >>   
> >> +	ret = dev_dim_profile_init(dev);
> >> +	if (ret)
> >> +		return ret;  
> > This is fine but the driver still has to manually do bunch of init:
> >
> > 		INIT_WORK(&vi->rq[i].dim.work, virtnet_rx_dim_work);
> > 		vi->rq[i].dim.mode = DIM_CQ_PERIOD_MODE_START_FROM_EQE;
> >
> > It'd be better to collect all this static info (flags, mode, work func)
> > in one place / struct, attached to netdev or netdev_ops. Then the
> > driver can call a helper like which only needs to take netdev and dim
> > as arguments.  
> 
> If mode is put into netdev, then use_cqe_mode_rx and use_cqe_mode_tx 
> need to be moved into netdev too.
> Then dim->mode is no longer required when dim obtain its results.
> We need to transform the way all drivers with dim currently behave.

Hopefully that won't be too hard.

> But why should work be put into netdev?
> The driver still requires a for loop to initialize dim work for a 
> driver-specific queues.

It's easier to refactor and extend the API when there's an explicit
call done to the core by the driver, rather than driver poking values
into random fields of the core structure.

> >> + * coalesce_put_profile - fill reply with a nla nest with four child nla nests.
> >> + * @skb: socket buffer the message is stored in
> >> + * @attr_type: nest attr type ETHTOOL_A_COALESCE_*X_*QE_PROFILE
> >> + * @profile: data passed to userspace
> >> + * @supported_params: modifiable parameters supported by the driver
> >> + *
> >> + * Put a dim profile nest attribute. Refer to ETHTOOL_A_MODERATIONS_MODERATION.
> >> + *
> >> + * Return: false to indicate successful placement or no placement, and
> >> + * true to pass the -EMSGSIZE error to the wrapper.  
> > Why the bool? Doesn't most of the similar code return the error?  
> 
> Its wrapper function ethnl_default_doit only recognizes the EMSGSIZE code.

Not sure what you mean.

> >> @@ -227,7 +315,19 @@ const struct nla_policy ethnl_coalesce_set_policy[] = {
> >>   	[ETHTOOL_A_COALESCE_TX_AGGR_MAX_BYTES] = { .type = NLA_U32 },
> >>   	[ETHTOOL_A_COALESCE_TX_AGGR_MAX_FRAMES] = { .type = NLA_U32 },
> >>   	[ETHTOOL_A_COALESCE_TX_AGGR_TIME_USECS] = { .type = NLA_U32 },
> >> +	[ETHTOOL_A_COALESCE_RX_EQE_PROFILE]     = { .type = NLA_NESTED },
> >> +	[ETHTOOL_A_COALESCE_RX_CQE_PROFILE]     = { .type = NLA_NESTED },
> >> +	[ETHTOOL_A_COALESCE_TX_EQE_PROFILE]     = { .type = NLA_NESTED },
> >> +	[ETHTOOL_A_COALESCE_TX_CQE_PROFILE]     = { .type = NLA_NESTED },  
> > NLA_POLICY_NESTED(coalesce_set_profile_policy)  
> 
> This doesn't work because the first level of sub-nesting of
> ETHTOOL_A_COALESCE_RX_CQE_PROFILE is ETHTOOL_A_PROFILE_IRQ_MODERATION.

So declare a policy for IRQ_MODERATION which has one entry -> nested
profile policy?

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ