lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20250725154515.0bff0c4d@kernel.org>
Date: Fri, 25 Jul 2025 15:45:15 -0700
From: Jakub Kicinski <kuba@...nel.org>
To: Krishna Kumar <krikku@...il.com>
Cc: netdev@...r.kernel.org, davem@...emloft.net, edumazet@...gle.com,
 tom@...bertland.com, pabeni@...hat.com, horms@...nel.org, sdf@...ichev.me,
 kuniyu@...gle.com, ahmed.zaki@...el.com, aleksander.lobakin@...el.com,
 atenart@...nel.org, krishna.ku@...pkart.com
Subject: Re: [PATCH v6 net-next 1/2] net: Prevent RPS table overwrite for
 active flows

On Wed, 23 Jul 2025 11:46:03 +0530 Krishna Kumar wrote:
> +#ifdef CONFIG_RFS_ACCEL
> +/**
> + * rps_flow_is_active - check whether the flow is recently active.
> + * @rflow: Specific flow to check activity.
> + * @flow_table: Check activity against the flow_table's size.
> + * @cpu: CPU saved in @rflow.
> + *
> + * If the CPU has processed many packets since the flow's last activity
> + * (beyond 10 times the table size), the flow is considered stale.
> + *
> + * Return: true if flow was recently active.
> + */
> +static bool rps_flow_is_active(struct rps_dev_flow *rflow,
> +			       struct rps_dev_flow_table *flow_table,
> +			       unsigned int cpu)
> +{
> +	return cpu < nr_cpu_ids &&
> +	       ((int)(READ_ONCE(per_cpu(softnet_data, cpu).input_queue_head) -
> +		READ_ONCE(rflow->last_qtail)) < (int)(10 << flow_table->log));

Since you're factoring this condition out you could split it up a bit.
Save the result of at least that READ_ONCE() that takes up an entire
line to a temporary variable?

> +}
> +#endif
> +
>  static struct rps_dev_flow *
>  set_rps_cpu(struct net_device *dev, struct sk_buff *skb,
>  	    struct rps_dev_flow *rflow, u16 next_cpu)
> @@ -4847,8 +4869,11 @@ set_rps_cpu(struct net_device *dev, struct sk_buff *skb,
>  		struct netdev_rx_queue *rxqueue;
>  		struct rps_dev_flow_table *flow_table;
>  		struct rps_dev_flow *old_rflow;
> +		struct rps_dev_flow *tmp_rflow;
> +		unsigned int tmp_cpu;
>  		u16 rxq_index;
>  		u32 flow_id;
> +		u32 hash;
>  		int rc;
>  
>  		/* Should we steer this flow to a different hardware queue? */
> @@ -4863,14 +4888,58 @@ set_rps_cpu(struct net_device *dev, struct sk_buff *skb,
>  		flow_table = rcu_dereference(rxqueue->rps_flow_table);
>  		if (!flow_table)
>  			goto out;
> -		flow_id = rfs_slot(skb_get_hash(skb), flow_table);
> +
> +		hash = skb_get_hash(skb);
> +		flow_id = rfs_slot(hash, flow_table);
> +
> +		tmp_rflow = &flow_table->flows[flow_id];
> +		tmp_cpu = READ_ONCE(tmp_rflow->cpu);
> +
> +		/* Make sure this slot is usable before enabling steer */

This comment is less clear than the code..

> +		if (READ_ONCE(tmp_rflow->filter) != RPS_NO_FILTER) {
> +			/* This slot has an entry */
> +			if (rps_flow_is_active(tmp_rflow, flow_table,
> +					       tmp_cpu)) {
> +				/*
> +				 * This slot has an active "programmed" flow.
> +				 * Break out if the cached value stored is for
> +				 * a different flow, or (for our flow) the
> +				 * rx-queue# did not change.
> +				 */

This just restates what the code does

> +				if (hash != READ_ONCE(tmp_rflow->hash) ||
> +				    next_cpu == tmp_cpu) {
> +					/*
> +					 * Don't unnecessarily reprogram if:
> +					 * 1. This slot has an active different
> +					 *    flow.
> +					 * 2. This slot has the same flow (very
> +					 *    likely but not guaranteed) and
> +					 *    the rx-queue# did not change.
> +					 */

Again, over-commenting, the logic is clear enough.

> +					goto out;
> +				}
> +			}
> +
> +			/*
> +			 * When we overwrite the flow, the driver still has
> +			 * the cached entry. But drivers will check if the
> +			 * flow is active and rps_may_expire_entry() will
> +			 * return False and driver will delete it soon. Hence
> +			 * inconsistency between kernel & driver is quickly
> +			 * resolved.
> +			 */

I don't get this comment either, and rps_may_expire_entry() does not
exist in my tree.
-- 
pw-bot: cr

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ