Message-ID: <20250725154515.0bff0c4d@kernel.org>
Date: Fri, 25 Jul 2025 15:45:15 -0700
From: Jakub Kicinski <kuba@...nel.org>
To: Krishna Kumar <krikku@...il.com>
Cc: netdev@...r.kernel.org, davem@...emloft.net, edumazet@...gle.com,
tom@...bertland.com, pabeni@...hat.com, horms@...nel.org, sdf@...ichev.me,
kuniyu@...gle.com, ahmed.zaki@...el.com, aleksander.lobakin@...el.com,
atenart@...nel.org, krishna.ku@...pkart.com
Subject: Re: [PATCH v6 net-next 1/2] net: Prevent RPS table overwrite for
active flows
On Wed, 23 Jul 2025 11:46:03 +0530 Krishna Kumar wrote:
> +#ifdef CONFIG_RFS_ACCEL
> +/**
> + * rps_flow_is_active - check whether the flow is recently active.
> + * @rflow: Specific flow to check activity.
> + * @flow_table: Check activity against the flow_table's size.
> + * @cpu: CPU saved in @rflow.
> + *
> + * If the CPU has processed many packets since the flow's last activity
> + * (beyond 10 times the table size), the flow is considered stale.
> + *
> + * Return: true if flow was recently active.
> + */
> +static bool rps_flow_is_active(struct rps_dev_flow *rflow,
> + struct rps_dev_flow_table *flow_table,
> + unsigned int cpu)
> +{
> + return cpu < nr_cpu_ids &&
> + ((int)(READ_ONCE(per_cpu(softnet_data, cpu).input_queue_head) -
> + READ_ONCE(rflow->last_qtail)) < (int)(10 << flow_table->log));
Since you're factoring this condition out, you could split it up a bit.
How about saving the result of at least that READ_ONCE() that takes up
an entire line to a temporary variable?
> +}
> +#endif
> +
> static struct rps_dev_flow *
> set_rps_cpu(struct net_device *dev, struct sk_buff *skb,
> struct rps_dev_flow *rflow, u16 next_cpu)
> @@ -4847,8 +4869,11 @@ set_rps_cpu(struct net_device *dev, struct sk_buff *skb,
> struct netdev_rx_queue *rxqueue;
> struct rps_dev_flow_table *flow_table;
> struct rps_dev_flow *old_rflow;
> + struct rps_dev_flow *tmp_rflow;
> + unsigned int tmp_cpu;
> u16 rxq_index;
> u32 flow_id;
> + u32 hash;
> int rc;
>
> /* Should we steer this flow to a different hardware queue? */
> @@ -4863,14 +4888,58 @@ set_rps_cpu(struct net_device *dev, struct sk_buff *skb,
> flow_table = rcu_dereference(rxqueue->rps_flow_table);
> if (!flow_table)
> goto out;
> - flow_id = rfs_slot(skb_get_hash(skb), flow_table);
> +
> + hash = skb_get_hash(skb);
> + flow_id = rfs_slot(hash, flow_table);
> +
> + tmp_rflow = &flow_table->flows[flow_id];
> + tmp_cpu = READ_ONCE(tmp_rflow->cpu);
> +
> + /* Make sure this slot is usable before enabling steer */
This comment is less clear than the code.
> + if (READ_ONCE(tmp_rflow->filter) != RPS_NO_FILTER) {
> + /* This slot has an entry */
> + if (rps_flow_is_active(tmp_rflow, flow_table,
> + tmp_cpu)) {
> + /*
> + * This slot has an active "programmed" flow.
> + * Break out if the cached value stored is for
> + * a different flow, or (for our flow) the
> + * rx-queue# did not change.
> + */
This just restates what the code does
> + if (hash != READ_ONCE(tmp_rflow->hash) ||
> + next_cpu == tmp_cpu) {
> + /*
> + * Don't unnecessarily reprogram if:
> + * 1. This slot has an active different
> + * flow.
> + * 2. This slot has the same flow (very
> + * likely but not guaranteed) and
> + * the rx-queue# did not change.
> + */
Again, over-commenting, the logic is clear enough.
> + goto out;
> + }
> + }
> +
> + /*
> + * When we overwrite the flow, the driver still has
> + * the cached entry. But drivers will check if the
> + * flow is active and rps_may_expire_entry() will
> + * return False and driver will delete it soon. Hence
> + * inconsistency between kernel & driver is quickly
> + * resolved.
> + */
I don't get this comment either, and rps_may_expire_entry() does not
exist in my tree.
--
pw-bot: cr