Message-ID: <CANn89iLXMK0edE0xdZgD9aqLkGr32tOjOyyHKAnBbgCYhZt+jw@mail.gmail.com>
Date: Mon, 21 Jul 2025 01:21:00 -0700
From: Eric Dumazet <edumazet@...gle.com>
To: Krishna Kumar <krikku@...il.com>
Cc: netdev@...r.kernel.org, davem@...emloft.net, tom@...bertland.com, 
	kuba@...nel.org, pabeni@...hat.com, horms@...nel.org, sdf@...ichev.me, 
	kuniyu@...gle.com, ahmed.zaki@...el.com, aleksander.lobakin@...el.com, 
	atenart@...nel.org, jdamato@...tly.com, krishna.ku@...pkart.com
Subject: Re: [PATCH v5 net-next 1/2] net: Prevent RPS table overwrite for
 active flows

On Sun, Jul 20, 2025 at 8:16 PM Krishna Kumar <krikku@...il.com> wrote:
>
> This patch fixes an issue where two different flows on the same RX
> queue produce the same hash, resulting in continuous flow overwrites.
>
> Flow #1: A packet for Flow #1 comes in, kernel calls the steering
>          function. The driver gives back a filter id. The kernel saves
>          this filter id in the selected slot. Later, the driver's
>          service task checks if any filters have expired and then
>          installs the rule for Flow #1.
> Flow #2: A packet for Flow #2 comes in. It goes through the same steps.
>          But this time, the chosen slot is being used by Flow #1. The
>          driver gives a new filter id and the kernel saves it in the
>          same slot. When the driver's service task runs, it walks
>          all the flows and checks whether Flow #1 should expire; the
>          kernel returns true because the slot now holds a different
>          filter id, so the driver installs the rule for Flow #2.
> Flow #1: Another packet for Flow #1 comes in. The same thing repeats.
>          The slot is overwritten with a new filter id for Flow #1.
>
> This causes a repeated cycle of flow programming for missed packets,
> wasting CPU cycles while not improving performance. This problem happens
> at higher rates when the RPS table is small, but tests show it still
> happens even with 12,000 connections and an RPS size of 16K per queue
> (global table size = 144 queues x 16K = 2.25M entries).
>
>
> Reported-by: kernel test robot <lkp@...el.com>
> Closes: https://lore.kernel.org/oe-kbuild-all/202507161125.rUCoz9ov-lkp@intel.com/
> Signed-off-by: Krishna Kumar <krikku@...il.com>
> ---
>  include/net/rps.h    |  5 +--
>  net/core/dev.c       | 82 ++++++++++++++++++++++++++++++++++++++++----
>  net/core/net-sysfs.c |  4 ++-
>  3 files changed, 81 insertions(+), 10 deletions(-)
>
> diff --git a/include/net/rps.h b/include/net/rps.h
> index d8ab3a08bcc4..8e33dbea9327 100644
> --- a/include/net/rps.h
> +++ b/include/net/rps.h
> @@ -25,13 +25,14 @@ struct rps_map {
>

> +#ifdef CONFIG_RFS_ACCEL
> +/**
> + * rps_flow_is_active - check whether a flow was recently active.
> + * @rflow: Flow to check for recent activity.
> + * @flow_table: Flow table the flow belongs to; its size bounds the
> + *     activity window.
> + * @cpu: CPU saved in @rflow.
> + *
> + * If the CPU has processed many packets since the flow's last activity
> + * (beyond 10 times the table size), the flow is considered stale.
> + *
> + * Return: true if the flow was recently active.
> + */
> +static bool rps_flow_is_active(struct rps_dev_flow *rflow,
> +                              struct rps_dev_flow_table *flow_table,
> +                              unsigned int cpu)
> +{
> +       return cpu < nr_cpu_ids &&
> +              ((int)(READ_ONCE(per_cpu(softnet_data, cpu).input_queue_head) -
> +               READ_ONCE(rflow->last_qtail)) < (int)(10 << flow_table->log));
> +}
> +#endif
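
To make the changelog's overwrite cycle concrete: the table slot is
chosen purely by hash & mask, so two distinct connections can land in
the same slot. A minimal standalone sketch (toy hash values, not the
kernel's flow hash):

#include <stdint.h>
#include <stdio.h>

int main(void)
{
	/* Per-queue flow table with 16 slots; slot = hash & mask,
	 * as in rps_dev_flow_table (mask = table size - 1). */
	uint32_t mask = 16 - 1;
	uint32_t hash1 = 0x0231;	/* flow #1, toy value */
	uint32_t hash2 = 0x5fa1;	/* flow #2, toy value */

	printf("flow #1 -> slot %u\n", hash1 & mask);	/* slot 1 */
	printf("flow #2 -> slot %u\n", hash2 & mask);	/* slot 1 */
	/* Sharing a slot, each flow's next packet overwrites the
	 * other's filter id, so the driver keeps expiring and
	 * reprogramming the rule. */
	return 0;
}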

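The activity test itself is a wrap-safe distance check: both
input_queue_head and last_qtail are u32 counters that can wrap, so the
patch compares their signed difference against 10x the table size
instead of comparing the raw values. The same idiom in a standalone
userspace sketch (made-up values):

#include <stdint.h>
#include <stdio.h>

/* Mirrors the patch's arithmetic: the distance between two wrapping
 * u32 counters is taken as a signed difference. */
static int recently_active(uint32_t queue_head, uint32_t last_qtail,
			   unsigned int log)
{
	return (int32_t)(queue_head - last_qtail) < (int32_t)(10u << log);
}

int main(void)
{
	unsigned int log = 14;		/* 16K-entry table */
	uint32_t last = 0xffffff00u;	/* flow last seen just before wrap */

	/* Distance is 0x200 even across the wrap: still active (1). */
	printf("%d\n", recently_active(0x00000100u, last, log));
	/* One step past 10 * table size: stale (0). */
	printf("%d\n", recently_active(last + (10u << log) + 1, last, log));
	return 0;
}
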
This notion of an active flow is kind of weird.
It might be time to make it less obscure, less expensive, and
time-deterministic (jiffies?).
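
A time-based variant along those lines could stamp the flow with
jiffies on each matching packet and test against a fixed window; a
hypothetical sketch (the last_active field and the one-second window
are invented for illustration, nothing like this is in the patch):

#include <linux/jiffies.h>
#include <linux/compiler.h>
#include <linux/types.h>

/* Hypothetical: suppose the flow entry grew a timestamp that is
 * updated on every enqueue for the flow. */
struct rps_dev_flow_ts {
	unsigned long last_active;	/* jiffies of last packet */
};

/* Deterministic in time: active iff a packet arrived within the last
 * second, independent of how busy the CPU's backlog counter is. */
static bool rps_flow_is_active_ts(const struct rps_dev_flow_ts *rflow)
{
	return time_before(jiffies, READ_ONCE(rflow->last_active) + HZ);
}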
