lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ZqLEfyNLtCy25g6w@C02YVCJELVCG.dhcp.broadcom.net>
Date: Thu, 25 Jul 2024 17:33:10 -0400
From: Andy Gospodarek <andrew.gospodarek@...adcom.com>
To: Jakub Kicinski <kuba@...nel.org>
Cc: Michael Chan <michael.chan@...adcom.com>, davem@...emloft.net,
	netdev@...r.kernel.org, edumazet@...gle.com, pabeni@...hat.com,
	pavan.chebbi@...adcom.com, andrew.gospodarek@...adcom.com
Subject: Re: [PATCH] bnxt_en: Fix RSS logic in __bnxt_reserve_rings()

On Thu, Jul 25, 2024 at 11:19:12AM -0700, Jakub Kicinski wrote:
> On Wed, 24 Jul 2024 17:25:36 -0700 Jakub Kicinski wrote:
> > On Wed, 24 Jul 2024 15:21:06 -0700 Michael Chan wrote:
> > > Now, with RSS contexts support, if the user has added or deleted RSS
> > > contexts, we may now enter this path to reserve the new number of VNICs.
> > > However, netif_is_rxfh_configured() will not return the correct state if
> > > we are still in the middle of set_rxfh().  So the existing code may
> > > set the indirection table of the default RSS context to default by
> > > mistake.  
> > 
> > I feel like my explanation was more clear :S
> > 
> > The key point is that ethtool::set_rxfh() calls the "reload" functions
> > and expects the scope of the "reload" to be quite narrow, because only
> > the RSS table has changed. Unfortunately the add / delete of additional
> > contexts de-sync the resource counts, so ethtool::set_rxfh() now ends
> > up "reloading" more than it intended. The "more than intended" includes
> > going down the RSS indir reset path, which calls netif_is_rxfh_configured().
> > Return value from netif_is_rxfh_configured() during ethtool::set_rxfh()
> > is undefined.
> > 
> > Reported tag would have been nice too..
> 

Agreed.  Sorry that was missed.

> Reported-and-tested-by: Jakub Kicinski <kuba@...nel.org>
> Link: https://lore.kernel.org/20240625010210.2002310-1-kuba@kernel.org
> 
> There's one more problem. It looks like changing queue count discards
> existing ntuple filters:
> 
> # Check| At /root/./ksft/drivers/net/hw/rss_ctx.py, line 387, in test_rss_context_queue_reconfigure:
> # Check|     test_rss_queue_reconfigure(cfg, main_ctx=False)
> # Check| At /root/./ksft/drivers/net/hw/rss_ctx.py, line 230, in test_rss_queue_reconfigure:
> # Check|     _send_traffic_check(cfg, port, ctx_ref, { 'target': (0, 3),
> # Check| At /root/./ksft/drivers/net/hw/rss_ctx.py, line 92, in _send_traffic_check:
> # Check|     ksft_lt(sum(cnts[i] for i in params['noise']), directed / 2,
> # Check failed 1045235 >= 405823.5 traffic on other queues (context 1)':[460068, 351995, 565970, 351579, 127270]
> # Exception while handling defer / cleanup (callback 1 of 3)!
> # Defer Exception| Traceback (most recent call last):
> # Defer Exception|   File "/root/ksft/net/lib/py/ksft.py", line 129, in ksft_flush_defer
> # Defer Exception|     entry.exec_only()
> # Defer Exception|   File "/root/ksft/net/lib/py/utils.py", line 93, in exec_only
> # Defer Exception|     self.func(*self.args, **self.kwargs)
> # Defer Exception|   File "/root/ksft/net/lib/py/utils.py", line 121, in ethtool
> # Defer Exception|     return tool('ethtool', args, json=json, ns=ns, host=host)
> # Defer Exception|   File "/root/ksft/net/lib/py/utils.py", line 108, in tool
> # Defer Exception|     cmd_obj = cmd(cmd_str, ns=ns, host=host)
> # Defer Exception|   File "/root/ksft/net/lib/py/utils.py", line 32, in __init__
> # Defer Exception|     self.process(terminate=False, fail=fail, timeout=timeout)
> # Defer Exception|   File "/root/ksft/net/lib/py/utils.py", line 50, in process
> # Defer Exception|     raise CmdExitFailure("Command failed: %s\nSTDOUT: %s\nSTDERR: %s" %
> # Defer Exception| net.lib.py.utils.CmdExitFailure: Command failed: ethtool -N eth0 delete 0
> # Defer Exception| STDOUT: b''
> # Defer Exception| STDERR: b'rmgr: Cannot delete RX class rule: No such file or directory\nCannot delete classification rule\n'
> not ok 8 rss_ctx.test_rss_context_queue_reconfigure
> 
> This is from the following chunk of the test:
> 
>    225      # We should be able to increase queues, but table should be left untouched
>    226      ethtool(f"-L {cfg.ifname} combined 5")
>    227      data = get_rss(cfg, context=ctx_id)
>    228      ksft_eq({0, 3}, set(data['rss-indirection-table']))
>    229  
>    230      _send_traffic_check(cfg, port, ctx_ref, { 'target': (0, 3),
>    231                                                other_key: (1, 2, 4) })
> 
> The Check failure tells us the traffic was sprayed.
> The Defer Exception, well, self-explanatory: 
>   "Cannot delete RX class rule: No such file or directory"

We can take a look at that, but we currently do this on purpose.


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ