Message-ID: <20240725111912.7bc17cf6@kernel.org>
Date: Thu, 25 Jul 2024 11:19:12 -0700
From: Jakub Kicinski <kuba@...nel.org>
To: Michael Chan <michael.chan@...adcom.com>
Cc: davem@...emloft.net, netdev@...r.kernel.org, edumazet@...gle.com,
pabeni@...hat.com, pavan.chebbi@...adcom.com,
andrew.gospodarek@...adcom.com
Subject: Re: [PATCH] bnxt_en: Fix RSS logic in __bnxt_reserve_rings()
On Wed, 24 Jul 2024 17:25:36 -0700 Jakub Kicinski wrote:
> On Wed, 24 Jul 2024 15:21:06 -0700 Michael Chan wrote:
> > Now, with RSS contexts support, if the user has added or deleted RSS
> > contexts, we may now enter this path to reserve the new number of VNICs.
> > However, netif_is_rxfh_configured() will not return the correct state if
> > we are still in the middle of set_rxfh(). So the existing code may
> > set the indirection table of the default RSS context to default by
> > mistake.
>
> I feel like my explanation was more clear :S
>
> The key point is that ethtool::set_rxfh() calls the "reload" functions
> and expects the scope of the "reload" to be quite narrow, because only
> the RSS table has changed. Unfortunately the add / delete of additional
> contexts de-sync the resource counts, so ethtool::set_rxfh() now ends
> up "reloading" more than it intended. The "more than intended" includes
> going down the RSS indir reset path, which calls netif_is_rxfh_configured().
> Return value from netif_is_rxfh_configured() during ethtool::set_rxfh()
> is undefined.
>
> Reported tag would have been nice too..
Reported-and-tested-by: Jakub Kicinski <kuba@...nel.org>
Link: https://lore.kernel.org/20240625010210.2002310-1-kuba@kernel.org
There's one more problem. It looks like changing the queue count discards
existing ntuple filters:
# Check| At /root/./ksft/drivers/net/hw/rss_ctx.py, line 387, in test_rss_context_queue_reconfigure:
# Check| test_rss_queue_reconfigure(cfg, main_ctx=False)
# Check| At /root/./ksft/drivers/net/hw/rss_ctx.py, line 230, in test_rss_queue_reconfigure:
# Check| _send_traffic_check(cfg, port, ctx_ref, { 'target': (0, 3),
# Check| At /root/./ksft/drivers/net/hw/rss_ctx.py, line 92, in _send_traffic_check:
# Check| ksft_lt(sum(cnts[i] for i in params['noise']), directed / 2,
# Check failed 1045235 >= 405823.5 traffic on other queues (context 1)':[460068, 351995, 565970, 351579, 127270]
# Exception while handling defer / cleanup (callback 1 of 3)!
# Defer Exception| Traceback (most recent call last):
# Defer Exception| File "/root/ksft/net/lib/py/ksft.py", line 129, in ksft_flush_defer
# Defer Exception| entry.exec_only()
# Defer Exception| File "/root/ksft/net/lib/py/utils.py", line 93, in exec_only
# Defer Exception| self.func(*self.args, **self.kwargs)
# Defer Exception| File "/root/ksft/net/lib/py/utils.py", line 121, in ethtool
# Defer Exception| return tool('ethtool', args, json=json, ns=ns, host=host)
# Defer Exception| File "/root/ksft/net/lib/py/utils.py", line 108, in tool
# Defer Exception| cmd_obj = cmd(cmd_str, ns=ns, host=host)
# Defer Exception| File "/root/ksft/net/lib/py/utils.py", line 32, in __init__
# Defer Exception| self.process(terminate=False, fail=fail, timeout=timeout)
# Defer Exception| File "/root/ksft/net/lib/py/utils.py", line 50, in process
# Defer Exception| raise CmdExitFailure("Command failed: %s\nSTDOUT: %s\nSTDERR: %s" %
# Defer Exception| net.lib.py.utils.CmdExitFailure: Command failed: ethtool -N eth0 delete 0
# Defer Exception| STDOUT: b''
# Defer Exception| STDERR: b'rmgr: Cannot delete RX class rule: No such file or directory\nCannot delete classification rule\n'
not ok 8 rss_ctx.test_rss_context_queue_reconfigure
This is from the following chunk of the test:
225     # We should be able to increase queues, but table should be left untouched
226     ethtool(f"-L {cfg.ifname} combined 5")
227     data = get_rss(cfg, context=ctx_id)
228     ksft_eq({0, 3}, set(data['rss-indirection-table']))
229
230     _send_traffic_check(cfg, port, ctx_ref, { 'target': (0, 3),
231                                               other_key: (1, 2, 4) })
The Check failure tells us the traffic was sprayed across all queues
instead of being steered to queues 0 and 3.
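For reference, the numbers in the Check failure line can be reproduced from
the per-queue counts. A minimal sketch, assuming the counts are indexed by
queue number and that "directed" is the sum over the target queues; the
helper name check_steering is made up for illustration and is not part of
the selftest:

```python
# Per-queue packet counts copied from the "Check failed" line of the log.
cnts = [460068, 351995, 565970, 351579, 127270]

# Queue sets from the test chunk: the context should steer to queues 0 and 3,
# everything else counts as noise.
target = (0, 3)
noise = (1, 2, 4)


def check_steering(cnts, target, noise):
    """Hypothetical helper mirroring the ksft_lt() condition in the traceback.

    Returns (ok, stray, limit), where ok is True only if the traffic seen on
    the noise queues is below half of the traffic seen on the target queues.
    """
    directed = sum(cnts[i] for i in target)
    stray = sum(cnts[i] for i in noise)
    limit = directed / 2
    return stray < limit, stray, limit


ok, stray, limit = check_steering(cnts, target, noise)
print(ok, stray, limit)  # False 1045235 405823.5
```

The noise queues carry more than the target queues combined, so the filter
pointing at context 1 is not having any effect.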
The Defer Exception, well, self-explanatory:
"Cannot delete RX class rule: No such file or directory"