netdev - Re: [PATCH net-next v2 4/4] selftests: drv-net: rss_ctx: add tests for RSS configuration and contexts

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20240625065041.1c4cb856@kernel.org>
Date: Tue, 25 Jun 2024 06:50:41 -0700
From: Jakub Kicinski <kuba@...nel.org>
To: Petr Machata <petrm@...dia.com>
Cc: <davem@...emloft.net>, <netdev@...r.kernel.org>, <edumazet@...gle.com>,
 <pabeni@...hat.com>, <willemdebruijn.kernel@...il.com>,
 <ecree.xilinx@...il.com>, <dw@...idwei.uk>, <przemyslaw.kitszel@...el.com>,
 <michael.chan@...adcom.com>, <andrew.gospodarek@...adcom.com>
Subject: Re: [PATCH net-next v2 4/4] selftests: drv-net: rss_ctx: add tests
 for RSS configuration and contexts

On Tue, 25 Jun 2024 12:42:22 +0200 Petr Machata wrote:
> > +def test_rss_key_indir(cfg):
> > +    """
> > +    Test basics like updating the main RSS key and indirection table.
> > +    """
> > +    if len(_get_rx_cnts(cfg)) < 2:
> > +        KsftSkipEx("Device has only one queue (or doesn't support queue stats)")  
> 
> I'm not sure, is this admin-correctible configuration issue? It looks
> like this and some others should rather be XFAIL.

TBH I don't have a good compass on what should be XFAIL and what should
be SKIP in HW tests. Once vendors start running these we'll get more
experience (there's only one test using Xfail in HW now).
> > +    cnts = _get_rx_cnts(cfg)
> > +    GenerateTraffic(cfg).wait_pkts_and_stop(20000)
> > +    cnts = _get_rx_cnts(cfg, prev=cnts)
> > +    # First two queues get less traffic than all the rest
> > +    ksft_ge(sum(cnts[2:]), sum(cnts[:2]), "traffic distributed: " + str(cnts))  
> 
> Now that you added ksft_lt, this can be made to actually match the comment:
> 
>     ksft_lt(sum(cnts[:2]), sum(cnts[2:]), "traffic distributed: " + str(cnts))

ack

> > +    # Try to allocate more queues when necessary
> > +    qcnt = len(_get_rx_cnts(cfg))
> > +    if qcnt >= 2 + 2 * ctx_cnt:
> > +        qcnt = None
> > +    else:
> > +        try:
> > +            ksft_pr(f"Increasing queue count {qcnt} -> {2 + 2 * ctx_cnt}")
> > +            ethtool(f"-L {cfg.ifname} combined {2 + 2 * ctx_cnt}")
> > +        except:
> > +            raise KsftSkipEx("Not enough queues for the test")  
> 
> There are variations on this in each of the three tests. It would make
> sense to extract to a helper, or perhaps even write as a context
> manager. Untested:
> 
> class require_contexts:
>     def __init__(self, cfg, count):
>         self._cfg = cfg
>         self._count = count
>         self._qcnt = None
> 
>     def __enter__(self):
>         qcnt = len(_get_rx_cnts(self._cfg))
>         if qcnt >= self._count:
>             return
>         try:
>             ksft_pr(f"Increasing queue count {qcnt} -> {self._count}")
>             ethtool(f"-L {self._cfg.ifname} combined {self._count}")
>             self._qcnt = qcnt
>         except:
>             raise KsftSkipEx("Not enough queues for the test")
> 
>     def __exit__(self, exc_type, exc_value, traceback):
>         if self._qcnt is not None:
>             ethtool(f"-L {self._cfg.ifname} combined {self._qcnt}")
> 
> Then:
> 
>     with require_contexts(cfg, 2 + 2 * ctx_cnt):
>         ...

There are 4 things to clean up, with doesn't cover all of them
naturally and complicates the code.

Once again, I'm thinking about adding some form of deferred execution.
	
	ethtool(f"-L {self._cfg.ifname} combined {self._qcnt}")
	undo(ethtool, f"-L {self._cfg.ifname} combined {old_qcnt}")

Where cleanups will be executed in reverse order by ksft_run() after
the test, with the option to delete them.

	nid = ethtool_create(cfg, "-N", flow)
	ntuple = undo(ethtool, f"-N {cfg.ifname} delete {nid}")
	# .. code using ntuple ...
	ntuple.exec()
	# .. now ntuple is gone

or/and:

	nid = ethtool_create(cfg, "-N", flow)
	with undo(ethtool, f"-N {cfg.ifname} delete {nid}"):
		# .. code using ntuple ...
	# .. now ntuple is gone

Thoughts?

> > +        # Remove last context
> > +        remove_ctx(-1)
> > +        check_traffic()  
> 
> I feel like this works by luck, the removed[] indices are straight,
> whereas the ntuple[] and ctx_id[] ones shift around as elements are
> removed from the array. I don't mind terribly much, it's not very likely
> that somebody adds more legs into this test, but IMHO it would be
> cleaner instead of deleting, to replace the deleted element with a None.
> And then deleted[] is not even needed. (But then the loops below need an
> extra condition.)

I'd argue with the "by luck" but I see your point that it's suspicious,
I'll change it :)