lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20240608090150.GR27689@kernel.org>
Date: Sat, 8 Jun 2024 10:01:50 +0100
From: Simon Horman <horms@...nel.org>
To: Petr Machata <petrm@...dia.com>
Cc: "David S. Miller" <davem@...emloft.net>,
	Eric Dumazet <edumazet@...gle.com>,
	Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
	netdev@...r.kernel.org, Amit Cohen <amcohen@...dia.com>,
	Ido Schimmel <idosch@...dia.com>, Jiri Pirko <jiri@...nulli.us>,
	Alexander Zubkov <green@...tor.net>, mlxsw@...dia.com
Subject: Re: [PATCH net 5/6] mlxsw: spectrum_acl_erp: Fix object nesting
 warning

On Thu, Jun 06, 2024 at 04:49:42PM +0200, Petr Machata wrote:
> From: Ido Schimmel <idosch@...dia.com>
> 
> ACLs in Spectrum-2 and newer ASICs can reside in the algorithmic TCAM
> (A-TCAM) or in the ordinary circuit TCAM (C-TCAM). The former can
> contain more ACLs (i.e., tc filters), but the number of masks in each
> region (i.e., tc chain) is limited.
> 
> In order to mitigate the effects of the above limitation, the device
> allows filters to share a single mask if their masks only differ in up
> to 8 consecutive bits. For example, dst_ip/25 can be represented using
> dst_ip/24 with a delta of 1 bit. The C-TCAM does not have a limit on the
> number of masks being used (and therefore does not support mask
> aggregation), but can contain a limited number of filters.
> 
> The driver uses the "objagg" library to perform the mask aggregation by
> passing it objects that consist of the filter's mask and whether the
> filter is to be inserted into the A-TCAM or the C-TCAM since filters in
> different TCAMs cannot share a mask.
> 
> The set of created objects is dependent on the insertion order of the
> filters and is not necessarily optimal. Therefore, the driver will
> periodically ask the library to compute a more optimal set ("hints") by
> looking at all the existing objects.
> 
> When the library asks the driver whether two objects can be aggregated
> the driver only compares the provided masks and ignores the A-TCAM /
> C-TCAM indication. This is the right thing to do since the goal is to
> move as many filters as possible to the A-TCAM. The driver also forbids
> two identical masks from being aggregated since this can only happen if
> one was intentionally put in the C-TCAM to avoid a conflict in the
> A-TCAM.
> 
> The above can result in the following set of hints:
> 
> H1: {mask X, A-TCAM} -> H2: {mask Y, A-TCAM} // X is Y + delta
> H3: {mask Y, C-TCAM} -> H4: {mask Z, A-TCAM} // Y is Z + delta
> 
> After getting the hints from the library the driver will start migrating
> filters from one region to another while consulting the computed hints
> and instructing the device to perform a lookup in both regions during
> the transition.
> 
> Assuming a filter with mask X is being migrated into the A-TCAM in the
> new region, the hints lookup will return H1. Since H2 is the parent of
> H1, the library will try to find the object associated with it and
> create it if necessary in which case another hints lookup (recursive)
> will be performed. This hints lookup for {mask Y, A-TCAM} will either
> return H2 or H3 since the driver passes the library an object comparison
> function that ignores the A-TCAM / C-TCAM indication.
> 
> This can eventually lead to nested objects which are not supported by
> the library [1].
> 
> Fix by removing the object comparison function from both the driver and
> the library as the driver was the only user. That way the lookup will
> only return exact matches.
> 
> I do not have a reliable reproducer that can reproduce the issue in a
> timely manner, but before the fix the issue would reproduce in several
> minutes and with the fix it does not reproduce in over an hour.
> 
> Note that the current usefulness of the hints is limited because they
> include the C-TCAM indication and represent aggregation that cannot
> actually happen. This will be addressed in net-next.
> 
> [1]
> WARNING: CPU: 0 PID: 153 at lib/objagg.c:170 objagg_obj_parent_assign+0xb5/0xd0
> Modules linked in:
> CPU: 0 PID: 153 Comm: kworker/0:18 Not tainted 6.9.0-rc6-custom-g70fbc2c1c38b #42
> Hardware name: Mellanox Technologies Ltd. MSN3700C/VMOD0008, BIOS 5.11 10/10/2018
> Workqueue: mlxsw_core mlxsw_sp_acl_tcam_vregion_rehash_work
> RIP: 0010:objagg_obj_parent_assign+0xb5/0xd0
> [...]
> Call Trace:
>  <TASK>
>  __objagg_obj_get+0x2bb/0x580
>  objagg_obj_get+0xe/0x80
>  mlxsw_sp_acl_erp_mask_get+0xb5/0xf0
>  mlxsw_sp_acl_atcam_entry_add+0xe8/0x3c0
>  mlxsw_sp_acl_tcam_entry_create+0x5e/0xa0
>  mlxsw_sp_acl_tcam_vchunk_migrate_one+0x16b/0x270
>  mlxsw_sp_acl_tcam_vregion_rehash_work+0xbe/0x510
>  process_one_work+0x151/0x370
> 
> Fixes: 9069a3817d82 ("lib: objagg: implement optimization hints assembly and use hints for object creation")
> Signed-off-by: Ido Schimmel <idosch@...dia.com>
> Reviewed-by: Amit Cohen <amcohen@...dia.com>
> Tested-by: Alexander Zubkov <green@...tor.net>
> Signed-off-by: Petr Machata <petrm@...dia.com>

Reviewed-by: Simon Horman <horms@...nel.org>


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ