[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CADHa2dBN+_9H-_L19Rg3vfvzC1N5FzG+rzW9Pff8pyM3KL9Ocw@mail.gmail.com>
Date: Mon, 9 Feb 2026 11:03:07 +0800
From: Yan Yan <evitayan@...gle.com>
To: antony.antony@...unet.com, Steffen Klassert <steffen.klassert@...unet.com>, paul@...ats.ca
Cc: netdev@...r.kernel.org, Herbert Xu <herbert@...dor.apana.org.au>,
"David S . Miller" <davem@...emloft.net>, Eric Dumazet <edumazet@...gle.com>,
Jakub Kicinski <kuba@...nel.org>, pabeni@...hat.com, horms@...nel.org,
saakashkumar@...vell.com, akamluddin@...vell.com,
Nathan Harold <nharold@...gle.com>, greg@...ah.com
Subject: Re: [REGRESSION] Discussion on "xfrm: Duplicate SPI Handling"
Thank you all for the reply! Please see my response inline below
> This is meant for multicast SAs, we don't support multicast SAs.
I reread the RFC and that makes sense to me. Thanks for the explanation.
> To clarify, how are larval outbound SAs created in Android?
> Are they created with zero or non-zero SPIs? Multiple zero SPIs (acquire states) are allowed.
In Android, larval outbound SAs are created with non-zero SPIs because
Android userspace uses XFRM_MSG_ALLOCSPI to manually reserve a unique
value before the SA is fully finalized. These are not kernel-generated
"acquire" states.
Regarding the default behavior:
I'm fine with defaulting to RFC 4301 for the upstream kernel to align
with the modern standard. As long as a toggle exists to fallback to
the legacy RFC 2401 behavior, Android’s compatibility requirements are
met.
On Mon, Feb 2, 2026 at 10:27 PM Antony Antony <antony.antony@...unet.com> wrote:
>
> Hi Yan,
>
> On Wed, Jan 28, 2026 at 16:43:06 +0800, Yan Yan wrote:
> > Hi all,
> >
> > I am writing because unfortunately commit: 94f39804d891 ("xfrm:
> > Duplicate SPI Handling") has caused a regression in the Android OS, so
> > we would like to gain some context to help determine next steps. The
> > issue is caused by the new requirement for global SPI uniqueness
> > during allocation. Based on our review of RFC 4301 and the previous
> > review history, we would like to highlight a few concerns:
> >
> > 1. Regression on Android:
> > This change breaks production behavior in Android, which uses
> > XFRM_MSG_ALLOCSPI to create larval SAs for both directions. Since RFC
> > 4301 allows duplicate outbound SPIs, and larval SAs are often created
>
> To clarify, how are larval outbound SAs created in Android?
> Are they created with zero or non-zero SPIs? Multiple zero SPIs (acquire states) are allowed. However, I guess there is no user API for managing these acquire states. Only the kernel can create them and handle expiration or deletion via SA updates with acq_req?
>
> A user API (UAPI) for creating and deleting acquire states might be a possible solution.
>
> I haven’t been able to consistently reproduce the issue Marvell reported,
> but I suspect the bug could also affect outbound SAs with non-zero SPIs.
> Also when one peer is behind NAT.
>
> I wonder wouldn't duplicate SPI when behind NAT cause issues for output SAs?
> Because the triplet is SPI, Protocol, Daddr. There is no dport in it.
>
> > before direction or selectors are finalized, the allocator must remain
> > permissive (at least in our current design).
> > This also aligns with a concern Herbert Xu raised during the initial
> > review regarding compatibility:
> > > "It's also dangerous to unilaterally do this since existing deployments
> > > could rely on the old behaviour. You'd need to add a toggle for
> > > compatibility."
> >
> > 2. Inbound SPI uniqueness should not be a hard requirement:
> > The justification for enforcing global SPI uniqueness often cites the
> > statement in RFC 4301, Section 4.1, that for unicast traffic, the SPI
> > "by itself suffices to specify an SA." However, we don’t think this
> > means inbound SPI uniqueness is a hard requirement because of the two
> > following reasons:
> >
> > – Another statement implies that SPI uniqueness is just an
> > implementation choice:
> > > "Each entry in the SA Database (SAD) MUST
> > > indicate whether the SA lookup makes use of the destination IP address, or the
> > > destination and source IP addresses, in addition to the SPI."
> >
> > – There is a "Longest Match" mandate which makes SPI uniqueness unnecessary:
> > > "For each inbound, IPsec-protected packet, an implementation MUST
> > > conduct its search of the SAD such that it finds the entry that
> > > matches the 'longest' SA identifier. In this context, if two or more
> > > SAD entries match based on the SPI value, then the entry that also
> > > matches based on destination address... is the 'longest' match."
> >
> > 3. Further clarification on the specific problem being addressed would
> > be helpful. The "real-world" problem this commit intends to fix
> > remains unclear. The patch mentions:
> > > "This behavior causes inconsistencies during SPI lookups for inbound packets.
> > > Since the lookup may return an arbitrary SA among those with the same SPI,
> > > packet processing can fail, resulting in packet drops."
> >
> > However, Linux kernel lookups using the triplet (SPI, Protocol, Daddr)
> > are deterministic. The lookup will not return an "arbitrary" SA
> > because the destination address is used to disambiguate the state.
> >
> > As Antony suggested, this change may cater to SPI-only hardware
> > indexing. If that is the case, we are concerned about applying such
> > hardware-specific limits to the software stack, especially if the
> > behavior is not opt-in, as it appears to require an overly-narrow
> > reading of the RFC 4301.
>
> I agree with your suggestion that making the behavior opt-in.
> I would prefer the Default : to allow duplicate.
>
> > Given these concerns, would it be possible to discuss a revert or,
> > alternatively, could further context be provided regarding the
> > specific real-world problem this commit was intended to address? Once
> > the underlying issue is clearly defined, we can work together to find
> > a backward-compatible solution that satisfies all requirements.
> >
> > Review threads are attached for easy reference:
> > https://lore.kernel.org/netdev/aDQhZ_ikHEt_pLn_@gondor.apana.org.au/T/#r45c1786651ce5af730f757aca7438474d494a323
> > https://lore.kernel.org/netdev/20250616100621.837472-1-saakashkumar@marvell.com/T/#u
>
> -antony
--
--
Best,
Yan
Powered by blists - more mailing lists