netdev - Discussion: Potential Hardening Ideas for ICMP Error Handling

Message-Id: <20251109134600.292125-1-zhaoyz24@mails.tsinghua.edu.cn>
Date: Sun,  9 Nov 2025 21:46:00 +0800
From: Yizhou Zhao <zhaoyz24@...ls.tsinghua.edu.cn>
To: netdev@...r.kernel.org
Cc: davem@...emloft.net,
	dsahern@...nel.org,
	edumazet@...gle.com,
	kuba@...nel.org,
	pabeni@...hat.com,
	horms@...nel.org
Subject: Discussion: Potential Hardening Ideas for ICMP Error Handling

Dear netdev maintainers,

We previously reported several ICMP error validation issues via
security@...nel.org. These issues allow attackers to spoof ICMP error
packets and modify FNHE (next-hop exception) caches without strong
validation. Because the protocols involved, such as ICMP and UDP, are
stateless, it is inherently difficult to propose a complete or
definitive fix. However, in certain deployment scenarios these
weaknesses can still be exploited in practice, for example by polluting
routing or PMTU caches (leading to unintended fragmentation behavior or
route changes) or by leveraging side channels to infer additional
information.

Based on earlier discussions, we would like to share several potential
hardening ideas with the list for broader consideration.

**1. Handling of embedded ICMP packets in ICMP Fragmentation Needed**

From earlier discussions, we revisited how ICMP Fragmentation Needed /
Packet Too Big messages embed an inner ICMP packet (most commonly Echo
Request/Reply). An Echo Request may legitimately carry a payload for
PMTU probing and should continue to be handled accordingly. Other ICMP
types, including Echo Reply, are either too short to exceed the MTU or
are generated passively; they are not used for PMTU discovery, so
embedded packets of those types should not trigger PMTU updates.

In testing, we also noticed that Linux currently validates an embedded
ping packet in Fragmentation Needed messages primarily by checking the
16-bit identifier. **Without correlating additional context (such as the
destination address of the original flow or the expected packet length)**,
this check can be ambiguous and may allow cache updates based on
insufficiently validated inputs.

One possible hardening direction is to require stronger correlation for
PMTU updates derived from embedded ICMP packets — for example, verifying
the original destination address or additional fields beyond the short
identifier — and ignoring embedded ICMP types that are never used for
PMTU probing.
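
As a rough illustration of the stronger correlation described above, the
following user-space sketch accepts a PMTU update only when the quoted
packet is an Echo Request whose identifier, destination address, and
length all match a probe the host actually sent. The structures and the
function name are hypothetical and are not existing kernel code; all
fields are assumed to be in host byte order.

```c
#include <stdbool.h>
#include <stdint.h>

#define ICMP_TYPE_ECHO_REPLY   0
#define ICMP_TYPE_ECHO_REQUEST 8

/* Hypothetical record of an Echo Request this host actually sent. */
struct sent_echo {
	uint32_t daddr;   /* destination address of the original probe */
	uint16_t id;      /* ICMP echo identifier */
	uint16_t tot_len; /* total IP length of the original probe */
};

/* Hypothetical summary of the packet quoted inside the ICMP error. */
struct quoted_icmp {
	uint32_t daddr;
	uint8_t  type;
	uint16_t id;
	uint16_t tot_len;
};

/* Return true only if the quoted packet is strongly tied to our probe. */
static bool pmtu_update_allowed(const struct sent_echo *sent,
				const struct quoted_icmp *q,
				uint16_t reported_mtu)
{
	/* Only Echo Request is used for PMTU probing. */
	if (q->type != ICMP_TYPE_ECHO_REQUEST)
		return false;
	/* The 16-bit identifier alone is ambiguous, so check more fields. */
	if (q->id != sent->id)
		return false;
	if (q->daddr != sent->daddr)
		return false;
	if (q->tot_len != sent->tot_len)
		return false;
	/* A genuine error implies the probe was larger than the new MTU. */
	if (q->tot_len <= reported_mtu)
		return false;
	return true;
}
```

With this policy, an embedded Echo Reply, or an Echo Request quoting a
different destination, would be ignored rather than updating the cache.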

**2. Deliver ICMP Errors only to connected (private) UDP sockets.**

Requiring a socket to be connected (peer 5-tuple known) forces an attacker
to first infer the peer port/address, raising the bar for off-path 
injection while preserving normal connected UDP use (DNS clients, RTP,
etc.).
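
A minimal sketch of the matching rule this idea implies, assuming a
simplified view of a UDP socket and of the UDP header quoted in the ICMP
error. The types and the function name are hypothetical, not kernel
APIs; values are assumed to be in host byte order.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified view of a UDP socket. */
struct udp_sock_view {
	bool     connected;    /* true once connect() has fixed the peer */
	uint32_t laddr, raddr; /* local and remote addresses */
	uint16_t lport, rport; /* local and remote ports */
};

/* Hypothetical summary of the UDP packet quoted inside the ICMP error. */
struct quoted_udp {
	uint32_t saddr, daddr;
	uint16_t sport, dport;
};

/*
 * Deliver the error only to a connected socket, and only if the quoted
 * packet looks like one this socket sent to its peer.
 */
static bool udp_icmp_err_deliverable(const struct udp_sock_view *sk,
				     const struct quoted_udp *q)
{
	if (!sk->connected)
		return false; /* unconnected sockets never receive the error */
	return q->saddr == sk->laddr && q->sport == sk->lport &&
	       q->daddr == sk->raddr && q->dport == sk->rport;
}
```

An off-path attacker would then need to guess both the peer address and
the port pair before a spoofed error reaches the socket.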

**3. Ignore embedded ICMP packets in ICMP Redirect.**

Although Linux exposes accept_redirects to disable processing of
Redirect messages, this setting remains enabled by default on hosts.
This means that hosts may still update their next-hop selection based
on unauthenticated Redirects constructed from embedded packets whose
legitimacy cannot be reliably verified.

Even in environments where Redirects are used for local load balancing,
ICMP itself imposes **negligible bandwidth overhead**, so disabling or
constraining ICMP-triggered routing changes does not materially affect
traffic distribution. At the same time, the stateless nature of ICMP
makes Redirect messages particularly easy to spoof, and the kernel
currently has limited context available to validate them.

**4. Prevent raw sockets from processing ICMP errors**

In current code paths (e.g., icmp_socket_deliver() → raw_icmp_error()),
raw socket error handling can end up calling the same routing update
code paths (FNHE updates) with very weak validation: essentially, the
existence of a raw socket matching the IP addresses and protocol is
sufficient. This is risky, because tools or servers that open raw
sockets (such as Nmap) could be tricked into causing cache pollution.
We propose that, in the current design, raw sockets should not be
allowed to trigger FNHE updates, **even if the socket is connected,
since only IP addresses are checked, without further checks such as
port or sequence numbers**. Alternatively, stronger checks could be
applied, but we do not have a good proposal for those yet.
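
To make the concern concrete, here is a small user-space model, not
kernel code, contrasting the weak raw-socket match described above with
the proposed policy. All names are hypothetical, and the match rule is a
simplification of what the kernel actually does.

```c
#include <stdbool.h>
#include <stdint.h>

enum sock_kind { SOCK_KIND_RAW, SOCK_KIND_UDP_CONNECTED };

/* Hypothetical, simplified view of a raw socket. */
struct raw_sock_view {
	uint8_t  protocol;     /* IP protocol number the socket is bound to */
	uint32_t laddr, raddr; /* local and remote addresses, 0 = wildcard */
};

/* Weak match: only the protocol and the IP addresses are compared. */
static bool raw_sock_matches(const struct raw_sock_view *sk, uint8_t proto,
			     uint32_t quoted_saddr, uint32_t quoted_daddr)
{
	return sk->protocol == proto &&
	       (sk->laddr == 0 || sk->laddr == quoted_saddr) &&
	       (sk->raddr == 0 || sk->raddr == quoted_daddr);
}

/* Proposed policy: a raw-socket match alone never updates the FNHE cache. */
static bool icmp_err_may_update_fnhe(enum sock_kind kind)
{
	return kind != SOCK_KIND_RAW;
}
```

In this model a spoofed error can still match a raw socket, but the match
no longer leads to a cache update.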

Yours Sincerely,
Yizhou Zhao