Message-ID: <cover.1717105215.git.yan@cloudflare.com>
Date: Thu, 30 May 2024 14:46:53 -0700
From: Yan Zhai <yan@...udflare.com>
To: netdev@...r.kernel.org
Cc: "David S. Miller" <davem@...emloft.net>,
	Eric Dumazet <edumazet@...gle.com>,
	Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
	Simon Horman <horms@...nel.org>, David Ahern <dsahern@...nel.org>,
	Abhishek Chauhan <quic_abchauha@...cinc.com>,
	Mina Almasry <almasrymina@...gle.com>,
	Florian Westphal <fw@...len.de>,
	Alexander Lobakin <aleksander.lobakin@...el.com>,
	David Howells <dhowells@...hat.com>, Jiri Pirko <jiri@...nulli.us>,
	Daniel Borkmann <daniel@...earbox.net>,
	Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
	Lorenzo Bianconi <lorenzo@...nel.org>,
	Pavel Begunkov <asml.silence@...il.com>,
	linux-kernel@...r.kernel.org, kernel-team@...udflare.com,
	Jesper Dangaard Brouer <hawk@...nel.org>
Subject: [RFC net-next 0/6] net: pass receive socket to drop tracepoint

Greetings!

We set up our production packet drop monitoring around the kfree_skb
tracepoint. While this tracepoint is extremely valuable for diagnosing
critical problems, we find some limitations with drops on the local
receive path: the tracepoint can only inspect the dropped skb itself,
but such an skb might not carry enough information to:

1. determine in which netns/container this skb gets dropped
2. determine which socket/service this skb ought to be received by

The 1st issue arises because skb->dev is the only member field with a
valid netns reference, but skb->dev can get cleared or reused. For
example, tcp_v4_rcv clears skb->dev, and in later processing the field
might be reused for the OFO tree.

The 2nd issue arises because no reference on an skb reliably points to
the receiving socket. skb->sk usually points to the local sending
socket; it points to a receive socket only briefly after the early
demux stage, and the socket can still get stolen later. For certain
drop reasons such as TCP OFO_MERGE, zero window, UDP PROTO_MEM error,
etc., it is hard to infer which receiving socket is impacted. This
cannot be overcome by simply looking at the packet header, because of
complications like sk lookup programs. In the past, single-purpose
tracepoints such as trace_udp_fail_queue_rcv_skb,
trace_sock_rcvqueue_full, etc. were added as needed to provide more
visibility. This could be handled in a more generic way.

In this change set we propose a new 'kfree_skb_for_sk' call as a drop-in
replacement for kfree_skb_reason on various local input paths. It accepts
an extra receiving socket argument and places the socket in skb->cb for
tracepoint consumption. With an rx socket available, the tracepoint can
easily deal with both issues above. We use the cb field rather than
changing the tracepoint signature out of concern that the signature
might be part of a stable ABI; please advise if otherwise.
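
To make the idea concrete, here is a minimal userspace sketch of stashing
the rx socket in skb->cb so that a drop tracepoint can read it back. It is
only a model under stated assumptions: the struct layouts, the drop_cb
name, the sk_net/nd_net fields, the drop_netns helper and the body of
kfree_skb_for_sk shown here are hypothetical stand-ins, not the code in
this series.

```c
/*
 * Userspace model only. Every type and function body below is an
 * assumption made for illustration; none of it is taken from the patches.
 */
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct net { unsigned int ns_id; };           /* stand-in for a netns */
struct sock { struct net *sk_net; };          /* stand-in for a socket */
struct net_device { struct net *nd_net; };    /* stand-in for a device */

struct sk_buff {
	struct net_device *dev;   /* may already be cleared on the rx path */
	struct sock *sk;          /* usually the sending socket, if any */
	char cb[48];              /* per-layer scratch space */
};

/* Hypothetical cb layout, valid only while the skb is being dropped. */
struct drop_cb { struct sock *rx_sk; };

static_assert(sizeof(struct drop_cb) <= sizeof(((struct sk_buff *)0)->cb),
	      "drop_cb must fit in skb->cb");

/* What the modelled tracepoint observed on the most recent drop. */
static struct sock *last_traced_sk;
static int last_traced_reason;

/* Model of a tracepoint consumer: it only ever sees the skb and reason. */
static void trace_kfree_skb(const struct sk_buff *skb, int reason)
{
	struct drop_cb cb;

	memcpy(&cb, skb->cb, sizeof(cb));
	last_traced_sk = cb.rx_sk;
	last_traced_reason = reason;
}

/* Record the rx socket in cb, then fire the unchanged tracepoint. */
static void kfree_skb_for_sk(struct sk_buff *skb, struct sock *rx_sk,
			     int reason)
{
	struct drop_cb cb = { .rx_sk = rx_sk };

	memcpy(skb->cb, &cb, sizeof(cb));
	trace_kfree_skb(skb, reason);
	/* a real implementation would release the skb here */
}

/* Resolve the netns from the rx socket first, falling back to skb->dev. */
static struct net *drop_netns(const struct sk_buff *skb)
{
	struct drop_cb cb;

	memcpy(&cb, skb->cb, sizeof(cb));
	if (cb.rx_sk)
		return cb.rx_sk->sk_net;
	return skb->dev ? skb->dev->nd_net : NULL;
}
```

In this model the tracepoint signature stays the same, and a consumer can
still recover both the receiving socket and its netns even when skb->dev
has been cleared.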

Yan Zhai (6):
  net: add kfree_skb_for_sk function
  ping: pass rx socket on rcv drops
  net: raw: pass rx socket on rcv drops
  tcp: pass rx socket on rcv drops
  udp: pass rx socket on rcv drops
  af_packet: pass rx socket on rcv drops

 include/linux/skbuff.h | 48 ++++++++++++++++++++++++++++++++++++++++--
 net/core/dev.c         | 21 +++++++-----------
 net/core/skbuff.c      | 29 +++++++++++++------------
 net/ipv4/ping.c        |  2 +-
 net/ipv4/raw.c         |  4 ++--
 net/ipv4/syncookies.c  |  2 +-
 net/ipv4/tcp_input.c   |  2 +-
 net/ipv4/tcp_ipv4.c    |  4 ++--
 net/ipv4/udp.c         |  6 +++---
 net/ipv6/raw.c         |  8 +++----
 net/ipv6/syncookies.c  |  2 +-
 net/ipv6/tcp_ipv6.c    |  4 ++--
 net/ipv6/udp.c         |  6 +++---
 net/packet/af_packet.c |  6 +++---
 14 files changed, 93 insertions(+), 51 deletions(-)

-- 
2.30.2


