[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <20230508165815.45602-1-kuniyu@amazon.com>
Date: Mon, 8 May 2023 09:58:15 -0700
From: Kuniyuki Iwashima <kuniyu@...zon.com>
To: "David S. Miller" <davem@...emloft.net>, Eric Dumazet
<edumazet@...gle.com>, Jakub Kicinski <kuba@...nel.org>, Paolo Abeni
<pabeni@...hat.com>
CC: Kuniyuki Iwashima <kuniyu@...zon.com>, Kuniyuki Iwashima
<kuni1840@...il.com>, <netdev@...r.kernel.org>, syzbot
<syzkaller@...glegroups.com>
Subject: [PATCH v2 net] net: Fix sk->sk_stamp race in sock_recv_cmsgs().
KCSAN found a data race in sock_recv_cmsgs() [0] where the read access
to sk->sk_stamp needs READ_ONCE().
Also, there is another race like below. If the torn load of the high
32-bits precedes WRITE_ONCE(sk, skb->tstamp) and later the written
lower 32-bits happens to match with SK_DEFAULT_STAMP, the final result
of sk->sk_stamp could be 0.
sock_recv_cmsgs() ioctl(SIOCGSTAMP) sock_recv_cmsgs()
| | |
|- if (sock_flag(sk, SOCK_TIMESTAMP)) |
| | |
| `- sock_set_flag(sk, SOCK_TIMESTAMP)
| |
| `- if (sock_flag(sk, SOCK_TIMESTAMP))
`- if (sk->sk_stamp == SK_DEFAULT_STAMP) `- sock_write_timestamp(sk, skb->tstamp)
`- sock_write_timestamp(sk, 0)
Even with READ_ONCE(), we could get the same result if READ_ONCE() precedes
WRITE_ONCE() because the SK_DEFAULT_STAMP check and WRITE_ONCE(sk_stamp, 0)
are not atomic.
Let's avoid the race by cmpxchg() on 64-bits architecture or seqlock on
32-bits machines.
[0]:
BUG: KCSAN: data-race in packet_recvmsg / packet_recvmsg
write (marked) to 0xffff88803c81f258 of 8 bytes by task 19171 on cpu 0:
sock_write_timestamp include/net/sock.h:2670 [inline]
sock_recv_cmsgs include/net/sock.h:2722 [inline]
packet_recvmsg+0xb97/0xd00 net/packet/af_packet.c:3489
sock_recvmsg_nosec net/socket.c:1019 [inline]
sock_recvmsg+0x11a/0x130 net/socket.c:1040
sock_read_iter+0x176/0x220 net/socket.c:1118
call_read_iter include/linux/fs.h:1845 [inline]
new_sync_read fs/read_write.c:389 [inline]
vfs_read+0x5e0/0x630 fs/read_write.c:470
ksys_read+0x163/0x1a0 fs/read_write.c:613
__do_sys_read fs/read_write.c:623 [inline]
__se_sys_read fs/read_write.c:621 [inline]
__x64_sys_read+0x41/0x50 fs/read_write.c:621
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x3b/0x90 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x72/0xdc
read to 0xffff88803c81f258 of 8 bytes by task 19183 on cpu 1:
sock_recv_cmsgs include/net/sock.h:2721 [inline]
packet_recvmsg+0xb64/0xd00 net/packet/af_packet.c:3489
sock_recvmsg_nosec net/socket.c:1019 [inline]
sock_recvmsg+0x11a/0x130 net/socket.c:1040
sock_read_iter+0x176/0x220 net/socket.c:1118
call_read_iter include/linux/fs.h:1845 [inline]
new_sync_read fs/read_write.c:389 [inline]
vfs_read+0x5e0/0x630 fs/read_write.c:470
ksys_read+0x163/0x1a0 fs/read_write.c:613
__do_sys_read fs/read_write.c:623 [inline]
__se_sys_read fs/read_write.c:621 [inline]
__x64_sys_read+0x41/0x50 fs/read_write.c:621
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x3b/0x90 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x72/0xdc
value changed: 0xffffffffc4653600 -> 0x0000000000000000
Reported by Kernel Concurrency Sanitizer on:
CPU: 1 PID: 19183 Comm: syz-executor.5 Not tainted 6.3.0-rc7-02330-gca6270c12e20 #2
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
Fixes: 6c7c98bad488 ("sock: avoid dirtying sk_stamp, if possible")
Reported-by: syzbot <syzkaller@...glegroups.com>
Signed-off-by: Kuniyuki Iwashima <kuniyu@...zon.com>
---
v2:
* Add Fixes tag
v1: https://lore.kernel.org/netdev/20230506022325.99106-1-kuniyu@amazon.com/
---
include/net/sock.h | 19 ++++++++++++++++---
1 file changed, 16 insertions(+), 3 deletions(-)
diff --git a/include/net/sock.h b/include/net/sock.h
index 8b7ed7167243..c2a8b799283e 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -2671,6 +2671,20 @@ static inline void sock_write_timestamp(struct sock *sk, ktime_t kt)
#endif
}
+#define SK_DEFAULT_STAMP (-1L * NSEC_PER_SEC)
+
+static inline void sock_zero_timestamp(struct sock *sk)
+{
+#if BITS_PER_LONG==32
+ write_seqlock(&sk->sk_stamp_seq);
+ if (sk->sk_stamp == SK_DEFAULT_STAMP)
+ sk->sk_stamp = 0;
+ write_sequnlock(&sk->sk_stamp_seq);
+#else
+ cmpxchg(&sk->sk_stamp, SK_DEFAULT_STAMP, 0);
+#endif
+}
+
void __sock_recv_timestamp(struct msghdr *msg, struct sock *sk,
struct sk_buff *skb);
void __sock_recv_wifi_status(struct msghdr *msg, struct sock *sk,
@@ -2704,7 +2718,6 @@ sock_recv_timestamp(struct msghdr *msg, struct sock *sk, struct sk_buff *skb)
void __sock_recv_cmsgs(struct msghdr *msg, struct sock *sk,
struct sk_buff *skb);
-#define SK_DEFAULT_STAMP (-1L * NSEC_PER_SEC)
static inline void sock_recv_cmsgs(struct msghdr *msg, struct sock *sk,
struct sk_buff *skb)
{
@@ -2718,8 +2731,8 @@ static inline void sock_recv_cmsgs(struct msghdr *msg, struct sock *sk,
__sock_recv_cmsgs(msg, sk, skb);
else if (unlikely(sock_flag(sk, SOCK_TIMESTAMP)))
sock_write_timestamp(sk, skb->tstamp);
- else if (unlikely(sk->sk_stamp == SK_DEFAULT_STAMP))
- sock_write_timestamp(sk, 0);
+ else
+ sock_zero_timestamp(sk);
}
void __sock_tx_timestamp(__u16 tsflags, __u8 *tx_flags);
--
2.30.2
Powered by blists - more mailing lists