[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20220315053040.70545-1-kuniyu@amazon.co.jp>
Date: Tue, 15 Mar 2022 14:30:40 +0900
From: Kuniyuki Iwashima <kuniyu@...zon.co.jp>
To: <kuniyu@...zon.co.jp>
CC: <davem@...emloft.net>, <eric.dumazet@...il.com>, <kuba@...nel.org>,
<kuni1840@...il.com>, <netdev@...r.kernel.org>,
<rao.shoaib@...cle.com>
Subject: Re: [PATCH net] af_unix: Support POLLPRI for OOB.
From: Kuniyuki Iwashima <kuniyu@...zon.co.jp>
Date: Tue, 15 Mar 2022 09:45:03 +0900
> From: Eric Dumazet <eric.dumazet@...il.com>
> Date: Mon, 14 Mar 2022 17:26:54 -0700
>> On 3/14/22 11:10, Shoaib Rao wrote:
>>>
>>> On 3/14/22 10:42, Eric Dumazet wrote:
>>>>
>>>> On 3/13/22 22:21, Kuniyuki Iwashima wrote:
>>>>> The commit 314001f0bf92 ("af_unix: Add OOB support") introduced OOB for
>>>>> AF_UNIX, but it lacks some changes for POLLPRI. Let's add the missing
>>>>> piece.
>>>>>
>>>>> In the selftest, normal datagrams are sent followed by OOB data, so
>>>>> this
>>>>> commit replaces `POLLIN | POLLPRI` with just `POLLPRI` in the first
>>>>> test
>>>>> case.
>>>>>
>>>>> Fixes: 314001f0bf92 ("af_unix: Add OOB support")
>>>>> Signed-off-by: Kuniyuki Iwashima <kuniyu@...zon.co.jp>
>>>>> ---
>>>>> net/unix/af_unix.c | 2 ++
>>>>> tools/testing/selftests/net/af_unix/test_unix_oob.c | 6 +++---
>>>>> 2 files changed, 5 insertions(+), 3 deletions(-)
>>>>>
>>>>> diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
>>>>> index c19569819866..711d21b1c3e1 100644
>>>>> --- a/net/unix/af_unix.c
>>>>> +++ b/net/unix/af_unix.c
>>>>> @@ -3139,6 +3139,8 @@ static __poll_t unix_poll(struct file *file,
>>>>> struct socket *sock, poll_table *wa
>>>>> mask |= EPOLLIN | EPOLLRDNORM;
>>>>> if (sk_is_readable(sk))
>>>>> mask |= EPOLLIN | EPOLLRDNORM;
>>>>> + if (unix_sk(sk)->oob_skb)
>>>>> + mask |= EPOLLPRI;
>>>>
>>>>
>>>> This adds another data-race, maybe add something to avoid another
>>>> syzbot report ?
>>>
>>> It's not obvious to me how there would be a race as it is just a check.
>>>
>>
>> KCSAN will detect that whenever unix_poll() reads oob_skb,
>>
>> its value can be changed by another cpu.
>>
>>
>> unix_poll() runs without holding a lock.
>
> Thanks for pointing out!
> So, READ_ONCE() is necessary here, right?
> oob_skb is written under the lock, so I think there is no paired
> WRITE_ONCE(), but is it ok?
I've tested the prog below and KCSAN repoted the race.
Also, READ_ONCE() suppressed it.
Thank you Eric!
I'll post v2 with READ_ONCE().
---8<---
[ 60.021825] ==================================================================
[ 60.021999] BUG: KCSAN: data-race in unix_poll / unix_stream_sendmsg
[ 60.021999]
[ 60.021999] write to 0xffff8880050d9ff0 of 8 bytes by task 175 on cpu 3:
[ 60.021999] unix_stream_sendmsg+0x9dc/0xbb0
[ 60.021999] sock_sendmsg+0x90/0xa0
[ 60.021999] __sys_sendto+0x138/0x190
[ 60.021999] __x64_sys_sendto+0x6d/0x80
[ 60.021999] do_syscall_64+0x38/0x80
[ 60.021999] entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 60.021999]
[ 60.021999] read to 0xffff8880050d9ff0 of 8 bytes by task 176 on cpu 2:
[ 60.021999] unix_poll+0xf4/0x1b0
[ 60.021999] sock_poll+0xa4/0x1e0
[ 60.021999] do_sys_poll+0x326/0x750
[ 60.021999] __x64_sys_poll+0x55/0x1f0
[ 60.021999] do_syscall_64+0x38/0x80
[ 60.021999] entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 60.021999]
[ 60.021999] value changed: 0xffff8880056bf000 -> 0xffff8880056bff00
[ 60.021999]
[ 60.021999] Reported by Kernel Concurrency Sanitizer on:
[ 60.021999] CPU: 2 PID: 176 Comm: unix_race_oob Not tainted 5.17.0-rc5-59531-gbdaabdfaadf8-dirty #9
[ 60.021999] Hardware name: Red Hat KVM, BIOS 1.11.0-2.amzn2 04/01/2014
[ 60.021999] ==================================================================
---8<---
---8<---
#include <errno.h>
#include <poll.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/un.h>
#include <sys/wait.h>
#define offsetof(type, member) ((size_t)&((type *)0)->member)
#define SUNADDR "\0test\0"
#define ADDRLEN (socklen_t)(offsetof(struct sockaddr_un, sun_path) + 6)
int setup_server(void)
{
struct sockaddr_un addr = {
.sun_family = AF_UNIX,
.sun_path = SUNADDR,
};
int err, fd;
fd = socket(AF_UNIX, SOCK_STREAM | SOCK_NONBLOCK, 0);
if (fd == -1) {
perror("socket");
goto out;
}
err = bind(fd, (struct sockaddr *)&addr, ADDRLEN);
if (err == -1) {
perror("bind");
goto err;
}
err = listen(fd, 32);
if (err == -1) {
perror("listen");
goto err;
}
out:
return fd;
err:
close(fd);
return err;
}
int setup_client(void)
{
struct sockaddr_un addr = {
.sun_family = AF_UNIX,
.sun_path = SUNADDR,
};
int err, fd;
fd = socket(AF_UNIX, SOCK_STREAM | SOCK_NONBLOCK, 0);
if (fd == -1) {
perror("socket");
goto out;
}
err = connect(fd, (struct sockaddr *)&addr, ADDRLEN);
if (err == -1) {
perror("connect");
goto err;
}
out:
return fd;
err:
close(fd);
return err;
}
int setup_child(int server)
{
struct sockaddr_un addr;
socklen_t len;
int child;
child = accept(server, (struct sockaddr *)&addr, &len);
if (child == -1)
perror("accept");
return child;
}
int sender(int client)
{
char c = 'a';
int ret;
printf("start sender\n");
while (1) {
ret = send(client, &c, sizeof(c), MSG_OOB);
if (ret != 1 || errno)
perror("send");
}
return 0;
}
int receiver(int child)
{
struct pollfd pfds[1];
char buf[1024];
int ret;
pfds[0].fd = child;
pfds[0].events = POLLIN | POLLPRI;
printf("start receiver\n");
while (1) {
poll(pfds, 1, -1);
ret = recv(child, buf, sizeof(buf), MSG_OOB);
if (ret < 0 || errno)
perror("recv (MSG_OOB)");
ret = recv(child, buf, sizeof(buf), 0);
if (ret < 0 || errno)
perror("recv");
}
return 0;
}
int main(void)
{
int server, client, child;
int status;
pid_t pid;
server = setup_server();
if (server == -1)
goto out;
client = setup_client();
if (client == -1)
goto close_server;
child = setup_child(server);
if (child == -1)
goto close_client;
pid = fork();
if (pid == 0)
return sender(client);
pid = fork();
if (pid == 0)
return receiver(child);
wait(&status);
close(child);
close_client:
close(client);
close_server:
close(server);
out:
return 0;
}
---8<---
Powered by blists - more mailing lists