Message-ID: <39ea7c85-c235-bc49-cd49-a2d7633eda4c@alu.unizg.hr>
Date: Mon, 14 Aug 2023 10:54:56 +0200
From: Mirsad Todorovac <mirsad.todorovac@....unizg.hr>
To: Kuniyuki Iwashima <kuniyu@...zon.com>
Cc: alexander@...alicyn.com, davem@...emloft.net, edumazet@...gle.com,
fw@...len.de, kuba@...nel.org, linux-kernel@...r.kernel.org,
linux-kselftest@...r.kernel.org, netdev@...r.kernel.org, pabeni@...hat.com,
shuah@...nel.org
Subject: Re: selftests: net/af_unix test_unix_oob [FAILED]
On 8/8/23 10:53, Mirsad Todorovac wrote:
> On 8/8/23 01:09, Mirsad Todorovac wrote:
>> On 8/7/23 22:46, Kuniyuki Iwashima wrote:
>>> From: Mirsad Todorovac <mirsad.todorovac@....unizg.hr>
>>> Date: Mon, 7 Aug 2023 21:44:41 +0200
>>>> Hi all,
>>>>
>>>> In the kernel 6.5-rc5 build on Ubuntu 22.04 LTS (jammy jellyfish) on a Ryzen 7950 assembled box,
>>>> vanilla torvalds tree kernel, the test test_unix_oob unexpectedly fails:
>>>>
>>>> # selftests: net/af_unix: test_unix_oob
>>>> # Test 2 failed, sigurg 23 len 63 OOB %
>>>>
>>>> It is this code:
>>>>
>>>> /* Test 2:
>>>> * Verify that the first OOB is over written by
>>>> * the 2nd one and the first OOB is returned as
>>>> * part of the read, and sigurg is received.
>>>> */
>>>> wait_for_data(pfd, POLLIN | POLLPRI);
>>>> len = 0;
>>>> while (len < 70)
>>>> len = recv(pfd, buf, 1024, MSG_PEEK);
>>>> len = read_data(pfd, buf, 1024);
>>>> read_oob(pfd, &oob);
>>>> if (!signal_recvd || len != 127 || oob != '#') {
>>>> fprintf(stderr, "Test 2 failed, sigurg %d len %d OOB %c\n",
>>>> signal_recvd, len, oob);
>>>> die(1);
>>>> }
>>>>
>>>> In 6.5-rc4, this test was OK, so it might mean we have a regression?
>>>
>>> Thanks for reporting.
>>>
>>> I confirmed the test doesn't fail on net-next at least, but it's based
>>> on v6.5-rc4.
>>>
>>> ---8<---
>>> [root@...alhost ~]# ./test_unix_oob
>>> [root@...alhost ~]# echo $?
>>> 0
>>> [root@...alhost ~]# uname -r
>>> 6.5.0-rc4-01192-g66244337512f
>>> ---8<---
>>>
>>> I'll check 6.5-rc5 later.
>>
>> Hi, Kuniyuki,
>>
>> It seems that there is a new development. I could reproduce the error with the failed test 2
>> as early as 6.0-rc1. However, the gotcha is that the error appears to be sporadically manifested
>> (possibly a race)?
>>
>> I am currently attempting a bisect.
>
> Bisecting showed that the condition already existed in the 5.11 torvalds tree.
>
> It has to do with the configs chosen (I used the configs from selftests/*/config merged), but it
> is also present in the Ubuntu production build:
>
> marvin@...iant:~$ cd linux/kernel/linux_torvalds
> marvin@...iant:~/linux/kernel/linux_torvalds$ for a in {0..1000}; do tools/testing/selftests/net/af_unix/test_unix_oob ; done
> marvin@...iant:~/linux/kernel/linux_torvalds$ for a in {0..1000}; do tools/testing/selftests/net/af_unix/test_unix_oob ; done
> marvin@...iant:~/linux/kernel/linux_torvalds$ for a in {0..1000}; do tools/testing/selftests/net/af_unix/test_unix_oob ; done
> marvin@...iant:~/linux/kernel/linux_torvalds$ for a in {0..1000}; do tools/testing/selftests/net/af_unix/test_unix_oob ; done
> marvin@...iant:~/linux/kernel/linux_torvalds$ for a in {0..1000}; do tools/testing/selftests/net/af_unix/test_unix_oob ; done
> Test 2 failed, sigurg 23 len 63 OOB %
> marvin@...iant:~/linux/kernel/linux_torvalds$ uname -rms
> Linux 6.4.8-060408-generic x86_64
> marvin@...iant:~/linux/kernel/linux_torvalds$ for a in {0..1000}; do tools/testing/selftests/net/af_unix/test_unix_oob ; done
> marvin@...iant:~/linux/kernel/linux_torvalds$ for a in {0..1000}; do tools/testing/selftests/net/af_unix/test_unix_oob ; done
> marvin@...iant:~/linux/kernel/linux_torvalds$ for a in {0..1000}; do tools/testing/selftests/net/af_unix/test_unix_oob ; done
> marvin@...iant:~/linux/kernel/linux_torvalds$ for a in {0..1000}; do tools/testing/selftests/net/af_unix/test_unix_oob ; done
> marvin@...iant:~/linux/kernel/linux_torvalds$ for a in {0..1000}; do tools/testing/selftests/net/af_unix/test_unix_oob ; done
> marvin@...iant:~/linux/kernel/linux_torvalds$ for a in {0..1000}; do tools/testing/selftests/net/af_unix/test_unix_oob ; done
> marvin@...iant:~/linux/kernel/linux_torvalds$ for a in {0..1000}; do tools/testing/selftests/net/af_unix/test_unix_oob ; done
> marvin@...iant:~/linux/kernel/linux_torvalds$ for a in {0..1000}; do tools/testing/selftests/net/af_unix/test_unix_oob ; done
> marvin@...iant:~/linux/kernel/linux_torvalds$ for a in {0..1000}; do tools/testing/selftests/net/af_unix/test_unix_oob ; done
> marvin@...iant:~/linux/kernel/linux_torvalds$ for a in {0..1000}; do tools/testing/selftests/net/af_unix/test_unix_oob ; done
> marvin@...iant:~/linux/kernel/linux_torvalds$ for a in {0..1000}; do tools/testing/selftests/net/af_unix/test_unix_oob ; done
> marvin@...iant:~/linux/kernel/linux_torvalds$ for a in {0..1000}; do tools/testing/selftests/net/af_unix/test_unix_oob ; done
> marvin@...iant:~/linux/kernel/linux_torvalds$ for a in {0..1000}; do tools/testing/selftests/net/af_unix/test_unix_oob ; done
> Test 1 failed sigurg 0 len 63
> marvin@...iant:~/linux/kernel/linux_torvalds$
>
> It happens on rare occasions, so it seems to be a hard-to-spot race.
>
> A normal test run, which executes test_unix_oob only once, would not notice this except by accident, which is how the problem came to attention ...
>
> However, the problem seems to be config-driven rather than kernel-version-driven.
>
> marvin@...iant:~/linux/kernel/linux_torvalds$ for a in {0..100000}; do tools/testing/selftests/net/af_unix/test_unix_oob ; done
> Test 3.1 Inline failed, len 1 oob % atmark 0
> Test 1 Inline failed, sigurg 0 len 63
> Test 1 Inline failed, sigurg 0 len 63
> Test 1 Inline failed, sigurg 0 len 63
> Test 2 Inline failed, len 63 atmark 1
> Test 3 Inline failed, sigurg 23 len 63 data x
> Test 3 Inline failed, sigurg 23 len 63 data x
> Test 3 Inline failed, sigurg 23 len 63 data x
> Test 3 Inline failed, sigurg 23 len 63 data x
> Test 2 Inline failed, len 63 atmark 1
> Test 3.1 Inline failed, len 1 oob % atmark 0
> Test 2 failed, sigurg 23 len 63 OOB %
> marvin@...iant:~/linux/kernel/linux_torvalds$ uname -rms
> Linux 6.5.0-060500rc4-generic x86_64
> marvin@...iant:~/linux/kernel/linux_torvalds$
>
> At moments, I was able to reproduce with certain configs, but now something odd happens.
>
> I will keep investigating.
Please note that the bug persisted in 6.5-rc6:
marvin@...iant:~/linux/kernel/linux_torvalds$ for a in {0..100000}; do !!; done
for a in {0..100000}; do tools/testing/selftests/net/af_unix/test_unix_oob ; done
Test 2 failed, sigurg 23 len 63 OOB %
Test 2 Inline failed, len 63 atmark 1
Test 3 Inline failed, sigurg 23 len 63 data x
Test 2 failed, sigurg 23 len 63 OOB %
Test 3.1 Inline failed, len 1 oob % atmark 0
Test 3 Inline failed, sigurg 23 len 63 data x
Test 1 Inline failed, sigurg 0 len 63
Test 1 Inline failed, sigurg 0 len 63
Test 3.1 Inline failed, len 1 oob % atmark 0
Test 1 Inline failed, sigurg 0 len 63
Test 2 failed, sigurg 23 len 63 OOB %
Test 1 Inline failed, sigurg 0 len 63
Test 2 failed, sigurg 23 len 63 OOB %
Test 3.1 Inline failed, len 1 oob % atmark 0
Test 3.1 Inline failed, len 1 oob % atmark 0
marvin@...iant:~/linux/kernel/linux_torvalds$
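
Since the failures are sporadic, a per-message count may be easier to compare across kernels and configs than raw output. The following is only a minimal sketch, not part of the selftests: the helper name tally_failures is hypothetical, and it assumes the test binary prints one line per failure, as in the transcripts above.

```shell
#!/bin/sh
# Hypothetical helper (not part of the kernel selftests):
# run a command N times and count how often each output line appears.
# Usage: tally_failures <runs> <command> [args...]
tally_failures() {
    runs=$1
    shift
    i=0
    while [ "$i" -lt "$runs" ]; do
        # The test prints its failure message on stderr, so merge it into stdout.
        "$@" 2>&1
        i=$((i + 1))
    done | sort | uniq -c | sort -rn
}

# Example invocation, using the path from the transcripts above:
# tally_failures 100000 tools/testing/selftests/net/af_unix/test_unix_oob
```

The output would be one line per distinct failure message, prefixed by the number of runs that produced it; runs that pass print nothing and are therefore not counted.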
The bug can be triggered as a non-privileged user, but it is not clear whether it is exploitable to elevate privileges.
Best regards,
Mirsad Todorovac