netdev - Re: [PATCH] selftests: net: ip_defrag: increase netdev_max

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <d2dddb34-f126-81f8-cbf7-04635f04795a@gmail.com>
Date:   Fri, 6 Dec 2019 05:41:01 -0800
From:   Eric Dumazet <eric.dumazet@...il.com>
To:     Thadeu Lima de Souza Cascardo <cascardo@...onical.com>,
        Eric Dumazet <eric.dumazet@...il.com>
Cc:     netdev@...r.kernel.org, davem@...emloft.net, shuah@...nel.org,
        linux-kselftest@...r.kernel.org, linux-kernel@...r.kernel.org,
        posk@...gle.com
Subject: Re: [PATCH] selftests: net: ip_defrag: increase netdev_max_backlog



On 12/6/19 4:17 AM, Thadeu Lima de Souza Cascardo wrote:
> On Wed, Dec 04, 2019 at 12:03:57PM -0800, Eric Dumazet wrote:
>>
>>
>> On 12/4/19 11:53 AM, Thadeu Lima de Souza Cascardo wrote:
>>> When using fragments with size 8 and payload larger than 8000, the backlog
>>> might fill up and packets will be dropped, causing the test to fail. This
>>> happens often enough when conntrack is on during the IPv6 test.
>>>
>>> As the larger payload in the test is 10000, using a backlog of 1250 allow
>>> the test to run repeatedly without failure. At least a 1000 runs were
>>> possible with no failures, when usually less than 50 runs were good enough
>>> for showing a failure.
>>>
>>> As netdev_max_backlog is not a pernet setting, this sets the backlog to
>>> 1000 during exit to prevent disturbing following tests.
>>>
>>
>> Hmmm... I would prefer not changing a global setting like that.
>> This is going to be flaky since we often run tests in parallel (using different netns)
>>
>> What about adding a small delay after each sent packet ?
>>
>> diff --git a/tools/testing/selftests/net/ip_defrag.c b/tools/testing/selftests/net/ip_defrag.c
>> index c0c9ecb891e1d78585e0db95fd8783be31bc563a..24d0723d2e7e9b94c3e365ee2ee30e9445deafa8 100644
>> --- a/tools/testing/selftests/net/ip_defrag.c
>> +++ b/tools/testing/selftests/net/ip_defrag.c
>> @@ -198,6 +198,7 @@ static void send_fragment(int fd_raw, struct sockaddr *addr, socklen_t alen,
>>                 error(1, 0, "send_fragment: %d vs %d", res, frag_len);
>>  
>>         frag_counter++;
>> +       usleep(1000);
>>  }
>>  
>>  static void send_udp_frags(int fd_raw, struct sockaddr *addr,
>>
> 
> That won't work because the issue only shows when we using conntrack, as the
> packet will be reassembled on output, then fragmented again. When this happens,
> the fragmentation code is transmitting the fragments in a tight loop, which
> floods the backlog.

Interesting !

So it looks like the test is correct, and exposed a long standing problem in this code.

We should not adjust the test to some kernel-of-the-day-constraints, and instead fix the kernel bug ;)

Where is this tight loop exactly ?

If this is feeding/bursting ~1000 skbs via netif_rx() in a BH context, maybe we need to call a variant
that allows immediate processing instead of (ab)using the softnet backlog.

Thanks !