netdev - Re: [PATCH bpf-next 0/5] fix test


Message-ID: <a7a84334-b650-23e7-9df5-43a942ac9666@lab.ntt.co.jp>
Date:   Fri, 25 May 2018 17:28:13 +0900
From:   Prashant Bhole <bhole_prashant_q7@....ntt.co.jp>
To:     John Fastabend <john.fastabend@...il.com>,
        Alexei Starovoitov <ast@...nel.org>,
        Daniel Borkmann <daniel@...earbox.net>
Cc:     "David S . Miller" <davem@...emloft.net>,
        Shuah Khan <shuah@...nel.org>, netdev@...r.kernel.org
Subject: Re: [PATCH bpf-next 0/5] fix test_sockmap



On 5/24/2018 1:58 PM, John Fastabend wrote:
> On 05/23/2018 09:47 PM, Prashant Bhole wrote:
>>
>>
>> On 5/23/2018 6:44 PM, Prashant Bhole wrote:
>>>
>>>
>>> On 5/22/2018 2:08 AM, John Fastabend wrote:
>>>> On 05/20/2018 10:13 PM, Prashant Bhole wrote:
>>>>>
>>>>>
>>>>> On 5/19/2018 1:42 AM, John Fastabend wrote:
>>>>>> On 05/18/2018 12:17 AM, Prashant Bhole wrote:
>>>>>>> This series fixes bugs in test_sockmap code. They weren't caught
>>>>>>> previously because failure in RX/TX thread was not notified to the
>>>>>>> main thread.
>>>>>>>
>>>>>>> Also fixed data verification logic and slightly improved test output
>>>>>>> so that the parameter values (cork, apply, start, end) of a failed test
>>>>>>> can be easily seen.
>>>>>>>
>>>>>>
>>>>>> Great, this was on my list so thanks for taking care of it.
>>>>>>
>>>>>>> Note: Even after fixing the above problems there are issues with tests
>>>>>>> which set the cork parameter. Tests fail (RX thread timeout) when the cork
>>>>>>> value is non-zero and the overall data sent by the TX thread isn't a
>>>>>>> multiple of the cork value.
>>>>>>
>>>>>>
>>>>>> This is expected. When 'cork' is set the sender should only xmit
>>>>>> the data when 'cork' bytes are available. If the user doesn't
>>>>>> provide the N bytes the data is cork'ed waiting for the bytes and
>>>>>> if the socket is closed the state is cleaned up. What these tests
>>>>>> are testing is the cleanup path when a user doesn't provide the
>>>>>> N bytes. In practice this is used to validate headers and prevent
>>>>>> users from sending partial headers. We want to keep these tests because
>>>>>> they verify a tear-down path in the code.
>>>>>
>>>>> Ok.
>>>>>
>>>>>>
>>>>>> After your changes do these get reported as failures? If so we
>>>>>> need to account for the above in the calculations.
>>>>>
>>>>> Yes, cork-related tests are reported as failures because of RX thread
>>>>> timeout.
>>>>>
>>>>> So with your above description, I think we need to differentiate cork
>>>>> tests with partial data and full data. In the partial data tests we can
>>>>> have something like a "timeout_expected" flag. Any other way to fix it?
>>>>>
>>>>
>>>> Adding a flag seems reasonable to me. Let's do this for now. Also I
>>>> plan to add more negative tests so we can either use the same
>>>> flag or a new one for those cases as well.
>>>>
>>>
>>> John,
>>> I worked on this for some time and noticed that the RX timeout of
>>> tests with the cork parameter depends on various parameters. So we
>>> cannot set a flag the way the 'drop_expected' flag is set before
>>> executing the test.
>>>
>>> So I decided to write a function which judges all parameters before
>>> each test and decides whether a test with the cork parameter will
>>> time out or not. Then the conditions in the function became
>>> complicated. For example some tests fail if opt->rate < 17 (with
>>> some other conditions). Here 17 is related to FRAGS_PER_SKB.
>>> Consider the following two examples.
>> I'm sorry. Correction: s/FRAGS_PER_SKB/MAX_SKB_FRAGS/
>>
>>>
>>> ./test_sockmap --cgroup /mnt/cgroup2 -r 16 -i 1 -l 30 -t sendpage
>>> --txmsg --txmsg_cork 1024   # RX timeout occurs
>>>
>>> ./test_sockmap --cgroup /mnt/cgroup2 -r 17 -i 1 -l 30 -t sendpage
>>> --txmsg --txmsg_cork 1024   # Success!
>>>
> 
> Ah yes this hits the buffer limit and flushes the queue. The kernel
> side doesn't know how to merge those specific sendpage requests so
> it gives each request its own buffer and when the limit is reached
> we flush it.
> 
>>> Do we need to keep such tests? If yes, then I will continue with
>>> adding such conditions in the function.
>>>
> 
> Yes, these tests are needed because they are testing the edge cases.
> These are probably the most important tests because my normal usage
> will catch any issues in the "good" cases; it's these types of things
> that can go unnoticed (at least for a short while) if we don't have
> specific tests for them.

I tried, but it is difficult to come up with the right set of conditions
that lead to test failure.

-Prashant
> 
> Thanks for doing this.
> John
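
For illustration only, the two sendpage examples quoted above can be
reproduced with a toy model. This is a sketch under stated assumptions, not
the kernel's logic and not code from test_sockmap: it assumes each sendpage
call occupies one fragment slot of iov_length bytes, and that pending data is
flushed either when it reaches the cork size or when the slot limit is hit.
All names (SKETCH_MAX_FRAGS, sketch_corked_leftover, sketch_timeout_expected,
sketch_test_passed) are hypothetical; 17 is what MAX_SKB_FRAGS evaluates to
with 4 KiB pages.

```c
#include <stdbool.h>

/* Assumption: stands in for the kernel's MAX_SKB_FRAGS (17 with 4 KiB pages). */
#define SKETCH_MAX_FRAGS 17

/*
 * Toy model, not kernel code: every sendpage call adds one fragment of
 * iov_length bytes. Pending data is flushed when corking is off, when it
 * reaches `cork` bytes, or when the fragment limit is hit.
 * Returns the number of bytes still corked after `rate` calls.
 */
static long sketch_corked_leftover(int rate, int iov_length, int cork)
{
	long pending = 0;
	int frags = 0;

	for (int i = 0; i < rate; i++) {
		pending += iov_length;
		frags++;
		if (cork == 0 || pending >= cork || frags == SKETCH_MAX_FRAGS) {
			pending = 0;
			frags = 0;
		}
	}
	return pending;
}

/* In this model the RX side waits forever if any bytes stay corked. */
static bool sketch_timeout_expected(int rate, int iov_length, int cork)
{
	return sketch_corked_leftover(rate, iov_length, cork) > 0;
}

/* A test passes when the observed RX timeout matches the expectation. */
static bool sketch_test_passed(bool rx_timed_out, bool timeout_expected)
{
	return rx_timed_out == timeout_expected;
}
```

With -r 16 -l 30 and a cork of 1024 the model leaves 480 bytes corked, so a
timeout is expected; with -r 17 the 17th fragment triggers the flush and
nothing is left. Real behaviour depends on more conditions than this, which
is why the full set is hard to enumerate.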