Message-ID: <0c6fc173-45c4-463f-bc0e-9fed8c3efc02@bytedance.com>
Date: Tue, 9 Apr 2024 16:57:23 -0700
From: Zijian Zhang <zijianzhang@...edance.com>
To: Eric Dumazet <edumazet@...gle.com>,
 Willem de Bruijn <willemdebruijn.kernel@...il.com>
Cc: netdev@...r.kernel.org, davem@...emloft.net, kuba@...nel.org,
 cong.wang@...edance.com, xiaochun.lu@...edance.com
Subject: Re: [External] Re: [PATCH net-next 2/3] selftests: fix OOM problem in
 msg_zerocopy selftest

Firstly, thanks for your time and quick reply!

On 4/9/24 2:30 PM, Eric Dumazet wrote:
> On Tue, Apr 9, 2024 at 11:25 PM Willem de Bruijn
> <willemdebruijn.kernel@...il.com> wrote:
>>
>> zijianzhang@ wrote:
>>> From: Zijian Zhang <zijianzhang@...edance.com>
>>>
>>> In selftests/net/msg_zerocopy.c, it has a while loop keeps calling sendmsg
>>> on a socket, and it will recv the completion notifications when the socket
>>> is not writable. Typically, it will start the receiving process after
>>> around 30+ sendmsgs.
>>>
>>> However, because of the commit dfa2f0483360
>>> ("tcp: get rid of sysctl_tcp_adv_win_scale"), the sender is always writable
>>> and does not get any chance to run recv notifications. The selftest always
>>> exits with OUT_OF_MEMORY because the memory used by opt_skb exceeds
>>> the core.sysctl_optmem_max. We introduce "cfg_notification_limit" to
>>> force sender to receive notifications after some number of sendmsgs.
>>
>> No need for a new option. Existing test automation will not enable
>> that.
>>
>> I have not observed this behavior in tests (so I wonder what is
>> different about the setups). But it is fine to unconditionally force
>> a call to do_recv_completions every few sends.
> 
> Maybe their kernel does not have yet :
> 
> commit 4944566706b27918ca15eda913889db296792415    net: increase
> optmem_max default value
> 
> ???
> 

I ran the selftest in a QEMU VM with a v6.8-rc3 kernel built from the
Linux repo. It should already include commit 4944566706b2 ("net: increase
optmem_max default value").

"
qemu-system-x86_64 \
     -enable-kvm \
     -nographic \
     -drive file=$HOME/guest.qcow2,if=virtio \
     -device vfio-pci,host=3b:00.2,multifunction=on \
     -m 32G \
     -smp 16 \
     -kernel $HOME/linux-master/arch/x86/boot/bzImage \
     -append 'root=/dev/vda1 console=ttyS0 earlyprintk=ttyS0 net.ifnames=0 biosdevname=0'
"

I ran it again just now with a clean image, and there was no problem.
Unfortunately, I did not save the image I tested with before. If I can
restore it, I will send more information about my network configuration.

So this is not a bug, but a problem caused by my custom configuration.
Sorry about that. I will drop this patch in the next version.
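
For reference, the approach suggested above (unconditionally draining
completions every few sends) can be illustrated with a toy model. This is
only a sketch under stated assumptions, not the selftest code: the function
name simulate_sends, the fixed per-send cost, and the idea that one drain
releases everything charged so far are all hypothetical simplifications.

```c
#include <stddef.h>

/*
 * Toy model (hypothetical, not kernel or selftest code).
 * Each zerocopy send charges `cost` bytes against an optmem-like budget.
 * If `limit` is non-zero, the sender drains completions after every
 * `limit` sends, which is assumed to release everything charged so far.
 * Returns how many of `total` sends succeed before the budget is hit.
 */
static unsigned int simulate_sends(unsigned int total, unsigned int limit,
				   size_t cost, size_t optmem_max)
{
	size_t charged = 0;
	unsigned int pending = 0;
	unsigned int sent;

	for (sent = 0; sent < total; sent++) {
		if (charged + cost > optmem_max)
			return sent;	/* models sendmsg() failing with ENOBUFS */

		charged += cost;
		pending++;

		if (limit && pending >= limit) {
			charged = 0;	/* models reading the error queue */
			pending = 0;
		}
	}
	return sent;
}
```

With a budget of 300 and a cost of 10, simulate_sends(100, 0, 10, 300)
stops after 30 sends because nothing is ever drained, while
simulate_sends(100, 8, 10, 300) completes all 100.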

>>
>>> Plus,
>>> in the selftest, we need to update skb_orphan_frags_rx to be the same as
>>> skb_orphan_frags.
>>
>> To be able to test over loopback, I suppose?
>>

Yes.

>>> In this case, for some reason, notifications do not
>>> come in order now. We introduce "cfg_notification_order_check" to
>>> possibly ignore the checking for order.
>>
>> Were you testing UDP?
>>
>> I don't think this is needed. I wonder what you were doing to see
>> enough of these events to want to suppress the log output.

I tested again on both TCP and UDP just now, and it happens with both.
For the TCP test, the large number of printfs delays sending and thus
reduces throughput.

ipv4 tcp -z -t 1
gap: 277..277 does not append to 276
gap: 276..276 does not append to 278
gap: 278..1112 does not append to 277
gap: 1114..1114 does not append to 1113
gap: 1113..1113 does not append to 1115
gap: 1115..2330 does not append to 1114
gap: 2332..2332 does not append to 2331
gap: 2331..2331 does not append to 2333
gap: 2333..2559 does not append to 2332
gap: 2562..2562 does not append to 2560
...
gap: 25841..25841 does not append to 25843
gap: 25843..25997 does not append to 25842

...

ipv6 udp -z -t 1
gap: 11632..11687 does not append to 11625
gap: 11625..11631 does not append to 11688
gap: 11688..54662 does not append to 11632
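
The log lines above come in groups of three, which is what a single swap of
two adjacent notifications would produce. A minimal sketch of why, assuming
the check compares each notification's lower bound against one
next-expected ID and then moves that ID to just past the range (the struct
and function names here are illustrative, not taken from the selftest):

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* One completion notification covering IDs lo..hi inclusive. */
struct zc_range {
	uint32_t lo;
	uint32_t hi;
};

/*
 * Replay `n` notifications against a single next-expected cursor and
 * return how many "gap" messages are printed. Assumed to mirror the
 * order check under discussion; hypothetical helper for illustration.
 */
static unsigned int count_gaps(const struct zc_range *r, size_t n,
			       uint32_t next)
{
	unsigned int gaps = 0;
	size_t i;

	assert(r != NULL || n == 0);

	for (i = 0; i < n; i++) {
		if (r[i].lo != next) {
			fprintf(stderr,
				"gap: %" PRIu32 "..%" PRIu32
				" does not append to %" PRIu32 "\n",
				r[i].lo, r[i].hi, next);
			gaps++;
		}
		/* The cursor always follows the most recent range. */
		next = r[i].hi + 1;
	}
	return gaps;
}
```

Replaying the first three TCP ranges (277..277, 276..276, 278..1112) with
the cursor starting at 276 reports three gaps, although only two
notifications were swapped. The same ranges in ascending order report none.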
