Message-ID: <4e3c9f30-d43c-54b1-2796-86f38d316ef3@paneda.se>
Date:   Wed, 25 Nov 2020 18:12:34 +0100
From:   Thomas Karlsson <thomas.karlsson@...eda.se>
To:     Jakub Kicinski <kuba@...nel.org>
CC:     "davem@...emloft.net" <davem@...emloft.net>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>
Subject: Re: Hardcoded multicast queue length in macvlan.c driver causes poor
 multicast receive performance

On 2020-11-25 17:58, Jakub Kicinski wrote:
> On Wed, 25 Nov 2020 13:51:41 +0100 Thomas Karlsson wrote:
>> On 2020-11-23 at 23:30, Jakub Kicinski wrote:
>>> On Mon, 23 Nov 2020 14:22:31 +0000 Thomas Karlsson wrote:  
>>>> Hello,
>>>>
>>>> There is special queue handling in macvlan.c for broadcast and
>>>> multicast packets, with a queue length that was arbitrarily set to
>>>> 1000 in commit 07d92d5cc977a7fe1e683e1d4a6f723f7f2778cb . While this
>>>> is probably sufficient for most use cases, it is insufficient to
>>>> support high packet rates. I currently have a setup with 144 000
>>>> multicast packets incoming per second (144 different live audio RTP
>>>> streams) and suffer very frequent packet loss. With unicast this is
>>>> not an issue, and on top of the 144 kpps I can load the macvlan
>>>> interface with another 450 Mbit/s using iperf.
>>>>
>>>> In order to verify that the queue is the problem I edited the define
>>>> to 100000 and recompiled the kernel module. After replacing it with
>>>> rmmod/insmod I get 0 packet loss (measured over 2 days, where before
>>>> I had losses every other second or so) and can also load an
>>>> additional 450 Mbit/s of multicast traffic using iperf without
>>>> losses. So there is basically no difference between unicast and
>>>> multicast when it comes to lost packets on my machine.
>>>>
>>>> I think it would be best if this queue length were configurable
>>>> somehow: either an option when creating the macvlan (like how
>>>> bridge/passthrough/etc. are set) or at least when loading the module
>>>> (for instance via a config in /etc/modprobe.d). One size does not
>>>> fit all in this situation.
>>>
>>> The former please. You can add a netlink attribute, should be
>>> reasonably straightforward. The other macvlan attrs are defined
>>> under "MACVLAN section" in if_link.h.
>>>   
>>
>> I did some work towards a patch using the first option,
>> by adding a netlink attribute in if_link.h as suggested.
>> I agree that this was reasonably straightforward, until userspace.
>>
>> In order to use/test my new parameter I need to update the iproute2
>> package, as far as I understand. But then, since I use macvlan with
>> docker, I also need to update the docker macvlan driver to send this
>> new option to the kernel module.
> 
> I wish I got a cookie every time someone said they can't do the right
> thing because they'd have to update $container_system 😔

lol :)

> 
>> For this reason I would like to know if you would consider
>> merging a patch using the module_param(...) variant instead?
>>
>> I would argue that this still makes the situation better
>> and resolves the packet-loss issue, although not necessarily
>> in an optimal way. However, the upside of being able to specify the
>> parameter per macvlan interface instead of globally is not that big
>> in this situation. Normally you don't use that much multicast anyway,
>> so it's a parameter that will only be touched by a very small user
>> base that can understand and handle the implications of such a
>> global setting.
> 
> How about implementing .changelink in macvlan? That way you could
> modify the macvlan device independently of Docker.
> 
> Make sure you only accept changes to the bc queue len if that's the
> only one you act on.
> 

Hmm, I see. You mean that docker can create the interface as usual and
then I can modify it afterwards? That might be a workaround (I put a
rough example of what I imagine that would look like from userspace at
the end of this mail). But I just submitted a patch (like seconds before
your message) with the module_param() option, and I think it turned out
very clean, both in how little code needed to be changed and in how
simple it is to set the option in the target environment.
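
For reference, the module_param() variant is roughly this shape (just a
sketch from my local tree; the exact define name, parameter name,
default and permissions may differ from the patch as posted):

    /* drivers/net/macvlan.c */
    #include <linux/module.h>

    /* previously a hardcoded limit along the lines of:
     *   #define MACVLAN_BC_QUEUE_LEN 1000
     * replaced by a module parameter with the same default */
    static unsigned int bc_queue_len = 1000;
    module_param(bc_queue_len, uint, 0444);
    MODULE_PARM_DESC(bc_queue_len,
                     "Maximum broadcast/multicast queue length (default 1000)");

    /* the broadcast enqueue path then compares
     * skb_queue_len(&port->bc_queue) against bc_queue_len instead of
     * the old define */

In the target environment it is then just a one-liner in
/etc/modprobe.d, e.g. "options macvlan bc_queue_len 100000" (again
using the parameter name from my sketch).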

This is my first time ever attempting a contribution to the kernel, so
I'm quite happy to keep it simple like that too :)
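
(And if the per-interface netlink attribute does happen at some point,
I guess your workaround would boil down to letting docker create the
interface as usual and then running something like
"ip link set dev macvlan0 type macvlan bcqueuelen 100000" afterwards;
hypothetical option name of course, since neither the attribute nor the
iproute2 support exists yet.)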
