Message-ID: <929f7f20-d266-0cc6-fbba-4a77f5c1e306@mellanox.com>
Date:   Wed, 10 Jul 2019 09:36:41 +0000
From:   Maxim Mikityanskiy <maximmi@...lanox.com>
To:     Stefan Lippers-Hollmann <s.l-h@....de>
CC:     David Miller <davem@...emloft.net>,
        "jakub.kicinski@...ronome.com" <jakub.kicinski@...ronome.com>,
        "kuznet@....inr.ac.ru" <kuznet@....inr.ac.ru>,
        "yoshfuji@...ux-ipv6.org" <yoshfuji@...ux-ipv6.org>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        Leon Romanovsky <leonro@...lanox.com>
Subject: Re: [PATCH net v2] Validate required parameters in
 inet6_validate_link_af

On 2019-07-09 04:11, Stefan Lippers-Hollmann wrote:
> Hi
> 
> On 2019-05-22, David Miller wrote:
>> From: Maxim Mikityanskiy <maximmi@...lanox.com>
>> Date: Tue, 21 May 2019 06:40:04 +0000
>>
>>> inet6_set_link_af requires that at least one of IFLA_INET6_TOKEN or
>>> IFLA_INET6_ADDR_GEN_MODE is passed. If none of them is passed, it
>>> returns -EINVAL, which may cause do_setlink() to fail in the middle of
>>> processing other commands and give the following warning message:
>>>
>>>    A link change request failed with some changes committed already.
>>>    Interface eth0 may have been left with an inconsistent configuration,
>>>    please check.
>>>
>>> Check the presence of at least one of them in inet6_validate_link_af to
>>> detect invalid parameters at an early stage, before do_setlink does
>>> anything. Also validate the address generation mode at an early stage.
>>>
>>> Signed-off-by: Maxim Mikityanskiy <maximmi@...lanox.com>
>>
>> Applied, thank you.
> 
> After updating from kernel 5.1.16 to 5.2, I noticed that my
> systemd-networkd (241-5, Debian/unstable) managed bridges didn't
> come up and needed a manual "ip link set dev br-lan up" to get
> configured. Bisecting between v5.1 and v5.2 pointed to this
> patch and reverting just this change from v5.2 fixes the issue
> for me again.

This patch changes behavior only for invalid input. If userspace sends
a valid message over netlink, nothing changes with my patch. For some
subset of invalid inputs, however, the behavior used to be undefined:
the kernel applied a partial configuration before it noticed that the
input was invalid and failed. With my patch, this subset of invalid
inputs is handled properly: userspace gets an immediate error, and no
configuration is changed. So my patch is actually a bug fix.
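
To illustrate the difference between failing late and validating first,
here is a toy model in Python. It is only a sketch, not kernel code: the
function names, the request layout and the order in which commands are
applied are assumptions made up for this example. Only the two attribute
names come from the kernel.

```python
# Toy model only: setlink_old/setlink_new and the dict-based request
# layout are hypothetical and do not mirror the real do_setlink().

class InvalidRequest(Exception):
    """Stands in for the -EINVAL returned to userspace."""


def validate_inet6(attrs):
    # At least one of the two attributes must be present.
    if ("IFLA_INET6_TOKEN" not in attrs
            and "IFLA_INET6_ADDR_GEN_MODE" not in attrs):
        raise InvalidRequest("neither token nor address generation mode given")


def setlink_old(link, request):
    # Fail-late order: earlier commands are applied before the inet6
    # part is checked, so an error leaves a partial configuration.
    if "up" in request:
        link["up"] = request["up"]
    if "inet6" in request:
        validate_inet6(request["inet6"])
        link["inet6"] = dict(request["inet6"])


def setlink_new(link, request):
    # Validate-first order: the whole request is checked before
    # anything is applied, so an error leaves the link untouched.
    if "inet6" in request:
        validate_inet6(request["inet6"])
    if "up" in request:
        link["up"] = request["up"]
    if "inet6" in request:
        link["inet6"] = dict(request["inet6"])
```

With a request whose inet6 part carries neither attribute, setlink_old()
raises only after it has already set link["up"], while setlink_new()
raises and leaves the link unchanged. A valid request gives the same
result with both functions.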

Unfortunately, commit [1] introduced a regression in systemd: it
started sending invalid input to the kernel, apparently ignored the
error returned, and relied on the undefined behavior (the partial
configuration update that took place before my patch).

Later on, commit [2] was introduced, and it should fix that regression 
in systemd.

What you experience may be explained by this bug in systemd:

1. systemd broke, but the issue remained unnoticed, because some part of 
the configuration was still applied.

2. The bug in systemd was eventually fixed, but apparently you haven't 
updated to the version that has this fix yet.

3. My fix for the kernel was merged.

4. Since you are using a systemd without the fix, the issue now has more
severe consequences, because no configuration is applied after an
invalid request from systemd.

I haven't tried to reproduce your configuration, but I guess the above
is what happened. I suggest updating systemd to a version that has
commit [2] (or building it from master if no newer version has been
released since then); I hope that solves your issue. Otherwise, let me
know.

[1]: https://github.com/systemd/systemd/commit/0e2fdb83bb5e22047e0c7cc058b415d0e93f02cf

[2]: https://github.com/systemd/systemd/commit/4d48747c43922250a62cf6e0ad9ee364665ef82d

> $ git bisect start
> # good: [e93c9c99a629c61837d5a7fc2120cd2b6c70dbdd] Linux 5.1
> git bisect good e93c9c99a629c61837d5a7fc2120cd2b6c70dbdd
> # bad: [46713c3d2f8da5e3d8ddd2249bcb1d9974fb5d28] Merge tag 'for-linus-20190706' of git://git.kernel.dk/linux-block
> git bisect bad 46713c3d2f8da5e3d8ddd2249bcb1d9974fb5d28
> # good: [a2d635decbfa9c1e4ae15cb05b68b2559f7f827c] Merge tag 'drm-next-2019-05-09' of git://anongit.freedesktop.org/drm/drm
> git bisect good a2d635decbfa9c1e4ae15cb05b68b2559f7f827c
> # good: [22c58fd70ca48a29505922b1563826593b08cc00] Merge tag 'armsoc-soc' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
> git bisect good 22c58fd70ca48a29505922b1563826593b08cc00
> # good: [61939b12dc24d0ac958020f261046c35a16e0c48] block: print offending values when cloned rq limits are exceeded
> git bisect good 61939b12dc24d0ac958020f261046c35a16e0c48
> # bad: [3510955b327176fd4cbab5baa75b449f077722a2] mm/list_lru.c: fix memory leak in __memcg_init_list_lru_node
> git bisect bad 3510955b327176fd4cbab5baa75b449f077722a2
> # bad: [30d1d92a888d03681b927c76a35181b4eed7071f] Merge tag 'nds32-for-linux-5.2-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/greentime/linux
> git bisect bad 30d1d92a888d03681b927c76a35181b4eed7071f
> # bad: [dbde71df810c62e72e2aa6d88a0686a6092956cd] Merge tag 'tty-5.2-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
> git bisect bad dbde71df810c62e72e2aa6d88a0686a6092956cd
> # bad: [100f6d8e09905c59be45b6316f8f369c0be1b2d8] net: correct zerocopy refcnt with udp MSG_MORE
> git bisect bad 100f6d8e09905c59be45b6316f8f369c0be1b2d8
> # bad: [4ca6dee5220fe2377bf12b354ef85978425c9ec7] dpaa2-eth: Make constant 64-bit long
> git bisect bad 4ca6dee5220fe2377bf12b354ef85978425c9ec7
> # bad: [b5730061d1056abf317caea823b94d6e12b5b4f6] cxgb4: offload VLAN flows regardless of VLAN ethtype
> git bisect bad b5730061d1056abf317caea823b94d6e12b5b4f6
> # bad: [c1e85c6ce57ef1eb73966152993a341c8123a8ea] net: macb: save/restore the remaining registers and features
> git bisect bad c1e85c6ce57ef1eb73966152993a341c8123a8ea
> # bad: [f42c104f2ec94a9255a835cd4cd1bd76279d4d06] Documentation: add TLS offload documentation
> git bisect bad f42c104f2ec94a9255a835cd4cd1bd76279d4d06
> # bad: [d008b3d2be4b00267e7af5c21269e7af4f65c6e2] mISDN: Fix indenting in dsp_cmx.c
> git bisect bad d008b3d2be4b00267e7af5c21269e7af4f65c6e2
> # bad: [40a1578d631a8ac1cf0ef797c435114107747859] ocelot: Dont allocate another multicast list, use __dev_mc_sync
> git bisect bad 40a1578d631a8ac1cf0ef797c435114107747859
> # bad: [7dc2bccab0ee37ac28096b8fcdc390a679a15841] Validate required parameters in inet6_validate_link_af
> git bisect bad 7dc2bccab0ee37ac28096b8fcdc390a679a15841
> # first bad commit: [7dc2bccab0ee37ac28096b8fcdc390a679a15841] Validate required parameters in inet6_validate_link_af
> 
> While I originally noticed this issue on real hardware (r8169, e1000,
> e1000e, e100, alx) and multiple systems with a slightly complex bridge
> setup, I can reproduce it with a very basic configuration under kvm
> (upon which all the tests below are based):
> 
> $ cat /etc/systemd/network/20-wired.network
> [Match]
> Name=ens4
> 
> [Network]
> DHCP=yes
> 
> (same results with just DHCP=ipv4)
> 
> With the above systemd-networkd configuration, the system comes up
> without network access:
> 
> # ip a
> 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
>      link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
>      inet 127.0.0.1/8 scope host lo
>         valid_lft forever preferred_lft forever
>      inet6 ::1/128 scope host
>         valid_lft forever preferred_lft forever
> 2: ens4: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
>      link/ether 00:16:3e:00:00:00 brd ff:ff:ff:ff:ff:ff
> 
> # networkctl | cat -
> IDX LINK             TYPE               OPERATIONAL SETUP
>    1 lo               loopback           carrier     unmanaged
>    2 ens4             ether              off         configuring
> 
> 2 links listed.
> 
> Manually enabling the interface does help:
> 
> # ip link set dev ens4 up
> 
> # ip a
> 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
>      link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
>      inet 127.0.0.1/8 scope host lo
>         valid_lft forever preferred_lft forever
>      inet6 ::1/128 scope host
>         valid_lft forever preferred_lft forever
> 2: ens4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
>      link/ether 00:16:3e:00:00:00 brd ff:ff:ff:ff:ff:ff
>      inet 172.23.6.0/14 brd 172.23.255.255 scope global dynamic ens4
>         valid_lft 43199sec preferred_lft 43199sec
>      inet6 2003:xxxx:xxxx:xxxx::197/128 scope global tentative dynamic noprefixroute
>         valid_lft 13809sec preferred_lft 1209sec
>      inet6 fdxx:xxxx:xxxx::197/128 scope global tentative noprefixroute
>         valid_lft forever preferred_lft forever
>      inet6 fdxx:xxxx:xxxx:0:216:3eff:fe00:0/64 scope global tentative mngtmpaddr noprefixroute
>         valid_lft forever preferred_lft forever
>      inet6 2003:xxxx:xxxx:xxxx:216:3eff:fe00:0/64 scope global tentative dynamic mngtmpaddr noprefixroute
>         valid_lft 13809sec preferred_lft 1209sec
>      inet6 fe80::216:3eff:fe00:0/64 scope link
>         valid_lft forever preferred_lft forever
> 
> # networkctl | cat -
> IDX LINK             TYPE               OPERATIONAL SETUP
>    1 lo               loopback           carrier     unmanaged
>    2 ens4             ether              routable    configured
> 
> 2 links listed.
> 
> A quick test of upgrading all systemd packages to 242-2 from
> Debian/experimental shows the same issue; Debian 10/ buster (stable)
> is shipping with systemd 241-5.
> 
> DHCPv4 is served by a recent OpenWrt/ master snapshot on ipq8065/ nbg6817
> (ARMv7), using dnsmasq 2.80-13 and odhcpd-ipv6only 2019-05-17-41a74cba-3
> covering DHCPv6 and prefix delegation.
> 
> Attached are xz compressed versions of the kernel configuration (amd64),
> dmesg and journalctl output.
> 
> The Debian/unstable VM was started with qemu-kvm 1:3.1+dfsg-8 on a
> Debian/unstable host running kernel 5.2 with this patch reverted:
> 
> $ QEMU_AUDIO_DRV=pa qemu-system-x86_64 \
> 	-machine accel=kvm:tcg \
> 	-monitor stdio \
> 	-rtc base=localtime \
> 	-cpu qemu64,+vmx \
> 	-smp 3 \
> 	-m 4096 \
> 	-device virtio-gpu-pci \
> 	-device virtio-net-pci,mac=00:16:3E:00:00:00,netdev=tap-br-lan0 \
> 		-netdev tap,ifname=tap-br-lan0,script=no,id=tap-br-lan0 \
> 	-device AC97 \
> 	-drive file=/srv/storage/vm/linux.qcow2.img,if=none,discard=unmap,index=0,media=disk,id=hd0 \
> 		-device virtio-scsi-pci,id=scsi -device scsi-hd,drive=hd0 \
> 	-usb \
> 		-device usb-tablet \
> 		-device usb-ehci,id=ehci \
> 		-device nec-usb-xhci,id=xhci \
> 	-device virtio-rng-pci \
> 	-boot menu=on
> 
> Regards
> 	Stefan Lippers-Hollmann
> 
