Message-ID: <323df302-aa17-df40-846d-3354d4bb126a@mellanox.com>
Date: Thu, 13 Jun 2019 14:13:22 +0000
From: Maxim Mikityanskiy <maximmi@...lanox.com>
To: Jonas Bonn <jonas@...rbonn.se>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>
CC: "David S . Miller" <davem@...emloft.net>,
Alexey Kuznetsov <kuznet@....inr.ac.ru>,
Hideaki YOSHIFUJI <yoshfuji@...ux-ipv6.org>
Subject: Re: [PATCH 1/1] Address regression in inet6_validate_link_af
On 2019-06-13 09:45, Jonas Bonn wrote:
> Hi Max,
>
> On 12/06/2019 12:42, Maxim Mikityanskiy wrote:
>> On 2019-06-11 13:03, Jonas Bonn wrote:
>>> Patch 7dc2bccab0ee37ac28096b8fcdc390a679a15841 introduces a regression
>>> with systemd 241. In that revision, systemd-networkd fails to pass the
>>> required flags early enough. This appears to be addressed in later
>>> versions of systemd, but for users of version 241 where systemd-networkd
>>> nonetheless worked with earlier kernels, the strict check introduced by
>>> the patch causes a regression in behaviour.
>>>
>>> This patch converts the failure to supply the required flags from an
>>> error into a warning.
>> The purpose of my patch was to prevent a partial configuration update on
>> invalid input. -EINVAL was returned both before and after my patch, the
>> difference is that before my patch there was a partial update and a
>> warning.
>>
>> Your patch basically makes mine pointless, because you revert the fix,
>> and now we'll have the same partial update and two warnings.
>
> Unfortunately, yes...
So what you propose is a degradation.
>>
>> One more thing is that after applying your patch on top of mine, the
>> kernel won't return -EINVAL anymore on invalid input. Returning -EINVAL
>> is what happened before my patch, and also after my patch.
>
> Yes, you're right, it would probably be better to revert the entire patch
> because the checks in set_link_af have been dropped on the assumption
> that validate_link_af catches the badness.
We shouldn't introduce workarounds in the kernel for temporary bugs
in old userspace. A regression was introduced in systemd: it started
sending invalid messages, didn't check the return code properly and
relied on undefined behavior. It was then fixed in systemd. If the
kernel had workarounds for every piece of buggy software that ever
existed, it would be a complete mess. The software used the API in the
wrong way, and the expected behavior is failure, so it shouldn't expect
anything else. For me, the trade-off between fixing the kernel behavior
and supporting some old buggy software while keeping UB in the kernel
forever has an obvious resolution.
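
To make the difference concrete, here is a minimal userspace sketch of
the two orderings. It is not the kernel code: struct link_state,
struct link_request and both setlink_* functions are hypothetical names
made up for illustration, and the "bring the link up" step only stands
in for whatever part of the request gets applied before the AF-specific
attributes are looked at. Both variants return -EINVAL on the bad
request; only the second one leaves the link untouched.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of a link; not a kernel structure. */
struct link_state {
	bool up;
	int addr_gen_mode;
};

/* Hypothetical model of a setlink request; not a netlink message. */
struct link_request {
	bool set_up;            /* request asks to bring the link up */
	bool has_token;         /* models IFLA_INET6_TOKEN being present */
	bool has_addr_gen_mode; /* models IFLA_INET6_ADDR_GEN_MODE being present */
	int addr_gen_mode;
};

/* The check under discussion: at least one of the two attributes is required. */
static int validate_af(const struct link_request *req)
{
	if (!req->has_token && !req->has_addr_gen_mode)
		return -EINVAL;
	return 0;
}

/* Check late: part of the request is applied before the error is found. */
int setlink_apply_then_check(struct link_state *st,
			     const struct link_request *req)
{
	int err;

	if (req->set_up)
		st->up = true;          /* already applied ... */

	err = validate_af(req);
	if (err)
		return err;             /* ... and not rolled back: partial update */

	if (req->has_addr_gen_mode)
		st->addr_gen_mode = req->addr_gen_mode;
	return 0;
}

/* Check first: invalid input is rejected before any state changes. */
int setlink_validate_first(struct link_state *st,
			   const struct link_request *req)
{
	int err = validate_af(req);

	if (err)
		return err;             /* state is untouched */

	if (req->set_up)
		st->up = true;
	if (req->has_addr_gen_mode)
		st->addr_gen_mode = req->addr_gen_mode;
	return 0;
}
```

Under these assumptions, a request with set_up set and neither
attribute present fails with -EINVAL in both functions, but after
setlink_apply_then_check() the link is left up, while after
setlink_validate_first() it stays down. Userspace that depends on the
first outcome is depending on the partial update.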
>>
>> Regarding the systemd issue, I don't think we should change the kernel
>> to adapt to bugs in systemd. systemd didn't have this bug from day one,
>> it was a regression introduced in [1]. The kernel has always returned
>> -EINVAL here, but the behavior before my patch was basically a UB, and
>> after the patch it's well-defined. If systemd saw EINVAL and relied on
>> the UB that came with it, that can't be reason enough to break the
>> kernel.
>>
>> Moreover, the bug looks fixed in systemd's master, so what you suggest
>> is to insert a kernel-side workaround for an old version of software
>> when there is a fixed one.
>
> I agree, systemd is buggy here. Probably what happens is:
>
> i) systemd tries to set the link up
> ii) it ends up doing a "partial" modification of the link state;
> critically, though, enough to actually effect the link state changing to UP
> iii) systemd sees the -EINVAL error and decides it probably failed to
> bring up the link
> iv) systemd then gets a notification that the link is up and runs a
> DHCP client on it
>
> I haven't noticed any "partial modification" warnings in the kernel log
> but I wasn't looking for them, either...
>
> With the new behaviour in 5.2, step ii) above results in no "partial
> modification" so the link remains down and systemd is forever unable to
> bring it up.
>
> Anyway, for the record, the error is:
>
> systemd: Could not bring up interface... (invalid parameter)
>
> And the solution is: Linux >= 5.2 requires systemd != v241.
>
> If nobody else notices, that's good enough for me.
>
>>
>> Please correct me if anything I say is wrong.
>
> Nothing wrong, but it's still a regression.
>
> /Jonas
>
>>
>> Thanks,
>> Max
>>
>> [1]:
>> https://github.com/systemd/systemd/commit/0e2fdb83bb5e22047e0c7cc058b415d0e93f02cf
>>
>>
>>> With this, systemd-networkd version 241 once
>>> again is able to bring up the link, albeit not quite as intended and
>>> thereby with a warning in the kernel log.
>>>
>>> CC: Maxim Mikityanskiy <maximmi@...lanox.com>
>>> CC: David S. Miller <davem@...emloft.net>
>>> CC: Alexey Kuznetsov <kuznet@....inr.ac.ru>
>>> CC: Hideaki YOSHIFUJI <yoshfuji@...ux-ipv6.org>
>>> Signed-off-by: Jonas Bonn <jonas@...rbonn.se>
>>> ---
>>> net/ipv6/addrconf.c | 3 ++-
>>> 1 file changed, 2 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
>>> index 081bb517e40d..e2477bf92e12 100644
>>> --- a/net/ipv6/addrconf.c
>>> +++ b/net/ipv6/addrconf.c
>>> @@ -5696,7 +5696,8 @@ static int inet6_validate_link_af(const struct net_device *dev,
>>> return err;
>>> if (!tb[IFLA_INET6_TOKEN] && !tb[IFLA_INET6_ADDR_GEN_MODE])
>>> - return -EINVAL;
>>> + net_warn_ratelimited(
>>> + "required link flag omitted: TOKEN/ADDR_GEN_MODE\n");
>>> if (tb[IFLA_INET6_ADDR_GEN_MODE]) {
>>> u8 mode = nla_get_u8(tb[IFLA_INET6_ADDR_GEN_MODE]);
>>>
>>