netdev - Re: [iproute PATCH] ip-route: Propagate errors from parse_one

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <bc845fc9-d7e7-11d2-c821-21851183d326@bouttier.eu>
Date:   Thu, 8 Feb 2018 14:26:05 +0100
From:   Élie Bouttier <elie@...ttier.eu>
To:     Stephen Hemminger <stephen@...workplumber.org>,
        Phil Sutter <phil@....cc>
Cc:     netdev@...r.kernel.org
Subject: Re: [iproute PATCH] ip-route: Propagate errors from parse_one_nh()

On 24/01/2018 16:44, Stephen Hemminger wrote:
> On Wed, 24 Jan 2018 10:19:24 +0100
> Phil Sutter <phil@....cc> wrote:
> 
>> Hi Stephen,
>>
>> On Tue, Jan 23, 2018 at 02:44:42PM -0800, Stephen Hemminger wrote:
>>> On Tue, 23 Jan 2018 17:40:47 +0100
>>> Phil Sutter <phil@....cc> wrote:
>>>   
>>>> The following command segfaults if enp0s31f6 does not exist:
>>>>
>>>> | # ip -6 route add default proto ra metric 20100 \
>>>> | 	nexthop via fe80:52:0:2040::1fc dev enp0s31f6 weight 1 \
>>>> | 	nexthop via fe80:52:0:2040::1fe dev enp0s31f6 weight 1
>>>>
>>>> Since the non-zero return code from parse_one_nh() is ignored,
>>>> parse_nexthops() continues iterating over the the same fields in argv
>>>> until buffer space is exhausted and eventually accesses unallocated
>>>> memory.
>>>>
>>>> Fix this by aborting on error in parse_nexthops() and make
>>>> iproute_modify() fail if parse_nexthops() did.
>>>>
>>>> Reported-by: Lennart Poettering <lpoetter@...hat.com>
>>>> Fixes: 2f406f2d0b4ef ("ip route: replace exits with returns")
>>>> Signed-off-by: Phil Sutter <phil@....cc>
>>>> ---
>>>>  ip/iproute.c | 7 ++++---
>>>>  1 file changed, 4 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/ip/iproute.c b/ip/iproute.c
>>>> index bf886fda9d761..d7accf57ac8d1 100644
>>>> --- a/ip/iproute.c
>>>> +++ b/ip/iproute.c
>>>> @@ -871,7 +871,8 @@ static int parse_nexthops(struct nlmsghdr *n, struct rtmsg *r,
>>>>  		memset(rtnh, 0, sizeof(*rtnh));
>>>>  		rtnh->rtnh_len = sizeof(*rtnh);
>>>>  		rta->rta_len += rtnh->rtnh_len;
>>>> -		parse_one_nh(n, r, rta, rtnh, &argc, &argv);
>>>> +		if (parse_one_nh(n, r, rta, rtnh, &argc, &argv) < 0)
>>>> +			return -1;
>>>>  		rtnh = RTNH_NEXT(rtnh);
>>>>  	}
>>>>  
>>>> @@ -1318,8 +1319,8 @@ static int iproute_modify(int cmd, unsigned int flags, int argc, char **argv)
>>>>  		addattr_l(&req.n, sizeof(req), RTA_METRICS, RTA_DATA(mxrta), RTA_PAYLOAD(mxrta));
>>>>  	}
>>>>  
>>>> -	if (nhs_ok)
>>>> -		parse_nexthops(&req.n, &req.r, argc, argv);
>>>> +	if (nhs_ok && parse_nexthops(&req.n, &req.r, argc, argv) < 0)
>>>> +		return -1;
>>>>  
>>>>  	if (req.r.rtm_family == AF_UNSPEC)
>>>>  		req.r.rtm_family = AF_INET;  
>>>
>>>
>>> The real issue is that handling of invalid device is different than all the other
>>> possible semantic errors.
>>>
>>> My recommendations are:
>>> 	* change bad device to use invarg() which does exit
>>> 	* make functions that only return 0 void including
>>> 		parse_one_nh
>>> 		lwt_parse_encap
>>> 		get_addr
>>>
>>> Also, it looks like read_family converts any address family it doesn't know about to unspec
>>> that is stupid behavior as well.
>>>
>>> The original commit 2f406f2d0b4ef ("ip route: replace exits with returns")
>>> looks like well intentioned but suspect. Most of the errors in ip route
>>> indicate real issues where continuing is not a good plan.  
>>
>> You're right, the use of invarg() for any other error effectively
>> prevents what said commit tried to achieve, so my fix is pretty
>> pointless in that regard. Yet I wonder why we still have 'ip -batch
>> -force' given that it's not useful. Maybe Élie is able to provide some
>> details about the use-case said commit tried to fix?
>>
>> Meanwhile I'll prepare some patches to address the shortcomings you
>> mentioned above.
> 
> The use case for batch (and force) is that there may be a large set of routes
> or qdisc operations where it is ok for some of them to fail because of responses
> from the kernel failing.  I don't think batch should ever just continue if handed
> invalid syntax for device or address. There are some borderline cases, for example
> if a tunnel device could not be created and later steps depend on that name.
> 
> Agree, lets get some real data on why the original patch was done.

If I remember correctly, I made this commit for a batch of "route get"
and not stop on inexistent routes. But my patch was not limited to this
use case and I tried to fix others potential situations where we should
not stop. The change I made in parse_one_nh is not directly related to
my use care, sorry for the introduced regression, I should have been
more careful.

Ihmo, if a tunnel device could not be created, later steps depending on
it will fail too, but we potentially still want subsequent operations
(for instance the creation of a second tunnel) to be done. I you don't
want it, don't use -force or make two batch files. From this point of
view, Phil patch is better than invarg() but I'm fine with invarg() too.


Thank you all and sorry again,

Élie