netdev - Re: [net-next v2] tcp: Explicitly mark reserved field in tcp_zerocopy

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Date:   Tue, 9 Feb 2021 21:35:20 -0700
From:   David Ahern <dsahern@...il.com>
To:     Arjun Roy <arjunroy@...gle.com>, Jakub Kicinski <kuba@...nel.org>
Cc:     Leon Romanovsky <leon@...nel.org>,
        Arjun Roy <arjunroy.kdev@...il.com>,
        David Miller <davem@...emloft.net>,
        netdev <netdev@...r.kernel.org>,
        Eric Dumazet <edumazet@...gle.com>,
        Soheil Hassas Yeganeh <soheil@...gle.com>
Subject: Re: [net-next v2] tcp: Explicitly mark reserved field in
 tcp_zerocopy_receive args.

On 2/9/21 4:46 PM, Arjun Roy wrote:
> On Tue, Feb 9, 2021 at 8:59 AM Jakub Kicinski <kuba@...nel.org> wrote:
>>
>> On Mon, 8 Feb 2021 20:20:29 -0700 David Ahern wrote:
>>> On 2/8/21 7:53 PM, Jakub Kicinski wrote:
>>>> On Mon, 8 Feb 2021 19:24:05 -0700 David Ahern wrote:
>>>>> That would be the case for new userspace on old kernel. Extending the
>>>>> check to the end of the struct would guarantee new userspace can not ask
>>>>> for something that the running kernel does not understand.
>>>>
>>>> Indeed, so we're agreeing that check_zeroed_user() is needed before
>>>> original optlen from user space gets truncated?
>>>
>>> I thought so, but maybe not. To think through this ...
>>>
>>> If current kernel understands a struct of size N, it can only copy that
>>> amount from user to kernel. Anything beyond is ignored in these
>>> multiplexed uAPIs, and that is where the new userspace on old kernel falls.
>>>
>>> Known value checks can only be done up to size N. In this case, the
>>> reserved field is at the end of the known struct size, so checking just
>>> the field is fine. Going beyond the reserved field has implications for
>>> extensions to the API which should be handled when those extensions are
>>> added.
>>
>> Let me try one last time.
>>
>> There is no check in the kernels that len <= N. User can pass any
>> length _already_. check_zeroed_user() forces the values beyond the
>> structure length to be known (0) rather than anything. It can only
>> avoid breakages in the future.
>>
>>> So, in short I think the "if (zc.reserved)" is correct as Leon noted.
>>
>> If it's correct to check some arbitrary part of the buffer is zeroed
>> it should be correct to check the entire tail is zeroed.
> 
> So, coming back to the thread, I think the following appears to be the
> current thoughts:
> 
> 1. It is requested that, on the kernel as it stands today, fields
> beyond zc.msg_flags (including zc.reserved, the only such field as of
> this patch) are zero'd out. So a new userspace asking to do specific
> things would fail on this old kernel with EINVAL. Old userspace would
> work on old or new kernels. New of course works on new kernels.
> 2. If it's correct to check some arbitrary field (zc.reserved) to be
> 0, then it should be fine to check this for all future fields >=
> reserved in the struct. So some advanced userspace down the line
> doesn't get confused.
> 
> Strictly speaking, I'm not convinced this is necessary - eg. 64 bytes
> struct right now, suppose userspace of the future gives us 96 bytes of
> which the last 32 are non-zero for some feature or the other. We, in
> the here and now kernel, truncate that length to 64 (as in we only
> copy to kernel those first 64 bytes) and set the returned length to
> 64. The understanding being, any (future, past or present) userspace
> consults the output value; and considers anything byte >= the returned
> len to be untouched by the kernel executing the call (ie. garbage,
> unacted upon).
> 
> So, how would this work for old+new userspace on old+new kernel?
> 
> A) old+old, new+new: sizes match, no issue
> B) new kernel, old userspace: That's not an issue. We have the
> switch(len) statement for that.
> C) old kernel, new userspace: that's the 96 vs. 64 B example above -
> new userspace would see that the kernel only operated on 64 B and
> treat the last 32 B as garbage/unacted on.
> 
> In this case, we would not give EINVAL on case C, as we would if we
> returned EINVAL on a check_zeroed_user() case for fields past
> zc.reserved. We'd do a zerocopy operating on just the features we know
> about, and communicate to the user that we only acted on features up
> until this byte offset.
> 
> Now, given this is the case, we still have the padding confusion with
> zc.reserved and the current struct size, so we have to force it to 0
> as we are doing. But I think we don't need to go beyond this so far.
> 
> Thus, my personal preference is to not have the check_zeroed_user()
> check. But if the consensus demands it, then it's an easy enough fix.
> What are your thoughts?
> 

bpf uses check_zeroed_user to make sure extensions to its structs are
compatible, so yes, this is required.

Also, you need to address legitimate msg_flags as I mentioned in another
response.