Message-ID: <3a38fda6-eeae-8d6c-3eac-112abfc63015@gmail.com>
Date: Wed, 7 Sep 2022 20:58:27 +0300
From: Leonard Crestez <cdleonard@...il.com>
To: Eric Dumazet <edumazet@...gle.com>
Cc: David Ahern <dsahern@...nel.org>,
Dmitry Safonov <0x7f454c46@...il.com>,
Francesco Ruggeri <fruggeri@...sta.com>,
Salam Noureddine <noureddine@...sta.com>,
Philip Paeps <philip@...uble.is>,
Shuah Khan <shuah@...nel.org>,
"David S. Miller" <davem@...emloft.net>,
Herbert Xu <herbert@...dor.apana.org.au>,
Kuniyuki Iwashima <kuniyu@...zon.co.jp>,
Hideaki YOSHIFUJI <yoshfuji@...ux-ipv6.org>,
Jakub Kicinski <kuba@...nel.org>,
Yuchung Cheng <ycheng@...gle.com>,
Mat Martineau <mathew.j.martineau@...ux.intel.com>,
Christoph Paasch <cpaasch@...le.com>,
Ivan Delalande <colona@...sta.com>,
Caowangbao <caowangbao@...wei.com>,
Priyaranjan Jha <priyarjha@...gle.com>,
netdev <netdev@...r.kernel.org>,
"open list:HARDWARE RANDOM NUMBER GENERATOR CORE"
<linux-crypto@...r.kernel.org>,
"open list:KERNEL SELFTEST FRAMEWORK"
<linux-kselftest@...r.kernel.org>,
LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v8 08/26] tcp: authopt: Disable via sysctl by default
On 9/7/22 20:04, Eric Dumazet wrote:
> On Wed, Sep 7, 2022 at 9:53 AM Leonard Crestez <cdleonard@...il.com> wrote:
>> On 9/7/22 02:11, Eric Dumazet wrote:
>>> On Mon, Sep 5, 2022 at 12:06 AM Leonard Crestez <cdleonard@...il.com> wrote:
>>>>
>>>> This is mainly intended to protect against local privilege escalations
>>>> through a rarely used feature so it is deliberately not namespaced.
>>>>
>>>> Enforcement is only at the setsockopt level; this should be enough to
>>>> ensure that the tcp_authopt_needed static key never turns on.
>>>>
>>>> No effort is made to handle disabling when the feature is already in
>>>> use.
>>>>
>>>> Signed-off-by: Leonard Crestez <cdleonard@...il.com>
>>>> ---
>>>> Documentation/networking/ip-sysctl.rst | 6 ++++
>>>> include/net/tcp_authopt.h | 1 +
>>>> net/ipv4/sysctl_net_ipv4.c | 39 ++++++++++++++++++++++++++
>>>> net/ipv4/tcp_authopt.c | 25 +++++++++++++++++
>>>> 4 files changed, 71 insertions(+)
>>>>
>>>> diff --git a/Documentation/networking/ip-sysctl.rst b/Documentation/networking/ip-sysctl.rst
>>>> index a759872a2883..41be0e69d767 100644
>>>> --- a/Documentation/networking/ip-sysctl.rst
>>>> +++ b/Documentation/networking/ip-sysctl.rst
>>>> @@ -1038,10 +1038,16 @@ tcp_challenge_ack_limit - INTEGER
>>>> Note that this per netns rate limit can allow some side channel
>>>> attacks and probably should not be enabled.
>>>> TCP stack implements per TCP socket limits anyway.
>>>> Default: INT_MAX (unlimited)
>>>>
>>>> +tcp_authopt - BOOLEAN
>>>> + Enable the TCP Authentication Option (RFC5925), a replacement for TCP
>>>> + MD5 Signatures (RFC2385).
>>>> +
>>>> + Default: 0
>>>> +
>>
>> ...
>>
>>>> +#ifdef CONFIG_TCP_AUTHOPT
>>>> +static int proc_tcp_authopt(struct ctl_table *ctl,
>>>> + int write, void *buffer, size_t *lenp,
>>>> + loff_t *ppos)
>>>> +{
>>>> + int val = sysctl_tcp_authopt;
>>>
>>> val = READ_ONCE(sysctl_tcp_authopt);
>>>
>>>> + struct ctl_table tmp = {
>>>> + .data = &val,
>>>> + .mode = ctl->mode,
>>>> + .maxlen = sizeof(val),
>>>> + .extra1 = SYSCTL_ZERO,
>>>> + .extra2 = SYSCTL_ONE,
>>>> + };
>>>> + int err;
>>>> +
>>>> + err = proc_dointvec_minmax(&tmp, write, buffer, lenp, ppos);
>>>> + if (err)
>>>> + return err;
>>>> + if (sysctl_tcp_authopt && !val) {
>>>
>>> READ_ONCE(sysctl_tcp_authopt)
>>>
>>> Note that this test would still be racy, because another cpu might
>>> change sysctl_tcp_authopt right after the read.
>>
>> What meaningful races are possible here? This is a variable that changes
>> from 0 to 1 at most once.
>
> Two cpus can issue writes of 0 and 1 values at the same time.
>
> Depending on scheduling writing the 0 can 'win' the race and overwrite
> the value back to 0.
>
> This is in clear violation of the claim you are making (that the
> sysctl can only go once from 0 to 1)

It's not clear why anyone would write 0, except perhaps to ensure that
the feature is still disabled.

But you're right that userspace CAN do that and the kernel CAN misbehave
in this scenario, so it would be better to just make the changes you
suggested.
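
For illustration, the race described above goes away if the handler never
stores 0 at all. The following is only a userspace sketch using C11 atomics,
not the code from this patch: authopt_enabled, authopt_sysctl_write(),
authopt_sysctl_read() and authopt_demo() are hypothetical names, and the
kernel version would use READ_ONCE()/WRITE_ONCE() inside the proc handler
instead.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>

/* Hypothetical userspace stand-in for sysctl_tcp_authopt; starts at 0. */
static atomic_int authopt_enabled;

/*
 * Model of the sysctl write path: 0 -> 1 is allowed, 1 -> 0 is refused.
 * The flag is only ever stored as 1, so a concurrent writer of 0 cannot
 * overwrite a 1 that another thread has just stored.
 */
static int authopt_sysctl_write(int val)
{
	if (val != 0 && val != 1)
		return -EINVAL;
	if (val == 1) {
		atomic_store(&authopt_enabled, 1);
		return 0;
	}
	/* val == 0: succeeds only while the feature is still disabled. */
	return atomic_load(&authopt_enabled) ? -EINVAL : 0;
}

static int authopt_sysctl_read(void)
{
	return atomic_load(&authopt_enabled);
}

/* Single-threaded walk through the allowed and refused transitions. */
static void authopt_demo(void)
{
	assert(authopt_sysctl_write(0) == 0);       /* still disabled: no-op */
	assert(authopt_sysctl_write(1) == 0);       /* enable */
	assert(authopt_sysctl_write(0) == -EINVAL); /* disable is refused */
	assert(authopt_sysctl_read() == 1);
}
```

Because writing 0 is a pure check and never a store, the "goes from 0 to 1
at most once" property holds regardless of how the two writers are scheduled.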
>> In theory if two processes attempt to assign "non-zero" at the same time
>> then one will "win" and the other will get an error but races between
>> userspace writing different values are possible for any sysctl. The
>> solution seems to be "write sysctls from a single place".
>>
>> All the checks are in sockopts - in theory if the sysctl is written on
>> one CPU then a sockopt can still fail on another CPU until caches are
>> flushed. Is this what you're worried about?
>>
>> In theory doing READ_ONCE might incur a slight penalty on sockopt but
>> not noticeable.
>
> Not at all. There is _no_ penalty using READ_ONCE(). Unless it is done
> in a loop and this prevents some compiler optimization.
>
> Please use WRITE_ONCE() and READ_ONCE() for all sysctl values used in
> TCP stack (and elsewhere)
>
> See all the silly patches we had recently.

OK
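
As a rough userspace illustration of the requested pattern (one marked load
on the sockopt side, one marked store on the sysctl side): the macros below
are simplified approximations of the kernel's READ_ONCE()/WRITE_ONCE() and
rely on the GCC/Clang __typeof__ extension. authopt_check_enabled(),
authopt_gate_demo() and the -EPERM return value are assumptions made for this
sketch, not taken from the patch.

```c
#include <assert.h>
#include <errno.h>

/* Simplified stand-ins for the kernel macros (volatile access only). */
#define READ_ONCE(x)       (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

/* Hypothetical userspace copy of the flag. */
static int sysctl_tcp_authopt;

/*
 * Model of the setsockopt-side gate: load the flag exactly once, then
 * decide from the local copy. The error code is an assumption.
 */
static int authopt_check_enabled(void)
{
	int enabled = READ_ONCE(sysctl_tcp_authopt);

	return enabled ? 0 : -EPERM;
}

/* Single-threaded walk through the disabled and enabled cases. */
static void authopt_gate_demo(void)
{
	assert(authopt_check_enabled() == -EPERM);
	WRITE_ONCE(sysctl_tcp_authopt, 1);
	assert(authopt_check_enabled() == 0);
}
```

The volatile cast only forces the compiler to emit a single, untorn access
for the marked load or store; it does not add any ordering or locking.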