netdev - Re: [PATCH net] wireguard: Use tunnel helpers for decapsulating ECN markings

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <CAHmME9oMjw6-vG1eSrvPoC51qFSZRf75DUin8to5vGr5RJjDuw@mail.gmail.com>
Date:   Mon, 27 Apr 2020 13:53:04 -0600
From:   "Jason A. Donenfeld" <Jason@...c4.com>
To:     Toke Høiland-Jørgensen <toke@...hat.com>
Cc:     David Miller <davem@...emloft.net>,
        Netdev <netdev@...r.kernel.org>,
        WireGuard mailing list <wireguard@...ts.zx2c4.com>,
        Olivier Tilmans <olivier.tilmans@...ia-bell-labs.com>,
        Dave Taht <dave.taht@...il.com>,
        "Rodney W . Grimes" <ietf@...rsh.dnsmgr.net>
Subject: Re: [PATCH net] wireguard: Use tunnel helpers for decapsulating ECN markings

Hey Toke,

Thanks for fixing this. I wasn't aware there was a newer ECN RFC. A
few comments below:

On Mon, Apr 27, 2020 at 8:47 AM Toke Høiland-Jørgensen <toke@...hat.com> wrote:
> RFC6040 also recommends dropping packets on certain combinations of
> erroneous code points on the inner and outer packet headers which shouldn't
> appear in normal operation. The helper signals this by a return value > 1,
> so also add a handler for this case.

This worries me. In the old implementation, we propagate some outer
header data to the inner header, which is technically an authenticity
violation, but minor enough that we let it slide. This patch here
seems to make that violation a bit worse: namely, we're now changing
the behavior based on a combination of outer header + inner header. An
attacker can manipulate the outer header (set it to CE) in order to
learn whether the inner header was CE or not, based on whether or not
the packet gets dropped, which is often observable. That's some form
of an oracle, which I'm not too keen on having in wireguard. On the
other hand, we pretty much already _explicitly leak this bit_ on tx
side -- in send.c:

PACKET_CB(skb)->ds = ip_tunnel_ecn_encap(0, ip_hdr(skb), skb); // inner packet
...
wg_socket_send_skb_to_peer(peer, skb, PACKET_CB(skb)->ds); // outer packet

We considered that leak a-okay. But a decryption oracle seems slightly
worse than an explicit and intentional leak. But maybe not that much
worse.

I wanted to check with you: is the analysis above correct? And can you
somehow imagine the ==2 case leading to different behavior, in which
the packet isn't dropped? Or would that ruin the "[de]congestion" part
of ECN? I just want to make sure I understand the full picture before
moving in one direction or another.

> +               if (INET_ECN_decapsulate(skb, PACKET_CB(skb)->ds,
> +                                        ip_hdr(skb)->tos) > 1)
> +               if (INET_ECN_decapsulate(skb, PACKET_CB(skb)->ds,
> +                                        ipv6_get_dsfield(ipv6_hdr(skb))) > 1)

The documentation for the function says:

*  returns 0 on success
*          1 if something is broken and should be logged (!!! above)
*          2 if packet should be dropped

Would it be more clear to explicitly check for ==2 then?

> +ecn_decap_error:
> +       net_dbg_ratelimited("%s: Non-ECT packet from peer %llu (%pISpfsc)\n",
> +                           dev->name, peer->internal_id, &peer->endpoint.addr);

All the other error messages in this block are in the form of: "Packet
.* from peer %llu (%pISpfsc)\n", so better text here might be "Packet
is non-ECT from peer %llu (%pISpfsc)\n". However, do you think we
really need to log to the console for this? Does it add much in the
way of wireguard internals debugging? Can't network congestion induce
it in complicated scenarios? Maybe it'd be best just to increment
those error counters instead.

> +       ++dev->stats.rx_errors;
> +       ++dev->stats.rx_length_errors;

This should use stats.rx_frame_errors instead of length_errors, which
is also what net/ipv6/sit.c and drivers/net/geneve.c do on ECN-related
drops.

Thanks,
Jason