Message-ID: <16469_1650744138_62645B4A_16469_436_1_4756cc37-340b-f2f6-e004-0d77573f33df@orange.com>
Date:   Sat, 23 Apr 2022 22:02:18 +0200
From:   <alexandre.ferrieux@...nge.com>
To:     <netdev@...r.kernel.org>
Subject: Zero-Day bug in VLAN offloading + cooked AF_PACKET

Hi,

I know the subject sounds like this belongs in libpcap bug reports; indeed it 
started there [1]. However, after some digging, it really looks like there's an 
issue in what the kernel itself provides.

TL;DR: outgoing VLAN-tagged traffic on interfaces without VLAN offloading is
captured corrupted in cooked mode, and has been since at least kernel 3.4...

One popular way of capturing with libpcap-based tools like tcpdump is the
so-called "cooked mode". This is what you get with "tcpdump -i any". The kernel
API used for this, documented in packet(7), is a socket of family AF_PACKET and
type SOCK_DGRAM. Unlike SOCK_RAW, SOCK_DGRAM provides a kind of "near L3"
abstraction, stripping most of the L2 headers from the original packets. For
example, when using recvmsg(),

  - the .msg_iov (main payload) of the recvmsg() is the packet starting at the 
L3 header
  - the .msg_name (aka "address") is a sockaddr_ll structure containing some L2 
information: ethertype, source MAC address.
  - the .msg_control (aka metadata, activated with PACKET_AUXDATA sockopt) may 
contain VLAN information: TCI, TPID.
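
For readers who want to reproduce the three pieces listed above, here is a
minimal Python sketch of such a capture socket. It is not the test program used
for the dumps in this message: the helper names (open_cooked_socket, read_one,
describe) are made up for illustration, and the numeric constants are copied
from <linux/if_packet.h> and <linux/if_ether.h>. Note that Python hands back the
sockaddr_ll already decoded as a tuple rather than as raw bytes.

```python
import socket

# Values from <linux/if_packet.h> and <linux/if_ether.h>, spelled out here
# because Python's socket module may not export them.
SOL_PACKET = 263       # shows up as 0x107 in hexadecimal
PACKET_AUXDATA = 8
ETH_P_ALL = 0x0003
AUXDATA_SIZE = 20      # sizeof(struct tpacket_auxdata)


def open_cooked_socket():
    """Open a cooked capture socket on all interfaces (needs CAP_NET_RAW)."""
    sock = socket.socket(socket.AF_PACKET, socket.SOCK_DGRAM,
                         socket.htons(ETH_P_ALL))
    # Ask the kernel to attach a tpacket_auxdata control message.
    sock.setsockopt(SOL_PACKET, PACKET_AUXDATA, 1)
    return sock


def read_one(sock, snaplen=65535):
    """Receive one packet and return (payload, ancdata, address)."""
    payload, ancdata, _flags, address = sock.recvmsg(
        snaplen, socket.CMSG_SPACE(AUXDATA_SIZE))
    return payload, ancdata, address


def describe(payload, ancdata, address):
    """Format the three pieces of one packet as text lines."""
    ifname, proto, pkttype, _hatype, _hwaddr = address
    lines = [f"sockaddr_ll: if={ifname} ethertype=0x{proto:04x} "
             f"pkttype={pkttype}"]
    for level, ctype, data in ancdata:
        lines.append(f"metadata: {level:x}:{ctype}:{data.hex()}")
    lines.append(f"L3 frame: {payload.hex()}")
    return lines
```

Run as root, something like "print(*describe(*read_one(open_cooked_socket())),
sep='\n')" would print one captured packet.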

All this works beautifully most of the time, with or without VLAN tags, as the
ethertype is correctly extracted and conveyed in the sockaddr_ll. This allows
any consumer of the L3 frame to decode it properly, knowing exactly which L3 it's
looking at.

However, there's a catch: for outgoing packets, *if* the interface has no 
hardware VLAN offloading, the ethertype gets overwritten by ... the TPID 
(0x8100). As a result, a consumer of the L3 frame has absolutely no way to 
recover its type.

As a demo, here is what the venerable "tcpdump -i any" says of an outgoing ARP 
packet on VLAN interface eth0.24, after VLAN offloading has been disabled via 
"ethtool -K". Two lines are generated, as the packet is seen on both eth0.24 
(first line) and eth0 (second line):

  15:06:37.681328 ARP, Request who-has 1.0.24.3 tell 1.0.24.1, length 28
  15:06:37.681336 ethertype IPv4, IP0

The first line is correct, as the frame is captured before handling by the 8021q 
module. The second is not !!

This is the result of the ethertype being overwritten. The actual value is
0x8100, which tcpdump decodes as an 802.1Q TPID, thus shifting the L3 beginning
by 4 bytes, ending up seeing a nonsensical "IP0" (IP version 0) frame.
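
To make the 4-byte shift concrete, here is a small Python sketch of the decoding
logic, applied to the L3 payload of this very ARP request (the bytes are the
ones quoted in the dumps later in this message). It only mimics what a
cooked-mode consumer would plausibly do with the ethertype; it is not tcpdump's
actual code, and the function name is made up.

```python
import struct

# L3 payload of the outgoing ARP request, copied from the dumps in this message.
L3_FRAME = bytes.fromhex(
    "00010800060400010025903285a70100180100000000000001001803")

ETH_P_8021Q = 0x8100
ETH_P_IP = 0x0800


def decode_like_a_consumer(ethertype, payload):
    """Follow the ethertype the way a cooked-mode consumer would."""
    steps = []
    if ethertype == ETH_P_8021Q:
        # A TPID announces a 4-byte 802.1Q tag: TCI, then the inner ethertype.
        tci, ethertype = struct.unpack_from("!HH", payload, 0)
        payload = payload[4:]
        steps.append(f"vlan tci=0x{tci:04x}")
    steps.append(f"ethertype=0x{ethertype:04x}")
    if ethertype == ETH_P_IP:
        # The IP version is the high nibble of the first byte.
        steps.append(f"ip version={payload[0] >> 4}")
    return steps


print(decode_like_a_consumer(0x0806, L3_FRAME))  # what eth0.24 reports
print(decode_like_a_consumer(0x8100, L3_FRAME))  # what eth0 reports
```

With 0x8100, the first four bytes of the ARP header (hardware type 0x0001 and
protocol type 0x0800) get mistaken for a VLAN tag and an IPv4 ethertype, and the
next byte (0x06, the ARP hardware address length) is read as IP version 0.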

To prove that this is *not* an issue in libpcap or tcpdump, here are the three
aforementioned pieces of the packet, obtained with a simple test program doing
recvmsg() on an AF_PACKET+SOCK_DGRAM capture socket:

  On VLAN interface eth0.24: (the "^^^^" show the ethertype's position)
  --------------------------

   - metadata:     107:8:010000001c0000001c0000000000000000000000
   - sockaddr_ll:  1100080606000000010004060025903285a70000
                       ^^^^
   - L3 frame:     00010800060400010025903285a70100180100000000000001001803

  On parent interface eth0:
  -------------------------

   - metadata:     107:8:010000001c0000001c0000000000000000000000
   - sockaddr_ll:  1100810004000000010004060025903285a70000
                       ^^^^
   - L3 frame:     00010800060400010025903285a70100180100000000000001001803

As is clear above, the second instance contains no trace of the original ARP 
ethertype 0x0806.
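
For those who prefer not to read struct sockaddr_ll in hex, here is a short
Python sketch decoding the two dumps above. The field layout follows packet(7);
the little-endian host byte order is an assumption on my part (consistent with
sll_family reading 1100 = 17 = AF_PACKET), and the names SockaddrLL and
parse_sockaddr_ll are made up for this example.

```python
import struct
from collections import namedtuple

SockaddrLL = namedtuple(
    "SockaddrLL", "family protocol ifindex hatype pkttype halen addr")


def parse_sockaddr_ll(raw):
    """Decode a raw struct sockaddr_ll (little-endian host assumed)."""
    # sll_family is in host byte order, sll_protocol in network byte order.
    (family,) = struct.unpack_from("<H", raw, 0)
    (protocol,) = struct.unpack_from(">H", raw, 2)
    # sll_ifindex, sll_hatype, sll_pkttype, sll_halen, then sll_addr[8].
    ifindex, hatype, pkttype, halen = struct.unpack_from("<iHBB", raw, 4)
    addr = raw[12:12 + halen]
    return SockaddrLL(family, protocol, ifindex, hatype, pkttype, halen, addr)


# The two sockaddr_ll dumps from this message.
ON_VLAN_IF = bytes.fromhex("1100080606000000010004060025903285a70000")
ON_PARENT = bytes.fromhex("1100810004000000010004060025903285a70000")

for name, raw in (("eth0.24", ON_VLAN_IF), ("eth0", ON_PARENT)):
    sll = parse_sockaddr_ll(raw)
    print(f"{name}: ifindex={sll.ifindex} ethertype=0x{sll.protocol:04x} "
          f"pkttype={sll.pkttype} mac={sll.addr.hex()}")
```

Both entries carry pkttype 4 (PACKET_OUTGOING) and the same source MAC; only the
interface index and the ethertype differ, the latter being 0x0806 on eth0.24 and
0x8100 on eth0.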

By contrast, if we re-enable VLAN offloading,

    - the first instance (on subinterface) is unchanged
    - the second instance (on parent interface) is back to normal, with a
correct ARP ethertype (^^^^=0806) *and* VLAN info in the metadata (TCI and TPID,
which appear byte-swapped as 1800 and 0081, i.e. TCI 0x0018 = VLAN 24 and TPID
0x8100):

  On parent interface eth0:
  -------------------------

   - metadata:     107:8:510000001c0000001c0000000000000018000081
                                                         TCI-TPID
   - sockaddr_ll:  1100080604000000010004060025903285a70000
                       ^^^^
   - L3 frame:     00010800060400010025903285a70100180100000000000001001803
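
The metadata blob is a struct tpacket_auxdata (see packet(7)). Here is a small
Python sketch decoding it for both cases, again assuming a little-endian host;
the flag values are taken from <linux/if_packet.h>, and parse_auxdata is a
made-up helper name.

```python
import struct

# Status flags from <linux/if_packet.h>.
TP_STATUS_VLAN_VALID = 1 << 4
TP_STATUS_VLAN_TPID_VALID = 1 << 6


def parse_auxdata(raw):
    """Decode a struct tpacket_auxdata (little-endian host assumed)."""
    (status, length, snaplen, _mac, _net,
     tci, tpid) = struct.unpack("<IIIHHHH", raw[:20])
    info = {"status": status, "len": length, "snaplen": snaplen,
            "vlan_id": None, "tpid": None}
    if status & TP_STATUS_VLAN_VALID:
        info["vlan_id"] = tci & 0x0FFF   # low 12 bits of the TCI
    if status & TP_STATUS_VLAN_TPID_VALID:
        info["tpid"] = tpid
    return info


# The metadata dumps for parent interface eth0 from this message.
OFFLOAD_OFF = bytes.fromhex("010000001c0000001c0000000000000000000000")
OFFLOAD_ON = bytes.fromhex("510000001c0000001c0000000000000018000081")

print(parse_auxdata(OFFLOAD_OFF))
print(parse_auxdata(OFFLOAD_ON))
```

With offloading enabled, the status word 0x51 has both VLAN flags set and the
tag decodes to VLAN 24 with TPID 0x8100; with offloading disabled, neither flag
is set and the VLAN fields are zero.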

And sure enough, tcpdump is happy again:

  21:44:18.481331 ARP, Request who-has 1.0.24.3 tell 1.0.24.1, length 28
  21:44:18.481338 ethertype ARP, ARP, Request who-has 1.0.24.3 tell 1.0.24.1, length 28

I have found this bug active on an old machine with kernel 3.4.
In the URL below you'll find more details on ftrace-based evidence, hinting at 
the 8021q module.
However, I am *not* familiar enough with the Linux network stack (and special 
cases like offloading) to suggest a fix, sorry.
I hope a knowledgeable person will consider this nasty enough to deserve their 
attention.

Thanks in advance !

-Alex


[1] https://github.com/the-tcpdump-group/libpcap/issues/1105
