lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [day] [month] [year] [list]
Message-ID: <8e6e5260-9359-eddd-c928-dba487f1319b@nokia.com>
Date:   Tue, 12 May 2020 16:23:11 +0200
From:   Alexander Sverdlin <alexander.sverdlin@...ia.com>
To:     netdev@...r.kernel.org
Cc:     "David S. Miller" <davem@...emloft.net>,
        Eric Dumazet <edumazet@...gle.com>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        Vlad Yasevich <vyasevich@...il.com>,
        Jiri Pirko <jpirko@...hat.com>, Arnd Bergmann <arnd@...db.de>,
        Matija Glavinic Pecotic <matija.glavinic-pecotic.ext@...ia.com>
Subject: Multicast from underlying MACVLAN interface towards MACVLAN

Dear Network Core developers!

I've been debugging an issue with Multicast replies from underlying
interface of MACVLAN towards MACVLAN. These SKBs never contain a MAC header
and therefore cannot be properly processed by MACVLAN.

The usecase is following:
eth1 <-- eth1.212 <-- macvlan@...1.212 (in bridge mode)

As I understand the problem, it actually plays no role, that there is an intermediate VLAN interface.
The problem is, if macvlan@...1.212 sends Router Solicitation these SKBs are received on eth1.212,
but the corresponding multicast Router Advertisements are not received on macvlan@...1.212.

I've tracked the problem down to the following incompatibility between MACVLAN code and IP code...

One the one hand, MACVLAN always expects ethernet header:

static rx_handler_result_t macvlan_handle_frame(struct sk_buff **pskb)                                                                                                          
{                                                                                                                                                                               
        struct macvlan_port *port;                                                                                                                                              
        struct sk_buff *skb = *pskb;                                                                                                                                            
        const struct ethhdr *eth = eth_hdr(skb);                                                                                                                                
        ...
                                                                                                                                                                                
        port = macvlan_port_get_rcu(skb->dev);                                                                                                                                  
        if (is_multicast_ether_addr(eth->h_dest)) {                                                                                                                             

One the other hand, IP doesn't populate ethernet header for multicast loopback transmission:

int dev_loopback_xmit(struct net *net, struct sock *sk, struct sk_buff *skb)                                                                                                    
{                                                                                                                                                                               
        skb_reset_mac_header(skb);                                                                                                                                              
        __skb_pull(skb, skb_network_offset(skb));                                                                                                                               
        skb->pkt_type = PACKET_LOOPBACK;                                                                                                                                        
        skb->ip_summed = CHECKSUM_UNNECESSARY;                                                                                                                                  
        WARN_ON(!skb_dst(skb));                                                                                                                                                 
        skb_dst_force(skb);                                                                                                                                                     
        netif_rx_ni(skb);                                                                                                                                                       

Unicast however works fine, because of:

int neigh_connected_output(struct neighbour *neigh, struct sk_buff *skb)                                                                                                        
{                                                                                                                                                                               
        struct net_device *dev = neigh->dev;                                                                                                                                    
        unsigned int seq;                                                                                                                                                       
        int err;                                                                                                                                                                
                                                                                                                                                                                
        do {                                                                                                                                                                    
                __skb_pull(skb, skb_network_offset(skb));                                                                                                                       
                seq = read_seqbegin(&neigh->ha_lock);                                                                                                                           
                err = dev_hard_header(skb, dev, ntohs(skb->protocol),                                                                                                           
                                      neigh->ha, NULL, skb->len);                                                                                                               
        } while (read_seqretry(&neigh->ha_lock, seq));                                                                                                                          
                                                                                                                                                                                
        if (err >= 0)                                                                                                                                                           
                err = dev_queue_xmit(skb);                                                                                                                                      

I've also collected some stack traces and SKB dumps to illustrate the problem
(I've instrumented macvlan_handle_frame() and eth_header() to understand when
the ethernet header has been generated):

macvlan_handle_frame() receives Router Advertisement, but cannot forward
without Ethernet header:

skb len=96 headroom=40 headlen=96 tailroom=56
mac=(40,0) net=(40,40) trans=80
shinfo(txflags=0 nr_frags=0 gso(size=0 type=0 segs=0))
csum(0xae2e9a2f ip_summed=1 complete_sw=0 valid=0 level=0)
hash(0xc97ebd88 sw=1 l4=1) proto=0x86dd pkttype=5 iif=24
dev name=etha01.212 feat=0x0x0000000040005000
skb headroom: 00000000: 00 28 b3 4d 84 88 ff ff b2 72 b9 5e 00 00 00 00
skb headroom: 00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
skb headroom: 00000020: 08 0f 00 00 00 00 00 00
skb linear:   00000000: 60 09 88 bd 00 38 3a ff fe 80 00 00 00 00 00 00
skb linear:   00000010: 00 40 43 ff fe 80 00 00 ff 02 00 00 00 00 00 00
skb linear:   00000020: 00 00 00 00 00 00 00 01 86 00 61 00 40 00 00 2d
skb linear:   00000030: 00 00 00 00 00 00 00 00 03 04 40 e0 00 00 01 2c
skb linear:   00000040: 00 00 00 78 00 00 00 00 fd 5f 42 68 23 87 a8 81
skb linear:   00000050: 00 00 00 00 00 00 00 00 01 01 02 40 43 80 00 00
skb tailroom: 00000000: 00 f0 01 00 00 00 00 00 a4 73 00 00 00 00 00 00
skb tailroom: 00000010: a4 73 00 00 00 00 00 00 00 10 00 00 00 00 00 00
skb tailroom: 00000020: 01 00 00 00 06 00 00 00 40 66 02 00 00 00 00 00
skb tailroom: 00000030: 40 76 02 00 00 00 00 00

Call Trace:
 <IRQ>
 dump_stack+0x69/0x9b
 macvlan_handle_frame+0x321/0x425 [macvlan]
 ? macvlan_forward_source+0x110/0x110 [macvlan]
 __netif_receive_skb_core+0x545/0xda0
 ? ip6_mc_input+0x103/0x250 [ipv6]
 ? ipv6_rcv+0xe1/0xf0 [ipv6]
 ? __netif_receive_skb_one_core+0x36/0x70
 __netif_receive_skb_one_core+0x36/0x70
 process_backlog+0x97/0x140
 net_rx_action+0x1eb/0x350
 __do_softirq+0xe3/0x383
 do_softirq_own_stack+0x2a/0x40
 </IRQ>
 do_softirq.part.4+0x4e/0x50
 netif_rx_ni+0x60/0xd0
 dev_loopback_xmit+0x83/0xf0
 ip6_finish_output2+0x575/0x590 [ipv6]
 ? ip6_cork_release.isra.1+0x64/0x90 [ipv6]
 ? __ip6_make_skb+0x38d/0x680 [ipv6]
 ? ip6_output+0x6c/0x140 [ipv6]
 ip6_output+0x6c/0x140 [ipv6]
 ip6_send_skb+0x1e/0x60 [ipv6]
 rawv6_sendmsg+0xc4b/0xe10 [ipv6]
 ? proc_put_long+0xd0/0xd0
 ? rw_copy_check_uvector+0x4e/0x110
 ? sock_sendmsg+0x36/0x40
 sock_sendmsg+0x36/0x40
 ___sys_sendmsg+0x2b6/0x2d0
 ? proc_dointvec+0x23/0x30
 ? addrconf_sysctl_forward+0x8d/0x250 [ipv6]
 ? dev_forward_change+0x130/0x130 [ipv6]
 ? _raw_spin_unlock+0x12/0x30
 ? proc_sys_call_handler.isra.14+0x9f/0x110
 ? __call_rcu+0x213/0x510
 ? get_max_files+0x10/0x10
 ? trace_hardirqs_on+0x2c/0xe0
 ? __sys_sendmsg+0x63/0xa0
 __sys_sendmsg+0x63/0xa0
 do_syscall_64+0x6c/0x1e0
 entry_SYSCALL_64_after_hwframe+0x49/0xbe

Later when the same RA is being transmitted neigh_connected_output(), this is the first
time Ethernet header is being generated for this packet, but this is towards "world", not
the internal MACVLAN bridge:

skb len=110 headroom=26 headlen=110 tailroom=56
mac=(-1,-1) net=(40,40) trans=80
shinfo(txflags=0 nr_frags=0 gso(size=0 type=0 segs=0))
csum(0xae2e9a2f ip_summed=0 complete_sw=0 valid=0 level=0)
hash(0xc97ebd88 sw=1 l4=1) proto=0x86dd pkttype=0 iif=0
dev name=etha01.212 feat=0x0x0000000040005000
sk family=10 type=3 proto=58
skb headroom: 00000000: 00 28 b3 4d 84 88 ff ff b2 72 b9 5e 00 00 00 00
skb headroom: 00000010: 00 00 00 00 00 00 00 00 00 00
skb linear:   00000000: 33 33 00 00 00 01 02 40 43 80 00 00 86 dd 60 09
skb linear:   00000010: 88 bd 00 38 3a ff fe 80 00 00 00 00 00 00 00 40
skb linear:   00000020: 43 ff fe 80 00 00 ff 02 00 00 00 00 00 00 00 00
skb linear:   00000030: 00 00 00 00 00 01 86 00 61 00 40 00 00 2d 00 00
skb linear:   00000040: 00 00 00 00 00 00 03 04 40 e0 00 00 01 2c 00 00
skb linear:   00000050: 00 78 00 00 00 00 fd 5f 42 68 23 87 a8 81 00 00
skb linear:   00000060: 00 00 00 00 00 00 01 01 02 40 43 80 00 00
skb tailroom: 00000000: 00 f0 01 00 00 00 00 00 a4 73 00 00 00 00 00 00
skb tailroom: 00000010: a4 73 00 00 00 00 00 00 00 10 00 00 00 00 00 00
skb tailroom: 00000020: 01 00 00 00 06 00 00 00 40 66 02 00 00 00 00 00
skb tailroom: 00000030: 40 76 02 00 00 00 00 00

Call Trace:
 dump_stack+0x69/0x9b
 debug_hdr+0x4c/0x60
 eth_header+0x71/0xe0
 vlan_dev_hard_header+0x58/0x140 [8021q]
 neigh_connected_output+0xa9/0x100
 ip6_finish_output2+0x24a/0x590 [ipv6]
 ? ip6_cork_release.isra.1+0x64/0x90 [ipv6]
 ? __ip6_make_skb+0x38d/0x680 [ipv6]
 ? ip6_output+0x6c/0x140 [ipv6]
 ip6_output+0x6c/0x140 [ipv6]
 ip6_send_skb+0x1e/0x60 [ipv6]
 rawv6_sendmsg+0xc4b/0xe10 [ipv6]
 ? proc_put_long+0xd0/0xd0
 ? rw_copy_check_uvector+0x4e/0x110
 ? sock_sendmsg+0x36/0x40
 sock_sendmsg+0x36/0x40
 ___sys_sendmsg+0x2b6/0x2d0
 ? proc_dointvec+0x23/0x30
 ? addrconf_sysctl_forward+0x8d/0x250 [ipv6]
 ? dev_forward_change+0x130/0x130 [ipv6]
 ? _raw_spin_unlock+0x12/0x30
 ? proc_sys_call_handler.isra.14+0x9f/0x110
 ? __call_rcu+0x213/0x510
 ? get_max_files+0x10/0x10
 ? trace_hardirqs_on+0x2c/0xe0
 ? __sys_sendmsg+0x63/0xa0
 __sys_sendmsg+0x63/0xa0
 do_syscall_64+0x6c/0x1e0
 entry_SYSCALL_64_after_hwframe+0x49/0xbe

I would appreciate any hint, how to approach this problem! I can try to come up with a patch,
but as this is so central thing in the IP protocol, I'd like to hear some opinions first...

-- 
Best regards,
Alexander Sverdlin.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ