Message-ID: <5481173A.9060308@smart-weblications.de>
Date:	Fri, 05 Dec 2014 03:23:54 +0100
From:	Smart Weblications GmbH - Florian Wiessner 
	<f.wiessner@...rt-weblications.de>
To:	Julian Anastasov <ja@....bg>,
	Steffen Klassert <steffen.klassert@...unet.com>
CC:	netdev@...r.kernel.org, LKML <linux-kernel@...r.kernel.org>,
	stable@...r.kernel.org
Subject: Re: 3.12.33 - BUG xfrm_selector_match+0x25/0x2f6

Hi,

On 05.12.2014 00:15, Julian Anastasov wrote:
> 
> 	Hello,
> 
> On Thu, 4 Dec 2014, Steffen Klassert wrote:
> 
>>> [16623.096721] Call Trace:
>>> [16623.096744]  <IRQ>
>>> [16623.096749]  [<ffffffff81547a7c>] ? xfrm_sk_policy_lookup+0x44/0x9b
>>> [16623.096802]  [<ffffffff81547ef7>] ? xfrm_lookup+0x91/0x446
>>> [16623.096832]  [<ffffffff81541316>] ? ip_route_me_harder+0x150/0x1b0
>>> [16623.096865]  [<ffffffffa01b6457>] ? ip_vs_route_me_harder+0x86/0x91 [ip_vs]
>>> [16623.096899]  [<ffffffffa01b797a>] ? ip_vs_out+0x2d3/0x5bc [ip_vs]
>>> [16623.096930]  [<ffffffff81501420>] ? ip_rcv_finish+0x2b8/0x2b8
>>
>> I really wonder why the xfrm_sk_policy_lookup codepath is taken here.
>> It looks like this is the processing of an inbound ipv4 packet that
>> is going to be rerouted to the output path by ipvs, so this packet
>> should not have socket context at all.
> 
> 	In above trace looks like IPVS-NAT is used between
> local client and some real server. IPVS handles this skb
> at LOCAL_IN and calls ip_vs_route_me_harder(). If we have
> skb->sk at LOCAL_IN, my first thought is about early demux.
> 
> 	If I remember correctly, looking at commit f5a41847acc535e2
> ("ipvs: move ip_route_me_harder for ICMP") that introduced
> this rerouting (2.6.37), it was needed because at that time TCP
> used rt_src from received skb to select daddr in ip_send_reply().
> As packets to server are DNAT-ed and packets to client are
> SNAT-ed we used rerouting to fill rt_src with correct IP
> after SNAT.
> 
> 	Now when routing cache is removed in 3.6 and
> tcp_v4_send_reset() is changed to provide ip_hdr(skb)->saddr
> instead of rt_src it should be safe to remove this rerouting,
> it is enough that ip_hdr(skb)->saddr was updated on IPVS-SNAT at
> LOCAL_IN. In fact, rt_src was removed early in 3.0 with
> commit 0a5ebb8000c5362 ("ipv4: Pass explicit daddr arg to 
> ip_send_reply().").
> 
> 	This is only to explain above stack. Not sure
> if problem is related somehow to early demux but such
> commits look interesting:
> 
> - commit 6b8dbcf2c44fd7a ("bridge: netfilter: orphan skb before invoking 
> ip netfilter hooks")
> 
> 	Also, it would be good to know which 3.x kernel between
> 3.13 and 3.17 fixes the problem, it will narrow the search.
> 


I tried 3.12.33 without any XFRM and now got this one (which is reproducible):

[  233.956012] BUG: unable to handle kernel NULL pointer dereference at 0000000000000014
[  233.956218] IP: [<ffffffffa013a470>] nf_ct_seqadj_set+0x60/0x90 [nf_conntrack]
[  233.956371] PGD 0
[  233.956493] Oops: 0000 [#1] SMP
[  233.956680] Modules linked in: netconsole xt_nat xt_multiport veth iptable_mangle xt_mark nf_conntrack_netlink nfnetlink ip_vs_rr ipt_MASQUERADE iptable_nat nf_nat_ipv4 nf_conntrack_ipv4 nf_defrag_ipv4 ipt_REJECT xt_tcpudp iptable_filter ip_tables cpufreq_ondemand cpufreq_powersave cpufreq_conservative cpufreq_userspace ocfs2_stack_o2cb ocfs2_dlm bridge stp llc bonding fuse nf_conntrack_ftp 8021q openvswitch gre vxlan xt_conntrack x_tables ocfs2_dlmfs dlm sctp ocfs2 ocfs2_nodemanager ocfs2_stackglue configfs rbd kvm_intel kvm coretemp ip_vs_ftp ip_vs nf_nat nf_conntrack psmouse i2c_i801 serio_raw lpc_ich mfd_core evdev btrfs lzo_decompress lzo_compress
[  233.960221] CPU: 2 PID: 29996 Comm: vsftpd Not tainted 3.12.33 #4
[  233.960298] Hardware name: Supermicro X9SCI/X9SCA/X9SCI/X9SCA, BIOS 1.1a 09/28/2011
[  233.960395] task: ffff88075e87a2c0 ti: ffff8806a7444000 task.ti: ffff8806a7444000
[  233.960486] RIP: 0010:[<ffffffffa013a470>]  [<ffffffffa013a470>] nf_ct_seqadj_set+0x60/0x90 [nf_conntrack]
[  233.960632] RSP: 0018:ffff88083fc83998  EFLAGS: 00010206
[  233.960709] RAX: 000000000000000c RBX: ffff8806cab452cc RCX: 0000000000000003
[  233.960791] RDX: 0000000000000029 RSI: 0000000000000003 RDI: ffff8806cab452cc
[  233.960875] RBP: 00000000ee38035a R08: ffff8807e2b1edc0 R09: ffff88083fc839a8
[  233.960957] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000003
[  233.961041] R13: 0000000000000000 R14: 0000000000000003 R15: ffff8806a75a50bc
[  233.961124] FS:  00007ff22daec700(0000) GS:ffff88083fc80000(0000) knlGS:0000000000000000
[  233.961226] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  233.961303] CR2: 0000000000000014 CR3: 00000006b3259000 CR4: 00000000000407e0
[  233.961384] Stack:
[  233.961460]  ffff880815612b60 0000000000000012 0000000000000014 ffff8806cab452c8
[  233.961776]  ffff8806a75a5001 ffffffffa014f681 0000000000000000 ffffffff00000045
[  233.962095]  ffff880800000048 0000001b00000003 ffff88083fc83a70 ffff880815612b60
[  233.962411] Call Trace:
[  233.962482]  <IRQ>
[  233.962538]  [<ffffffffa014f681>] ? __nf_nat_mangle_tcp_packet+0x109/0x120 [nf_nat]
[  233.962762]  [<ffffffffa017749e>] ? ip_vs_ftp_out.part.8+0x2b2/0x338 [ip_vs_ftp]
[  233.962866]  [<ffffffff814cb8c0>] ? __domain_mapping+0x25d/0x2a3
[  233.962949]  [<ffffffff8154140c>] ? fib_table_lookup+0xe4/0x255
[  233.963032]  [<ffffffffa015f858>] ? ip_vs_app_pkt_out+0x105/0x18b [ip_vs]
[  233.963110]  [<ffffffffa0162ffc>] ? tcp_snat_handler+0x6b/0x320 [ip_vs]
[  233.963189]  [<ffffffffa0155d3d>] ? ip_vs_conn_out_get_proto+0x1c/0x25 [ip_vs]
[  233.963284]  [<ffffffffa0158937>] ? ip_vs_out+0x290/0x5bc [ip_vs]
[  233.963362]  [<ffffffff8150f544>] ? ip_frag_mem+0x2a/0x2a
[  233.963442]  [<ffffffff81508e1f>] ? nf_iterate+0x42/0x80
[  233.963519]  [<ffffffff81508ec6>] ? nf_hook_slow+0x69/0xff
[  233.963595]  [<ffffffff8150f544>] ? ip_frag_mem+0x2a/0x2a
[  233.963667]  [<ffffffff8150f8ae>] ? ip_forward+0x22d/0x2cf
[  233.963744]  [<ffffffff814e57ce>] ? __netif_receive_skb_core+0x5f0/0x66c
[  233.963826]  [<ffffffff814e59df>] ? process_backlog+0x13e/0x13e
[  233.963911]  [<ffffffffa0455e09>] ? br_handle_frame_finish+0x382/0x382 [bridge]
[  233.964008]  [<ffffffff814e5a2b>] ? netif_receive_skb+0x4c/0x7d
[  233.964090]  [<ffffffffa0455d95>] ? br_handle_frame_finish+0x30e/0x382 [bridge]
[  233.964186]  [<ffffffffa0455fda>] ? br_handle_frame+0x1d1/0x217 [bridge]
[  233.964267]  [<ffffffff814e567d>] ? __netif_receive_skb_core+0x49f/0x66c
[  233.964350]  [<ffffffff814e592b>] ? process_backlog+0x8a/0x13e
[  233.964429]  [<ffffffff814e5c31>] ? net_rx_action+0xa2/0x1c0
[  233.964508]  [<ffffffff81047e2e>] ? __do_softirq+0xf6/0x24f
[  233.964588]  [<ffffffff8106cbfd>] ? account_system_time+0x10f/0x169
[  233.964669]  [<ffffffff815ad7dc>] ? call_softirq+0x1c/0x30
[  233.964743]  <EOI>
[  233.964801]  [<ffffffff8100464d>] ? do_softirq+0x2c/0x5f
[  233.965013]  [<ffffffff81047ca1>] ? local_bh_enable+0x67/0x85
[  233.965088]  [<ffffffff81511689>] ? ip_finish_output+0x2c9/0x322
[  233.965165]  [<ffffffff8151240a>] ? ip_queue_xmit+0x2b7/0x2f0
[  233.965239]  [<ffffffff81524772>] ? tcp_transmit_skb+0x6ef/0x755
[  233.965316]  [<ffffffff815250e8>] ? tcp_write_xmit+0x886/0x9cb
[  233.965391]  [<ffffffff8152527a>] ? __tcp_push_pending_frames+0x24/0x7e
[  233.965473]  [<ffffffff8151a33c>] ? tcp_sendmsg+0xa4c/0xbfc
[  233.965550]  [<ffffffff814d3477>] ? sock_aio_write+0xe3/0xfd
[  233.965631]  [<ffffffff81122f4d>] ? do_sync_write+0x59/0x79
[  233.965709]  [<ffffffff811239e3>] ? vfs_write+0xc4/0x182
[  233.965786]  [<ffffffff81123daf>] ? SyS_write+0x45/0x7c
[  233.965864]  [<ffffffff815ac35b>] ? tracesys+0xdd/0xe2
[  233.965940] Code: 68 14 4d 01 c5 45 85 e4 74 46 f0 80 4f 78 40 48 8d 5f 04 48 89 df e8 00 12 47 e1 31 c0 41 83 fe 02 0f 97 c0 48 6b c0 0c 4c 01 e8 <8b> 70 08 39 70 04 74 08 89 ea 0f ca 39 10 79 0d 89 70 04 44 01
[  233.969602] RIP  [<ffffffffa013a470>] nf_ct_seqadj_set+0x60/0x90 [nf_conntrack]
[  233.969746]  RSP <ffff88083fc83998>
[  233.969816] CR2: 0000000000000014
[  233.969919] ---[ end trace c6faf7aa989b11c2 ]---
[  233.969999] Kernel panic - not syncing: Fatal exception in interrupt
[  233.970081] Rebooting in 10 seconds..
[  244.029931] ACPI MEMORY or I/O RESET_REG.


node01:/ocfs2/usr/src/linux-3.12.33/scripts# ./decodecode < /tmp/oops-ipvsftp.txt
[ 233.965940] Code: 68 14 4d 01 c5 45 85 e4 74 46 f0 80 4f 78 40 48 8d 5f 04 48 89 df e8 00 12 47 e1 31 c0 41 83 fe 02 0f 97 c0 48 6b c0 0c 4c 01 e8 <8b> 70 08 39 70 04 74 08 89 ea 0f ca 39 10 79 0d 89 70 04 44 01
All code
========
   0:   68 14 4d 01 c5          pushq  $0xffffffffc5014d14
   5:   45 85 e4                test   %r12d,%r12d
   8:   74 46                   je     0x50
   a:   f0 80 4f 78 40          lock orb $0x40,0x78(%rdi)
   f:   48 8d 5f 04             lea    0x4(%rdi),%rbx
  13:   48 89 df                mov    %rbx,%rdi
  16:   e8 00 12 47 e1          callq  0xffffffffe147121b
  1b:   31 c0                   xor    %eax,%eax
  1d:   41 83 fe 02             cmp    $0x2,%r14d
  21:   0f 97 c0                seta   %al
  24:   48 6b c0 0c             imul   $0xc,%rax,%rax
  28:   4c 01 e8                add    %r13,%rax
  2b:*  8b 70 08                mov    0x8(%rax),%esi           <-- trapping instruction
  2e:   39 70 04                cmp    %esi,0x4(%rax)
  31:   74 08                   je     0x3b
  33:   89 ea                   mov    %ebp,%edx
  35:   0f ca                   bswap  %edx
  37:   39 10                   cmp    %edx,(%rax)
  39:   79 0d                   jns    0x48
  3b:   89 70 04                mov    %esi,0x4(%rax)
  3e:   44                      rex.R
  3f:   01                      .byte 0x1

Code starting with the faulting instruction
===========================================
   0:   8b 70 08                mov    0x8(%rax),%esi
   3:   39 70 04                cmp    %esi,0x4(%rax)
   6:   74 08                   je     0x10
   8:   89 ea                   mov    %ebp,%edx
   a:   0f ca                   bswap  %edx
   c:   39 10                   cmp    %edx,(%rax)
   e:   79 0d                   jns    0x1d
  10:   89 70 04                mov    %esi,0x4(%rax)
  13:   44                      rex.R
  14:   01                      .byte 0x1
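
The decoded instructions and the register dump above fit together arithmetically. The following is a minimal Python sketch of that check, using only values copied from the oops. The reading that R13 holds a NULL base pointer to an array of 12-byte entries indexed by direction is an inference from the disassembly, not something the kernel output states; the function and constant names are made up for illustration.

```python
# Register values copied from the oops above.
R13 = 0x0    # added into %rax by "add %r13,%rax"
R14 = 0x3    # compared against 2 by "cmp $0x2,%r14d"
RAX = 0xc    # value of %rax at the time of the fault
CR2 = 0x14   # faulting address reported by the CPU

ENTRY_SIZE = 0xc     # from "imul $0xc,%rax,%rax"
FIELD_OFFSET = 0x8   # from "mov 0x8(%rax),%esi" (the trapping instruction)


def faulting_address(r13, r14):
    """Replay the instructions from offset 0x1b up to the trapping load."""
    r14d = r14 & 0xFFFFFFFF          # cmp uses the low 32 bits of %r14
    index = 1 if r14d > 2 else 0     # xor %eax,%eax; cmp $0x2,%r14d; seta %al
    rax = index * ENTRY_SIZE + r13   # imul $0xc,%rax,%rax; add %r13,%rax
    return rax, rax + FIELD_OFFSET   # address read by mov 0x8(%rax),%esi


rax, addr = faulting_address(R13, R14)
print(hex(rax), hex(addr))
```

With R13 = 0 and R14 = 3 this reproduces both RAX = 0xc and CR2 = 0x14, which suggests the load went through a NULL base pointer rather than through a corrupted one.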


The setup is like this:


#virtual=<myVIP>:21
#       real=10.10.1.20:21 masq
#       real=10.10.1.21:21 masq
#       real=10.10.1.22:21 masq
#       real=10.10.1.23:21 masq
#       persistent=600
#       service=ftp
#       scheduler=rr
#       protocol=tcp
#       checktype=connect

(I commented it out to prevent further crashes...)
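
For anyone who wants to reproduce this without ldirectord, the virtual service above corresponds roughly to a handful of ipvsadm calls. The sketch below only builds the command strings and does not run them. It assumes the standard ipvsadm options (-A/-a, -t, -s, -p, -r, -m); the function name and the placeholder VIP are made up for illustration, and the ldirectord health checks (checktype/service) have no equivalent here.

```python
def ipvsadm_commands(vip, reals, port=21, scheduler="rr", persistent=600):
    """Build ipvsadm command lines for one TCP virtual service using NAT (masq)."""
    service = f"{vip}:{port}"
    # -A: add virtual service, -t: TCP, -s: scheduler, -p: persistence timeout
    cmds = [f"ipvsadm -A -t {service} -s {scheduler} -p {persistent}"]
    for real in reals:
        # -a: add real server, -r: real server address, -m: masquerading (NAT)
        cmds.append(f"ipvsadm -a -t {service} -r {real}:{port} -m")
    return cmds


# Real servers 10.10.1.20 - 10.10.1.23 from the configuration above;
# "VIP" is a placeholder for the elided virtual address.
reals = [f"10.10.1.{n}" for n in range(20, 24)]
for line in ipvsadm_commands("VIP", reals):
    print(line)
```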

When ip_vs_ftp is loaded and someone tries to make an FTP connection, the system
panics instantly.

10.10.1.20 - 10.10.1.23 are LXC containers, each using a veth device connected to
the bridge, running on 4 different nodes. The node running ldirectord/ipvsadm also
runs one of those containers (I don't know whether that matters).

brctl show
bridge name     bridge id               STP enabled     interfaces
br0             8000.00259052bbf4       no              bond0
                                                        vethMKELUc
                                                        vethXdWGqf
                                                        vethgJMmEb
                                                        vethmKNqFc


I disabled the FTP server LXC container on the node doing ip_vs, so that the
endpoint of the connection is not on the same node, and tried again, with the
same result.

Unfortunately I cannot test with kernels newer than 3.12, because ocfs2 is
somehow broken in >= 3.14.


-- 

Kind regards,

Florian Wiessner

Smart Weblications GmbH
Martinsberger Str. 1
D-95119 Naila

fon.: +49 9282 9638 200
fax.: +49 9282 9638 205
24/7: +49 900 144 000 00 - 0,99 EUR/Min*
http://www.smart-weblications.de

--
Registered office: Naila
Managing director: Florian Wiessner
Commercial register no.: HRB 3840, Amtsgericht Hof
*from German landlines; prices from mobile networks may differ