Date:	Sun, 11 Nov 2007 01:39:03 +0000 (GMT)
From:	Chazarain Guillaume <guichaz@...oo.fr>
To:	Ilpo Järvinen <ilpo.jarvinen@...sinki.fi>,
	David Miller <davem@...emloft.net>
Cc:	Netdev <netdev@...r.kernel.org>
Subject: Re : Oops preceded by WARNING: at net/ipv4/tcp_input.c:1571 tcp_remove_reno_sacks()

Hello Ilpo, thanks a lot for your investigation.

> Do you have GSO enabled?

According to ethtool -k, no.

> Is this reproducible?

Unfortunately not, I saw it only once.

> You can try to provoke it by setting tcp_sack sysctl to 0 as this seems
> to be non-SACK related... If so, you could try the debug patch below

> # CONFIG_DEBUG_LIST is not set

I'm currently running BitTorrent with all of this. I just saw the following
warning (for the first time ever), but otherwise it works fine:

WARNING: at net/ipv4/tcp_output.c:1807 tcp_simple_retransmit()
 [<c0104cb3>] show_trace_log_lvl+0x1a/0x2f
 [<c0105563>] show_trace+0x12/0x14
 [<c0105668>] dump_stack+0x15/0x17
 [<c02f6a79>] tcp_simple_retransmit+0xfa/0x185
 [<c02fa072>] tcp_v4_err+0x35d/0x4cb
 [<c0301f7d>] icmp_unreach+0x327/0x352
 [<c030159d>] icmp_rcv+0xe0/0xf7
 [<c02e2d75>] ip_local_deliver_finish+0x124/0x1ba
 [<c02e3178>] ip_local_deliver+0x72/0x7e
 [<c02e2c31>] ip_rcv_finish+0x299/0x2b9
 [<c02e30e8>] ip_rcv+0x1e1/0x1ff
 [<c02c755c>] netif_receive_skb+0x37d/0x401
 [<c02c9372>] process_backlog+0x5b/0x96
 [<c02c9037>] net_rx_action+0x87/0x152
 [<c0121c9f>] __do_softirq+0x38/0x7a
 [<c0105975>] do_softirq+0x41/0x92

> Have you run memtest recently?

I just ran it for 6 minutes 30 seconds with no errors. The box is otherwise stable though.

I forgot to say that I have a kdump image of the crash (I had to recompile this
2.6.24-rc2 kernel as I had deleted its vmlinux), so I could check whether your
assertions hold at the time of the crash.

> +    if (WARN_ON(tcp_write_queue_head(sk) == NULL))
> +        return;

(gdb) p sk->sk_write_queue.next
$11 = (struct sk_buff *) 0xe43a04b0
(gdb) p &sk->sk_write_queue
$12 = (struct sk_buff_head *) 0xe43a04b0
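
Since .next points back at the head itself, the write queue looks empty at the time of the crash, so this first WARN_ON would have fired. A minimal user-space sketch of that emptiness test, assuming a circular list whose head doubles as a sentinel (struct node, queue_init, queue_add_tail and queue_peek are hypothetical stand-ins, not the kernel's definitions):

```c
#include <stddef.h>

/* Hypothetical stand-in for a list element; the head is a plain node
 * used as a sentinel, so no casts are needed in this model. */
struct node {
    struct node *next;
    struct node *prev;
};

/* An empty circular queue has a head that points at itself. */
static void queue_init(struct node *head)
{
    head->next = head;
    head->prev = head;
}

/* Link n in just before the head, i.e. at the tail of the queue. */
static void queue_add_tail(struct node *head, struct node *n)
{
    n->next = head;
    n->prev = head->prev;
    head->prev->next = n;
    head->prev = n;
}

/* Return the first element, or NULL when head->next is the head itself. */
static struct node *queue_peek(struct node *head)
{
    return head->next == head ? NULL : head->next;
}
```

With this model, the gdb output above (next equal to the address of the queue) corresponds to queue_peek() returning NULL.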


> +    if (WARN_ON(!tp->packets_out))
> +        return;

(gdb) p ((struct tcp_sock *) sk)->packets_out
$13 = 0

> +    if (tp->lost_out > tp->packets_out)
> +        printk(KERN_ERR "Lost underflowed to %u\n", tp->lost_out);

(gdb) p ((struct tcp_sock *) sk)->lost_out
$14 = 4294967295
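
4294967295 is 2^32 - 1, which is what an unsigned 32-bit counter holds after being decremented once from 0. A small sketch of that wraparound and of the sanity condition quoted above, assuming lost_out and packets_out are 32-bit unsigned counters (the helper names are hypothetical):

```c
#include <stdint.h>

/* Hypothetical helper: decrement a counter with no guard against
 * going below zero. Unsigned arithmetic wraps modulo 2^32. */
static uint32_t dec_unchecked(uint32_t counter)
{
    return counter - 1u;
}

/* The condition from the debug patch: the number of lost packets
 * should never exceed the number of packets in flight. */
static int lost_underflowed(uint32_t lost_out, uint32_t packets_out)
{
    return lost_out > packets_out;
}
```

Under these assumptions, a single extra decrement with lost_out already at 0 is enough to produce the value seen in the dump.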

Some more gdb output for information:

#0  tcp_xmit_retransmit_queue (sk=0xe43a0440) at net/ipv4/tcp_output.c:1962
1962                            __u8 sacked = TCP_SKB_CB(skb)->sacked;

(gdb) bt
#0  tcp_xmit_retransmit_queue (sk=0xe43a0440) at net/ipv4/tcp_output.c:1962
#1  0xc02f298a in tcp_ack (sk=0xe43a0440, skb=0xc75720c0, flag=1038) at net/ipv4/tcp_input.c:2524
#2  0xc02f5208 in tcp_rcv_established (sk=0xe43a0440, skb=0xc75720c0, th=0xeac35058, len=32) at net/ipv4/tcp_input.c:4502
#3  0xc02fa711 in tcp_v4_do_rcv (sk=0xe43a0440, skb=0xc75720c0) at net/ipv4/tcp_ipv4.c:1572
#4  0xc02fc557 in tcp_v4_rcv (skb=0xc75720c0) at net/ipv4/tcp_ipv4.c:1696
#5  0xc02e4961 in ip_local_deliver_finish (skb=0xc75720c0) at net/ipv4/ip_input.c:233
#6  0xc02e4d64 in ip_local_deliver (skb=0xc75720c0) at net/ipv4/ip_input.c:271
#7  0xc02e481d in ip_rcv_finish (skb=0xc75720c0) at include/net/dst.h:241
#8  0xc02e4cd4 in ip_rcv (skb=<value optimized out>, dev=0xc717c000, pt=<value optimized out>, orig_dev=0xc717c000) at net/ipv4/ip_input.c:445
#9  0xc02c9062 in netif_receive_skb (skb=0xc75720c0) at net/core/dev.c:2088
#10 0xc02cae8e in process_backlog (napi=0xc04b651c, quota=64) at net/core/dev.c:2125
#11 0xc02cab3f in net_rx_action (h=<value optimized out>) at net/core/dev.c:2195
#12 0xc0121d17 in __do_softirq () at kernel/softirq.c:232
#13 0xc0105975 in do_softirq () at arch/x86/kernel/irq_32.c:216
Backtrace stopped: previous frame inner to this frame (corrupt stack?)

(gdb) bt full
#0  tcp_xmit_retransmit_queue (sk=0xe43a0440) at net/ipv4/tcp_output.c:1962
        sacked = 176 '\260'
        skb = (struct sk_buff *) 0xe43a04b0
        packet_cnt = 0
#1  0xc02f298a in tcp_ack (sk=0xe43a0440, skb=0xc75720c0, flag=1038) at net/ipv4/tcp_input.c:2524
        packets_acked = 1
        sacked = 134 '\206'
        tp = <value optimized out>
        prior_snd_una = 2015065950
        ack_seq = 4044906321
        ack = 2015065959
        prior_in_flight = 2
        seq_rtt = -1
        frto_cwnd = <value optimized out>
#2  0xc02f5208 in tcp_rcv_established (sk=0xe43a0440, skb=0xc75720c0, th=0xeac35058, len=32) at net/ipv4/tcp_input.c:4502
        tcp_header_len = <value optimized out>
        tp = <value optimized out>
#3  0xc02fa711 in tcp_v4_do_rcv (sk=0xe43a0440, skb=0xc75720c0) at net/ipv4/tcp_ipv4.c:1572
        rsk = <value optimized out>
#4  0xc02fc557 in tcp_v4_rcv (skb=0xc75720c0) at net/ipv4/tcp_ipv4.c:1696
        err = -950591296
        filter = <value optimized out>
        iph = (const struct iphdr *) 0xeac35044
        th = (struct tcphdr *) 0xeac35058
        sk = (struct sock *) 0xe43a0440
        ret = <value optimized out>
#5  0xc02e4961 in ip_local_deliver_finish (skb=0xc75720c0) at net/ipv4/ip_input.c:233
        ret = <value optimized out>
        protocol = <value optimized out>
        hash = 0
        raw_sk = (struct sock *) 0x0
#6  0xc02e4d64 in ip_local_deliver (skb=0xc75720c0) at net/ipv4/ip_input.c:271
        __ret = -465959760
#7  0xc02e481d in ip_rcv_finish (skb=0xc75720c0) at include/net/dst.h:241
        iph = (const struct iphdr *) 0xeac35044
        rt = <value optimized out>
#8  0xc02e4cd4 in ip_rcv (skb=<value optimized out>, dev=0xc717c000, pt=<value optimized out>, orig_dev=0xc717c000) at net/ipv4/ip_input.c:445
        __ret = <value optimized out>
        iph = (struct iphdr *) 0xeac35044
        len = 3829007536
#9  0xc02c9062 in netif_receive_skb (skb=0xc75720c0) at net/core/dev.c:2088
        ptype = (struct packet_type *) 0xc0437a08
        pt_prev = <value optimized out>
        orig_dev = (struct net_device *) 0xc717c000
        ret = 1
        type = 8
#10 0xc02cae8e in process_backlog (napi=0xc04b651c, quota=64) at net/core/dev.c:2125
        skb = (struct sk_buff *) 0xe43a04b0
        dev = (struct net_device *) 0xc717c000
        work = 0
        start_time = 35819170
#11 0xc02cab3f in net_rx_action (h=<value optimized out>) at net/core/dev.c:2195
        n = (struct napi_struct *) 0xc04b651c
        work = 0
        weight = 64
        start_time = 35819170
        budget = 300
        have = (void *) 0x0
        __func__ = "net_rx_action"
        __warned = 0
#12 0xc0121d17 in __do_softirq () at kernel/softirq.c:232
        h = (struct softirq_action *) 0xc049e6b8
        pending = 1
        max_restart = 9
#13 0xc0105975 in do_softirq () at arch/x86/kernel/irq_32.c:216
        flags = 70
        irqctx = <value optimized out>
        isp = (u32 *) 0xc0439f14
        __func__ = "do_softirq"
        __warned = 0
Backtrace stopped: previous frame inner to this frame (corrupt stack?)

(gdb) p *((struct tcp_sock *) sk)
$1 = {inet_conn = {icsk_inet = {sk = {__sk_common = {skc_family = 2, skc_state = 1 '\001', skc_reuse = 1 '\001', skc_bound_dev_if = 0, skc_node = {next = 0x0, pprev = 0xc65e0f38}, skc_bind_node = {next = 0xe604acd0, pprev = 0xe42fa890}, skc_refcnt = {counter = 3}, skc_hash = 2132787687, skc_prot = 0xc042ef40, skc_net = 0xc04b64a0}, sk_shutdown = 0 '\0', sk_no_check = 0 '\0', sk_userlocks = 0 '\0', sk_protocol = 6 '\006', sk_type = 1, sk_rcvbuf = 87380, sk_lock = {slock = {raw_lock = {<No data fields>}}, owned = 0, wq = {lock = {raw_lock = {<No data fields>}}, task_list = {next = 0xe43a0474, prev = 0xe43a0474}}}, sk_backlog = {head = 0x0, tail = 0x0}, sk_sleep = 0xc9be8d98, sk_dst_cache = 0xc86eb200, sk_policy = {0x0, 0x0}, sk_dst_lock = {raw_lock = {<No data fields>}}, sk_rmem_alloc = {counter = 0}, sk_wmem_alloc = {counter = 0}, sk_omem_alloc = {counter = 0}, sk_sndbuf = 35520, sk_receive_queue = {next = 0xe43a04a4, prev = 0xe43a04a4, qlen = 0, lock =
 {raw_lock = {<No data fields>}}}, sk_write_queue = {next = 0xe43a04b0, prev = 0xe43a04b0, qlen = 0, lock = {raw_lock = {<No data fields>}}}, sk_async_wait_queue = {next = 0x0, prev = 0x0, qlen = 0, lock = {raw_lock = {<No data fields>}}}, sk_wmem_queued = 0, sk_forward_alloc = 4096, sk_allocation = 208, sk_route_caps = 0, sk_gso_type = 1, sk_rcvlowat = 1, sk_flags = 17152, sk_lingertime = 0, sk_error_queue = {next = 0xe43a04e8, prev = 0xe43a04e8, qlen = 0, lock = {raw_lock = {<No data fields>}}}, sk_prot_creator = 0xc042ef40, sk_callback_lock = {raw_lock = {<No data fields>}}, sk_err = 0, sk_err_soft = 0, sk_ack_backlog = 0, sk_max_ack_backlog = 50, sk_priority = 2, sk_peercred = {pid = 0, uid = 4294967295, gid = 4294967295}, sk_rcvtimeo = 2147483647, sk_sndtimeo = 2147483647, sk_filter = 0x0, sk_protinfo = 0x0, sk_timer = {entry = {next = 0x0, prev = 0xc049ef80}, expires = 34855119, function = 0xc02f8e64 <tcp_keepalive_timer>, data = 3829007424, base =
 0xc049e900}, sk_stamp = {tv64 = 3294967295}, sk_socket = 0xc9be8d80, sk_user_data = 0x0, sk_sndmsg_page = 0x0, sk_send_head = 0x0, sk_sndmsg_off = 0, sk_write_pending = 0, sk_security = 0x0, sk_state_change = 0xc02c2378 <sock_def_wakeup>, sk_data_ready = 0xc02c2b7c <sock_def_readable>, sk_write_space = 0xc02c6817 <sk_stream_write_space>, sk_error_report = 0xc02c2b12 <sock_def_error_report>, sk_backlog_rcv = 0xc02fa6e6 <tcp_v4_do_rcv>, sk_destruct = 0xc0306a97 <inet_sock_destruct>}, pinet6 = 0x0, daddr = 2282963090, rcv_saddr = 50374848, dport = 41928, num = 6881, saddr = 50374848, uc_ttl = -1, cmsg_flags = 0, opt = 0x0, sport = 57626, id = 11867, tos = 8 '\b', mc_ttl = 46 '.', pmtudisc = 1 '\001', recverr = 0 '\0', is_icsk = 1 '\001', freebind = 0 '\0', hdrincl = 0 '\0', mc_loop = 1 '\001', mc_index = 2, mc_addr = 0, mc_list = 0x0, cork = {flags = 0, fragsize = 0, opt = 0x0, rt = 0x0, length = 0, addr = 0, fl = {oif = 0, iif = 0, mark = 0, nl_u = {ip4_u
 = {daddr = 0, saddr = 0, tos = 0 '\0', scope = 0 '\0'}, ip6_u = {daddr = {in6_u = {u6_addr8 = {0 '\0' <repeats 16 times>}, u6_addr16 = {0, 0, 0, 0, 0, 0, 0, 0}, u6_addr32 = {0, 0, 0, 0}}}, saddr = {in6_u = {u6_addr8 = {0 '\0' <repeats 16 times>}, u6_addr16 = {0, 0, 0, 0, 0, 0, 0, 0}, u6_addr32 = {0, 0, 0, 0}}}, flowlabel = 0}, dn_u = {daddr = 0, saddr = 0, scope = 0 '\0'}}, proto = 0 '\0', flags = 0 '\0', uli_u = {ports = {sport = 0, dport = 0}, icmpt = {type = 0 '\0', code = 0 '\0'}, dnports = {sport = 0, dport = 0}, spi = 0, mht = {type = 0 '\0'}}, secid = 0}}}, icsk_accept_queue = {rskq_accept_head = 0x0, rskq_accept_tail = 0x0, syn_wait_lock = {raw_lock = {<No data fields>}}, rskq_defer_accept = 0 '\0', listen_opt = 0x0}, icsk_bind_hash = 0xc7137bd0, icsk_timeout = 35822107, icsk_retransmit_timer = {entry = {next = 0xe4302e94, prev = 0xc851c1d4}, expires = 35822107, function = 0xc02f91a8 <tcp_write_timer>, data = 3829007424, base = 0xc049e900},
 icsk_delack_timer = {entry = {next = 0x0, prev = 0x200200}, expires = 35816596, function = 0xc02f9026 <tcp_delack_timer>, data = 3829007424, base = 0xc049e900}, icsk_rto = 1860, icsk_pmtu_cookie = 1500, icsk_ca_ops = 0xc0430c20, icsk_af_ops = 0xc042ef00, icsk_sync_mss = 0xc02f59a9 <tcp_sync_mss>, icsk_ca_state = 3 '\003', icsk_retransmits = 0 '\0', icsk_pending = 0 '\0', icsk_backoff = 0 '\0', icsk_syn_retries = 0 '\0', icsk_probes_out = 0 '\0', icsk_ext_hdr_len = 0, icsk_ack = {pending = 0 '\0', quick = 14 '\016', pingpong = 0 '\0', blocked = 0 '\0', ato = 40, timeout = 35816596, lrcvtime = 35816556, last_seg_size = 0, rcv_mss = 1368}, icsk_mtup = {enabled = 0, search_high = 1420, search_low = 564, probe_size = 0}, icsk_ca_priv = {0, 3, 3, 0, 0, 0, 0, 0, 0, 0, 0, 32, 0, 0, 0, 0}}, tcp_header_len = 32, xmit_size_goal = 1368, pred_flags = 2520649856, rcv_nxt = 4044906321, copied_seq = 4044906321, rcv_wup = 4044906321, snd_nxt = 2015065959, snd_una =
 2015065959, snd_sml = 2015065959, rcv_tstamp = 35819170, lsndtime = 35818339, ucopy = {prequeue = {next = 0xe43a06ec, prev = 0xe43a06ec, qlen = 0, lock = {raw_lock = {<No data fields>}}}, task = 0x0, iov = 0x0, memory = 0, len = 0}, snd_wl1 = 4044906321, snd_wnd = 64088, max_window = 64088, mss_cache = 1368, window_clamp = 64087, rcv_ssthresh = 64087, frto_highmark = 0, reordering = 3 '\003', frto_counter = 0 '\0', nonagle = 0 '\0', keepalive_probes = 0 '\0', srtt = 8775, mdev = 764, mdev_max = 200, rttvar = 764, rtt_seq = 2015065959, packets_out = 0, retrans_out = 0, rx_opt = {ts_recent_stamp = 1194667355, ts_recent = 106039901, rcv_tsval = 106039901, rcv_tsecr = 35818339, saw_tstamp = 1, tstamp_ok = 1, dsack = 0, wscale_ok = 1, sack_ok = 0, snd_wscale = 2, rcv_wscale = 7, eff_sacks = 0 '\0', num_sacks = 0 '\0', user_mss = 0, mss_clamp = 1380}, snd_ssthresh = 2, snd_cwnd = 2, snd_cwnd_cnt = 1, snd_cwnd_clamp = 4294967295, snd_cwnd_used = 0,
 snd_cwnd_stamp = 35819170, out_of_order_queue = {next = 0xe43a0774, prev = 0xe43a0774, qlen = 0, lock = {raw_lock = {<No data fields>}}}, rcv_wnd = 64128, write_seq = 2015065959, pushed_seq = 2015065959, duplicate_sack = {{start_seq = 0, end_seq = 0}}, selective_acks = {{start_seq = 0, end_seq = 0}, {start_seq = 0, end_seq = 0}, {start_seq = 0, end_seq = 0}, {start_seq = 0, end_seq = 0}}, recv_sack_cache = {{start_seq = 0, end_seq = 0}, {start_seq = 0, end_seq = 0}, {start_seq = 0, end_seq = 0}, {start_seq = 0, end_seq = 0}}, highest_sack = 0, lost_skb_hint = 0x0, scoreboard_skb_hint = 0x0, retransmit_skb_hint = 0x0, forward_skb_hint = 0x0, fastpath_skb_hint = 0x0, fastpath_cnt_hint = 0, lost_cnt_hint = 1, retransmit_cnt_hint = 0, lost_retrans_low = 2015065959, advmss = 1448, prior_ssthresh = 3, lost_out = 4294967295, sacked_out = 0, fackets_out = 0, high_seq = 2015065959, retrans_stamp = 35794455, undo_marker = 2015065959, undo_retrans = 0, urg_seq =
 0, urg_data = 0, urg_mode = 0 '\0', ecn_flags = 0 '\0', snd_up = 0, total_retrans = 143, bytes_acked = 0, keepalive_time = 0, keepalive_intvl = 0, linger2 = 0, last_synq_overflow = 0, tso_deferred = 0, rcv_rtt_est = {rtt = 15645, seq = 4044968664, time = 35713869}, rcvq_space = {space = 10944, seq = 4044906321, time = 35816556}, mtu_probe = {probe_seq_start = 0, probe_seq_end = 0}}

My naive attempt at understanding what's going on:

My oops starts with:
BUG: unable to handle kernel NULL pointer dereference at virtual address 00000045


gdb tells me the crash is in:
#0  tcp_xmit_retransmit_queue (sk=0xe43a0440) at net/ipv4/tcp_output.c:1962
1962                            __u8 sacked = TCP_SKB_CB(skb)->sacked;

(gdb) p ((struct tcp_skb_cb *)((struct sk_buff *)0)->cb)->sacked
Cannot access memory at address 0x45

A 0x45 offset is definitely a ->sacked on a null skb, but:

(gdb) p skb
$5 = (struct sk_buff *) 0xe43a04b0

which is &sk->sk_write_queue (the list head itself), so I don't understand why tcp_for_write_queue_from ran an iteration.

I don't know if gdb is playing tricks or if it's because I had to recompile the crashing kernel.
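
One possible reading, offered as a guess rather than a diagnosis: if the loop only stops when skb equals the list head, then a NULL starting skb (what a peek on an empty queue would return) does not match the head, and the body would run once on NULL. That would fit a fault at a small offset such as 0x45. The user-space model below assumes that loop shape; struct node and count_iterations are hypothetical stand-ins and have not been checked against the actual macro:

```c
#include <stddef.h>

/* Hypothetical stand-in for a queue element. */
struct node {
    struct node *next;
};

/* Model of the assumed loop shape:
 *     for (; skb != head; skb = skb->next) { body }
 * Counts how many times the body would run. Returns -1 in the case
 * where the body would be entered with skb == NULL, which is where
 * a real loop would fault on its first dereference. */
static int count_iterations(const struct node *skb, const struct node *head)
{
    int n = 0;

    for (; skb != head; skb = skb->next) {
        if (skb == NULL)
            return -1;
        n++;
    }
    return n;
}
```

In this model, starting from the head itself gives zero iterations, while starting from NULL enters the body, so the skb value gdb reports and the faulting address would not both be consistent with the same starting point.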

Thanks.

-- 
Guillaume


