lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <alpine.DEB.2.00.1007161448330.13946@melkinpaasi.cs.helsinki.fi>
Date:	Fri, 16 Jul 2010 15:02:48 +0300 (EEST)
From:	"Ilpo Järvinen" <ilpo.jarvinen@...sinki.fi>
To:	Lennart Schulte <lennart.schulte@...s.rwth-aachen.de>
cc:	Eric Dumazet <eric.dumazet@...il.com>, Tejun Heo <tj@...nel.org>,
	"David S. Miller" <davem@...emloft.net>,
	lkml <linux-kernel@...r.kernel.org>,
	"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
	"Fehrmann, Henning" <henning.fehrmann@....mpg.de>,
	Carsten Aulbert <carsten.aulbert@....mpg.de>
Subject: Re: oops in tcp_xmit_retransmit_queue() w/ v2.6.32.15

On Thu, 15 Jul 2010, Lennart Schulte wrote:

> Since tcp_xmit_retransmit_queue also gets skb == NULL I'm pretty sure it is
> the same bug.
> Up to now I only experienced the problem with ACK loss (without ACK loss the
> test ran about 30min without problems, with ACK loss it had paniced within
> 10min).
> The data sender only has a HTB queue for traffic shaping (set to 20 Mbit/s).
> The ACK loss is done by another router.
> The setup looks like this. This way it seems to be the most realistic.
> 
> o sender with HTB
> |
> |
> o netem queue for forward path delay
> |
> o netem queue for a queue limit
> |
> o netem queue for backward path delay
> |
> o netem queue for ACK loss
> |
> |
> o receiver with HTB
> 
> Perhaps now it is a little big clearer.

> > > [ 2754.413150] NULL head, pkts 0
> > > [ 2754.413156] Errors caught so far 1

Thanks for reporting the results.

Could you post the oops too or double check do the timestamps really match 
(and there wasn't more "Errors caught" prints in between)? Since this 
condition doesn't seem to crash the kernel as also send_head should be 
NULL, which saves the day here exiting the loop (unless send head would 
too be corrupt). ...However, I don't like too much anyway that we can end 
up into tcp_xmit_retransmit_queue loop with packets_out being zero and 
only send_head check side-effect causes proper action.

Besides, Tejun has also found that it's hint->next ptr which is NULL in 
his case so this won't solve his case anyway. Tejun, can you confirm 
whether it was retransmit_skb_hint->next being NULL on _entry time_ to 
tcp_xmit_retransmit_queue() or later on in the loop after the updates done 
by the loop itself to the hint (or that your testing didn't conclude 
either)?

-- 
 i.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ