Message-ID: <FB8A4655DFD2B34DB16AE06DDDD6C0E231A71CE3@SJEXCHMB12.corp.ad.broadcom.com>
Date:	Wed, 5 Nov 2014 19:16:09 +0000
From:	"Charley (Hao Chuan) Chu" <charley.chu@...adcom.com>
To:	Cong Wang <cwang@...pensource.com>,
	Daniel Borkmann <borkmann@...earbox.net>
CC:	netdev <netdev@...r.kernel.org>
Subject: RE: Kernel Oops in __inet_twsk_kill()

Thanks Daniel and Cong,

The problem has been fixed. It was introduced by a third-party patch that decrements the refcnt of the time-wait socket.

Charley

-----Original Message-----
From: Cong Wang [mailto:cwang@...pensource.com] 
Sent: Wednesday, November 05, 2014 10:00 AM
To: Daniel Borkmann
Cc: Charley (Hao Chuan) Chu; netdev
Subject: Re: Kernel Oops in __inet_twsk_kill()

On Wed, Nov 5, 2014 at 8:00 AM, Daniel Borkmann <borkmann@...earbox.net> wrote:
> [ moving to netdev ]
>
> -------- Original Message --------
> Subject: Kernel Oops in __inet_twsk_kill()
> Date: Tue, 4 Nov 2014 23:47:18 +0000
> From: Charley (Hao Chuan) Chu <charley.chu@...adcom.com>
> To: linux-kernel@...r.kernel.org <linux-kernel@...r.kernel.org>
>
> We have a situation on our system in which the network interface is brought
> up and down every few seconds. Eventually this brings down the system: the
> kernel crashes on the BUG_ON in __inet_twsk_kill(). The debug messages show
> the following call flow.
>
> 1) A time-wait socket is created by tcp_time_wait() when the socket enters
> the TIME_WAIT state.
>     inet_twsk_alloc()      - refcnt = 0
>     inet_twsk_hashdance()  - refcnt = 3
>     inet_twsk_schedule()   - refcnt = 4
>     inet_twsk_put()        - refcnt = 3
> 2) tcp_v4_timewait_ack() is called when a sync is received.
>     inet_twsk_put()        - refcnt = 2   <== where we think the problem is
>     Occasionally a second sync is received, so inet_twsk_put() is called
>     twice - refcnt = 1
> 3) twdr_do_twkill_work() is called on timeout.
>     calls __inet_twsk_kill() - BUG_ON!!! as refcnt = 2 (supposed to be 3)
>     calls inet_twsk_put()
>
> In the normal case, the call flow has only step 1 and step 3. Our
> understanding is that the time-wait socket holds three references: ehash,
> bhash and the timer death row. Step 2 touches none of them. Can anyone here
> explain to us why inet_twsk_put() is called in tcp_v4_timewait_ack()?
>
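
The refcnt arithmetic in the call flow above can be replayed with a small model. This is only a sketch in Python, not kernel code: the function names mirror the kernel functions named in the trace, the increments are taken from the reported counts, and the BUG_ON condition (firing when the count does not exceed the two hash references being released) is an assumption chosen to be consistent with the reported crash at refcnt = 2.

```python
class TimewaitSock:
    """Toy stand-in for a time-wait socket; only the refcnt is modelled."""

    def __init__(self):
        self.refcnt = 0      # value reported after inet_twsk_alloc()
        self.freed = False


def inet_twsk_hashdance(tw):
    tw.refcnt += 3           # trace reports refcnt = 3 after this call


def inet_twsk_schedule(tw):
    tw.refcnt += 1           # trace reports refcnt = 4 after this call


def inet_twsk_put(tw):
    tw.refcnt -= 1
    if tw.refcnt == 0:
        tw.freed = True      # last reference dropped


def inet_twsk_kill(tw):
    dropped = 2              # assumption: the ehash and bhash references
    if dropped >= tw.refcnt:
        # assumption: stands in for the BUG_ON reported in the trace
        raise RuntimeError("BUG_ON: refcnt=%d" % tw.refcnt)
    tw.refcnt -= dropped


def replay(extra_puts):
    """Replay steps 1 to 3 with `extra_puts` calls from step 2."""
    tw = TimewaitSock()
    # Step 1: tcp_time_wait()
    inet_twsk_hashdance(tw)
    inet_twsk_schedule(tw)
    inet_twsk_put(tw)        # refcnt = 3
    # Step 2: puts attributed to tcp_v4_timewait_ack()
    for _ in range(extra_puts):
        inet_twsk_put(tw)
    # Step 3: twdr_do_twkill_work()
    inet_twsk_kill(tw)
    inet_twsk_put(tw)
    return tw
```

With no extra put (`replay(0)`) the count goes 3, 1, 0 and the socket is freed. With one or two extra puts from step 2 the model hits the check in step 3, which matches the reported behaviour.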

It has been there for a rather long time, but that doesn't mean it is
correct. Its caller calls inet_twsk_put() on the error path, so it smells
wrong to call it on the non-error path as well. But I haven't looked into this.
