Message-ID: <5609F377.2090905@tomt.net>
Date:	Tue, 29 Sep 2015 04:12:07 +0200
From:	"Andre Tomt (LKML)" <lkml@...t.net>
To:	Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
	linux-kernel@...r.kernel.org, Julian Anastasov <ja@....bg>,
	"David S. Miller" <davem@...emloft.net>
Cc:	stable@...r.kernel.org,
	"Eric W. Biederman" <ebiederm@...ssion.com>,
	Stephen Hemminger <stephen@...workplumber.org>,
	holger.hoffstaette@...glemail.com
Subject: Re: [PATCH 4.1 125/159] net: call rcu_read_lock early in
 process_backlog

On 26. sep. 2015 22:56, Greg Kroah-Hartman wrote:
> 4.1-stable review patch.  If anyone has any objections, please let me know.
>
> ------------------
>
> From: Julian Anastasov <ja@....bg>
>
> [ Upstream commit 2c17d27c36dcce2b6bf689f41a46b9e909877c21 ]
>
> Incoming packet should be either in backlog queue or
> in RCU read-side section. Otherwise, the final sequence of
> flush_backlog() and synchronize_net() may miss packets
> that can run without device reference:
<snip>

Several of our systems running 4.1.9-rc1 are experiencing hangs that
require a hardware or sysrq reset with this patch applied. Reverting it
fixes the hangs completely.

4.2 includes this patch as well but I have no such problems there. 
4.2.2-rc1 works fine as well.

For now I think this patch should be reverted in 4.1.9.

The hangs have occurred so far on Xen PV and KVM x86_64 virtual machines.
They hang completely within minutes or hours, depending on the type of
workload. The workloads are all fairly light: one runs low-traffic
email/antispam, another runs monitoring and metrics for ~5 hosts, and
one runs a single terminal IRC client. All but the IRC one hang within
a few minutes of booting.

When they lock up they respond only to sysrq: ttyS0/hvc0 does not echo
back anything typed in, and they are completely dead on the network.
One system managed to report RCU stalls but no backtraces (I'll look
over the debug config, if there is any interest).

My bare metal desktop has not hit it yet, but that might be entirely
down to a different type of workload.

Something missing in 4.1?
