Message-ID: <5609F377.2090905@tomt.net>
Date: Tue, 29 Sep 2015 04:12:07 +0200
From: "Andre Tomt (LKML)" <lkml@...t.net>
To: Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
linux-kernel@...r.kernel.org, Julian Anastasov <ja@....bg>,
"David S. Miller" <davem@...emloft.net>
Cc: stable@...r.kernel.org,
"Eric W. Biederman" <ebiederm@...ssion.com>,
Stephen Hemminger <stephen@...workplumber.org>,
holger.hoffstaette@...glemail.com
Subject: Re: [PATCH 4.1 125/159] net: call rcu_read_lock early in
process_backlog
On 26. sep. 2015 22:56, Greg Kroah-Hartman wrote:
> 4.1-stable review patch. If anyone has any objections, please let me know.
>
> ------------------
>
> From: Julian Anastasov <ja@....bg>
>
> [ Upstream commit 2c17d27c36dcce2b6bf689f41a46b9e909877c21 ]
>
> Incoming packet should be either in backlog queue or
> in RCU read-side section. Otherwise, the final sequence of
> flush_backlog() and synchronize_net() may miss packets
> that can run without device reference:
<snip>
Several of our systems running 4.1.9-rc1 hang with this patch applied
and need a hardware or sysrq reset to recover. Reverting it fixes the
hangs completely.
4.2 includes this patch as well but I have no such problems there.
4.2.2-rc1 works fine as well.
For now I think this patch should be reverted in 4.1.9.
The hangs have so far occurred on Xen PV and KVM x86_64 virtual machines.
They hang completely within minutes or hours, depending on the type of
workload. The workloads are all fairly light: one runs low-traffic
email/antispam, another runs monitoring and metrics for ~5 hosts, and
one runs a single terminal IRC client. All but the IRC one hang within
a few minutes of booting.
When they lock up they respond only to sysrq: ttyS0/hvc0 does not echo
back anything typed, and they are completely dead on the network.
One system managed to report RCU stalls but no backtraces (I'll look
over the debug config if there is any interest).
My bare-metal desktop has yet to hit it, but that might be entirely
down to a different type of workload.
Something missing in 4.1?