Message-ID: <560A661C.7000406@tomt.net>
Date:	Tue, 29 Sep 2015 12:21:16 +0200
From:	Andre Tomt <andre@...t.net>
To:	Julian Anastasov <ja@....bg>
Cc:	Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
	linux-kernel@...r.kernel.org,
	"David S. Miller" <davem@...emloft.net>, stable@...r.kernel.org,
	"Eric W. Biederman" <ebiederm@...ssion.com>,
	Stephen Hemminger <stephen@...workplumber.org>,
	holger.hoffstaette@...glemail.com
Subject: Re: [PATCH 4.1 125/159] net: call rcu_read_lock early in
 process_backlog

On 29. sep. 2015 10:39, Andre Tomt (LKML) wrote:
> I just had another hang with it reverted on two different guests..
> However it took nearly 6 hours rather than the usual "few minutes" for
> these two. So now I'm a little unsure about my initial conclusions.
> 
> On 29. sep. 2015 09:40, Julian Anastasov wrote:
>> On Tue, 29 Sep 2015, Andre Tomt (LKML) wrote:
<snip>
>> 	They are 2 related patches, the first one is
>> [PATCH 4.1 124/159] net: do not process device backlog during unregistration
> 
> Would reverting this change anything outside device unregistration at all?
> 
>> 	But the problematic patch calls rcu_read_lock while
>> local IRQ is disabled (in process_backlog), this is something
>> that should be noted for the patch. I'll try to see what Xen does.
>> It would be useful to see .config and any kind of backtraces/stalls,
>> it will help also to other developers to catch the problem...
> 
> 4.1 and 4.2 configs attached
> 
> I'll see if I can get some more debugging options enabled and a fully
> mainline test kernel, I still got a few local patches for security/ and
> runtime modify_ldt switching lurking in here..

> So far the kernels have not produced any output, other than a RCU stall
> detected message without any backtrace or other information, and that
> was just one time out of a couple dozen hangs.

Just got the hang on a completely mainline 4.1.9-rc1, no additional
patches and no reverts. It happened fairly quickly again (<5 min), so
there's that.

I enabled CONFIG_RCU_CPU_STALL_INFO=y and disabled a bunch of non-virt
drivers to speed up debugging. But no output this time either. Got any
ideas on debugging options I've forgotten? Useful sysrqs?

Meanwhile I'll revert both the mentioned net patches and see how it goes.

View attachment "config-4.1.9-rc1" of type "text/plain" (104778 bytes)
