lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Sat, 21 Nov 2015 10:16:59 +0200
From:	Andrew <nitr0@...i.kr.ua>
To:	netdev@...r.kernel.org
Subject: Re: Kernel 4.1.12 crash

Memory corruption, if happens, IMHO shouldn't be a hardware-related - 
almost all of these boxes, except H61M-based box from 1st log, works for 
a long time with uptime more than year; and only software was changed on 
it; H61M-based box runs memtest86 for a tens of hours w/o any error. If 
it was caused by hardware - they should crash even earlier.

Rarely on different servers I saw 'zram decompression error' messages 
(in this case I've got such message on H61M-based box).

Also, other people that uses accel-ppp as BRAS software, have different 
kernel panics/bugs/oopses on fresh kernels.

I'll try to apply these patches, and I'll try to switch back to kernels 
that were stable on some boxes.

21.11.2015 01:13, Alexander Duyck пишет:
> On 11/20/2015 05:58 AM, Andrew wrote:
>> Hi all.
>>
>> Today some BRASes on 4.1.12 kernel were crashed.
>>
>> Here's crash traces: http://pastebin.com/p68hNS8R
>> http://pastebin.com/36ieRAM2 http://pastebin.com/3BRTVEB6
>>
>> On 3.2 kernel same hardware works OK, troubles were noticed after kernel
>> upgrade.
>>
>> What additional info is needed?
>
> Looking over the traces there seem to be two areas called out.
>
> The first is the fib_trie resize BUG_ON that was triggered due to the 
> parent and child not being associated.  I think that might be due to 
> memory corruption as I cannot find any spots where we are resizing 
> without correctly setting up the parent-child relationship of the 
> nodes first.
>
> The other spot that is showing up is ppp_shutdown_interface and it's 
> related path.  It looks like there are a couple of patches you could 
> try back-porting to see if it resolves the issue.  If they do then 
> perhaps they should be considered candidates for stable:
>
> 8cb775bc0a3 ("ppp: fix device unregistration upon netns deletion")
> 58a89ecaca5 ("ppp: fix lockdep splat in ppp_dev_uninit()")
>
> - Alex

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ