linux-kernel - RE: [KERNEL BUG] do_timer/tick_handover_do

Message-ID: <5C6899BCED92C94EBDCC00F80838E3D52113ABD3@SJEXCHMB06.corp.ad.broadcom.com>
Date:	Fri, 8 May 2015 05:21:30 +0000
From:	"Oza (Pawandeep) Oza" <oza@...adcom.com>
To:	Mike Galbraith <umgwanakikbuti@...il.com>
CC:	pawandeep oza <oza.contri.linux.kernel@...il.com>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	malayasen rout <malayasen.rout@...il.com>
Subject: RE: [KERNEL BUG] do_timer/tick_handover_do_timer 3.10.17

> It seems odd to me to use BUG() for what you appear to be using it for..
> not that I know exactly what that is, mind you, but when you said when
> some other gizmo in your box has a problem you crash the kernel, my head
> tilted to the side - surely there's a more controlled response possible
> than poking the big red self destruct button ;-)

Oza:
We have to keep the red button as our last resort. If we don't press it, the moment passes and we miss the point at which we could still go back and debug.
So this is by design.

Regards,
-Oza

-----Original Message-----
From: Mike Galbraith [mailto:umgwanakikbuti@...il.com] 
Sent: Friday, May 08, 2015 10:42 AM
To: Oza (Pawandeep) Oza
Cc: pawandeep oza; linux-kernel@...r.kernel.org; malayasen rout
Subject: Re: [KERNEL BUG] do_timer/tick_handover_do_timer 3.10.17

On Fri, 2015-05-08 at 04:16 +0000, Oza (Pawandeep) Oza wrote:
> So Mike, is this reason strong enough for you?

Nope.  I think you did the right thing in removing your dependency on
jiffies reliability in a dying box.  You don't have to convince me of
anything though, CC timer subsystem maintainer, see what he says.

> I understand your point: solve the BUG, and I do tend to agree with you.
> 
> But by design and implementation, BUG() is just the beginning of the end for a dying kernel.
> And what happens between 'the beginning' and 'the end' is no less important.
> (For example, on our platform we want a clean RAMDUMP to analyze what happened, and for that we need a clean reboot.)

I don't see anybody else having any trouble getting crash dumps.  I
spent yet another long day just yesterday, rummaging through one.

> Also, if somebody's design is to legitimately crash the kernel (e.g. when the kernel itself is not actually faulty),
> then I expect the tick/timekeeping framework to keep doing its job for as long as it can, because the kernel is not at fault.
> But in this case it doesn't hand over the jiffies-incrementing job sanely.

It seems odd to me to use BUG() for what you appear to be using it for..
not that I know exactly what that is, mind you, but when you said when
some other gizmo in your box has a problem you crash the kernel, my head
tilted to the side - surely there's a more controlled response possible
than poking the big red self destruct button ;-)

	-Mike