Message-ID: <20080813084931.GC5367@ff.dom.local>
Date:	Wed, 13 Aug 2008 08:49:31 +0000
From:	Jarek Poplawski <jarkao2@...il.com>
To:	Denys Fedoryshchenko <denys@...p.net.lb>
Cc:	netdev@...r.kernel.org
Subject: Re: NMI lockup, 2.6.26 release

On Wed, Aug 13, 2008 at 11:02:34AM +0300, Denys Fedoryshchenko wrote:
> As long as the kernel reboots itself, it won't hurt me much.
> With the NMI watchdog I noticed that the panic was missing, so nmi_watchdog
> showed a message but did not reboot. That is fixed in the next kernel, and I
> patched my kernel, so I guess it will no longer crash and freeze, and I will
> not need to run to the power switch at night.
> 
> It may be related to another problem (some corruption) which is not fixed yet,
> so it would be preferable to show the timer guys the exact location of the
> problem.
> 
> Maybe you can make some patch like:
> 
> +	if (q->next_watchdog < q->now || next_event <=
> +	     q->next_watchdog - PSCHED_TICKS_PER_SEC / (10 * HZ)) {
> +		qdisc_watchdog_schedule(&q->watchdog, next_event);
> +		q->next_watchdog = next_event;
> +	} else {
> +		BUG();	/* or something like it */
> +	}
> ?

I don't think that's right: there could well be small time differences
between CPUs on SMP, or some hardware-related inaccuracy, but I don't
think this is the right place or method to verify that. And, e.g.,
re-scheduling with the same time shouldn't be wrong either.
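
To make the second point concrete, here is a minimal user-space sketch of
the quoted condition. It is not kernel code: the function name
would_reschedule(), the plain uint64_t times, and the values chosen for
TICKS_PER_SEC and HZ_ASSUMED are assumptions for illustration only (the
real PSCHED_TICKS_PER_SEC and HZ depend on the kernel version and .config).

```c
#include <stdint.h>

/* Assumed values, for illustration only. */
#define TICKS_PER_SEC 1000000ULL
#define HZ_ASSUMED    1000ULL

/*
 * Mirrors the condition from the quoted patch fragment.
 * Returns 1 if the watchdog would be (re)scheduled,
 * 0 if the proposed else branch (the BUG()-like path) would be taken.
 */
static int would_reschedule(uint64_t now, uint64_t next_watchdog,
			    uint64_t next_event)
{
	/* Slack of one tenth of a jiffy, expressed in ticks. */
	uint64_t slack = TICKS_PER_SEC / (10 * HZ_ASSUMED);

	/*
	 * Unsigned arithmetic: the subtraction wraps if next_watchdog is
	 * smaller than slack, so callers of this sketch should pass
	 * next_watchdog >= slack.
	 */
	return next_watchdog < now ||
	       next_event <= next_watchdog - slack;
}
```

With these assumed values the slack is 100 ticks, so
would_reschedule(1000, 5000, 5000) returns 0: asking for the same time
again would land in the else branch, although nothing is actually wrong
in that case.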

Anyway, narrowing the problem down with such tests should give us a
better understanding of what the real problem could be. BTW, could you
"remind" us of the .config on this box (especially the various *HZ*,
*TIME* and *TIMERS* settings)?

> I will probably also try to migrate to the "rc" versions of the kernel to
> see if the problem still exists there; a lot of changes were made there...
> Has the HTB corruption problem finally been tracked down completely? I have
> seen some discussions about it recently...

I doubt the current rc versions are stable enough for production use.
HTB is waiting for one fix, but it's nothing critical if it hasn't
bothered you until now. There could still be some problems around the
schedulers in general after the recent big changes.

Jarek P.
