Message-Id: <200902202108.55332.philippe.grenard@laposte.net>
Date:	Fri, 20 Feb 2009 21:08:55 +0100
From:	Philippe Grenard <philippe.grenard@...oste.net>
To:	linux-kernel@...r.kernel.org
Cc:	"John Stoffel" <john@...ffel.org>
Subject: Re: PROBLEM: cannot get stable system since 2.6.28 kernel (amd64)

On Tuesday 17 February 2009 05:46:47 you wrote:
> >>>>> "Philippe" == Philippe Grenard <philippe.grenard@...oste.net> writes:
>
> Philippe> well, I don't really believe in hardware problem for two
> Philippe> reasons : 1st, nearly all my hardware is quite new ( < 1
> Philippe> year old ), which is, I agree, not a solid proof ;-) 2nd,
> Philippe> the problem is really repetitive : every time I boot on
> Philippe> older kernel, everything works like a charm, every time I
> Philippe> boot on the newer kernel, I end up crashing : the "random"
> Philippe> part is only the time before crash....
>
> Philippe> I haven't any "overclocking" settings, and every hardware
> Philippe> and bios settings are the same : same computer, same
> Philippe> harddisk partition, and so on.  2.6.28 will everytime stop
> Philippe> after "Booting the kernel".  2.6.29-rc* will boot, but then
> Philippe> stalls after a random delay... Except the "/proc/cpuinfo"
> Philippe> difference between the two kernels, I don't have a clue....
>
> Philippe> The thing is I can continue using the old kernel, but I
> Philippe> thought I better report this since It could hide some
> Philippe> regression on amd64 systems ?
>
> It's certainly sounding like a regression, or misconfiguration
> somewhere.  Can you start doing a 'git bisect' routine on this to see
> if you can find the commit which causes this regression?
>
> It will take around 10 or so recompiles and reboots, but should do
> the trick, or at least help narrow things down a lot.  Read
> Documentation/BUG-HUNTING for how to do a bisect run.
>
> You could also go back and re-build your 2.6.28 kernel and config from
> scratch to confirm that it's really working properly.  Then, jump to
> 2.6.29-rc5 (just released) and do a 'make oldconfig' and see what
> you're prompted for.  If it still crashes, send us the original
> config, and a diff to the new config (diff -u) so we can look it over.
>
> Also, make sure you're not running *any* binary kernel modules, since
> that will make us completely ignore you.  We can't debug issues like
> this when nVidia graphics modules are in the system since we can't
> tell what that module does to the rest of the kernel and it's not
> worth our time to figure out.
>
> I'm running 2.6.29-rc3 on my AMD64 system and it's been nice and
> stable and my hardware too is under a year old.
>
> John

Hello, and sorry for the late reply; I have been short of time these last few
days...

So I played a bit with "git bisect" between the working 2.6.27 and the
non-working (not even booting) 2.6.28-rc1.
I finally got the following result:
dc1e35c6e95e8923cf1d3510438b63c600fee1e2 is first bad commit
commit dc1e35c6e95e8923cf1d3510438b63c600fee1e2
Author: Suresh Siddha <suresh.b.siddha@...el.com>
Date:   Tue Jul 29 10:29:19 2008 -0700

    x86, xsave: enable xsave/xrstor on cpus with xsave support

    Enables xsave/xrstor by turning on cr4.osxsave on cpu's which have
    the xsave support. For now, features that OS supports/enabled are
    FP and SSE.

    Signed-off-by: Suresh Siddha <suresh.b.siddha@...el.com>
    Signed-off-by: H. Peter Anvin <hpa@...or.com>
    Signed-off-by: Ingo Molnar <mingo@...e.hu>

:040000 040000 2da7423ef94db281ca28b8601a821079585bbc64 f601bfcd48c1eea8241042615426ce4e59f33495 M      arch
:040000 040000 612bc82743393f2416f2a0e621fdc551e2ea039d 2019464b8f6c1f2f608fd69384c6b3e466802150 M      include

A bit of googling turned up this thread:
http://lkml.org/lkml/2009/1/19/161
which corresponds exactly to my problem with 2.6.28!

One "solution" proposed there was to check the BIOS setting "cpuid value limit"
and disable it if it is enabled. This did let me boot 2.6.28 and the previously
failing kernels compiled during the git bisect, but I still got the "random
freeze" anyway....
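
As a quick cross-check of that BIOS setting from a running kernel, here is a
minimal Python sketch. It assumes the usual x86 /proc/cpuinfo layout with
"cpuid level" and "flags" fields, and it assumes that a maximum CPUID leaf
below 0xD combined with an advertised xsave flag is the combination worth
looking at, since leaf 0xD is where the XSAVE state is enumerated. The function
names are made up for illustration, and this only inspects what the kernel
reports; it does not prove the cause of the freeze.

```python
# Hypothetical helpers (not from the kernel tree): summarise, per CPU,
# the "cpuid level" and whether the "xsave" flag is listed.

XSAVE_LEAF = 0xD  # CPUID leaf that enumerates XSAVE state components


def parse_cpuinfo(text):
    """Split /proc/cpuinfo text into one dict of 'key: value' pairs per CPU."""
    cpus = []
    current = {}
    for line in text.splitlines():
        if not line.strip():
            # A blank line ends one processor block.
            if current:
                cpus.append(current)
                current = {}
            continue
        key, sep, value = line.partition(":")
        if sep:
            current[key.strip()] = value.strip()
    if current:
        cpus.append(current)
    return cpus


def xsave_report(text):
    """Return one summary dict per CPU found in the given cpuinfo text."""
    report = []
    for cpu in parse_cpuinfo(text):
        level = int(cpu.get("cpuid level", "-1"))
        has_xsave = "xsave" in cpu.get("flags", "").split()
        report.append({
            "processor": cpu.get("processor"),
            "cpuid_level": level,
            "xsave": has_xsave,
            # Assumption: xsave advertised while leaf 0xD is out of reach.
            "suspicious": has_xsave and level < XSAVE_LEAF,
        })
    return report
```

On a live system the input would be the contents of /proc/cpuinfo, for
example xsave_report(open("/proc/cpuinfo").read()), run once under each
kernel so the two reports can be compared.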

Some parts of that thread are far from clear to me...
Please also find attached my currently working config file (config-2.6.27.17) and
the "diff -u" against the non-working 2.6.29-rc5.
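
For anyone who prefers to skim such a config comparison by option rather than
by line, a minimal Python sketch follows. It assumes the standard .config
syntax ("CONFIG_X=value" and "# CONFIG_X is not set"); the function names are
hypothetical, and it is only a convenience next to the real "diff -u" output.

```python
# Hypothetical helpers: compare two kernel .config files option by option.


def config_options(text):
    """Map each option name to its value; options marked 'is not set' map to 'n'."""
    suffix = " is not set"
    opts = {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("CONFIG_") and "=" in line:
            name, _, value = line.partition("=")
            opts[name] = value
        elif line.startswith("# CONFIG_") and line.endswith(suffix):
            # Drop the leading "# " and the trailing " is not set".
            opts[line[2:-len(suffix)]] = "n"
    return opts


def changed_options(old_text, new_text):
    """Return {option: (old_value, new_value)} for every option that differs.

    A value of None means the option does not appear in that file at all.
    """
    old = config_options(old_text)
    new = config_options(new_text)
    return {
        name: (old.get(name), new.get(name))
        for name in sorted(set(old) | set(new))
        if old.get(name) != new.get(name)
    }
```

Feeding it the text of the old and new config files gives a short dictionary
of the options that changed, which can be easier to scan than a long unified
diff.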

Thank you again for your help,

Regards,

Philippe

View attachment "config-2.6.27.17" of type "text/plain" (61047 bytes)

View attachment "diff-u_config" of type "text/x-patch" (34957 bytes)
