Date:	09 Jul 2008 13:53:32 -0700
From:	Philippe Troin <phil@...i.org>
To:	john stultz <johnstul@...ibm.com>
Cc:	linux-kernel@...r.kernel.org, macro@...ux-mips.org
Subject: Re: 2.6.25.9: system clocks works normally then speeds up 4x...

john stultz <johnstul@...ibm.com> writes:

> On Wed, 2008-07-09 at 13:01 -0700, Philippe Troin wrote:
> > "john stultz" <johnstul@...ibm.com> writes:
> > 
> > > On Wed, Jul 9, 2008 at 12:21 PM, Philippe Troin <phil@...i.org> wrote:
> > > >
> > > > Symptoms:
> > > >
> > > >  The system boots fine. Clock seems to run normally.
> > > >
> > > >  Then after a random amount of time (on the current boot, 3 days),
> > > >  clock starts to be running 2-4x faster (on the current boot, 4x).
> > > >
> > > >  I have tried booting with "nohz=off highres=off" but it does not
> > > >  help.
> > > 
> > > Could you provide the output from the following:
> > >    sudo cat /sys/devices/system/clocksource/clocksource0/*
> > 
> > Sure.
> > 
> > It is:
> > available: jiffies tsc 
> > current:   jiffies
> > 
> > > Did this issue occur with 2.6.24 or earlier kernels?
> > 
> > No.  It started with 2.6.25.
> > 
> > Interestingly:
> > 
> >   I've just modified the current clocksource to tsc and the clock went
> >   back to its normal speed.
> > 
> >   Then I reset the current clocksource to jiffies, and the clock went
> >   back to its (wrong) 4x speed.
> > 
> > So it looks like the kernel is counting jiffies 4x too fast.
> 
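
For anyone wanting to record the clocksource state alongside other
diagnostics, here is a minimal sketch that reads the same sysfs
directory.  The function name read_clocksource is made up for
illustration; the two file names are the standard sysfs entries
(available_clocksource and current_clocksource).

```python
from pathlib import Path

# Directory queried earlier in this thread.
SYSFS_DIR = "/sys/devices/system/clocksource/clocksource0"


def read_clocksource(base=SYSFS_DIR):
    """Return (list of available clocksources, current clocksource)."""
    base = Path(base)
    available = (base / "available_clocksource").read_text().split()
    current = (base / "current_clocksource").read_text().strip()
    return available, current


# Only query the real sysfs tree when it exists (i.e. on Linux).
if Path(SYSFS_DIR).is_dir():
    available, current = read_clocksource()
    print("available:", " ".join(available))
    print("current:  ", current)
```

Switching the clocksource, as described above, is done by writing a
name from the available list into current_clocksource as root.
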
> When you're seeing the issue, can you do the following:
>   cat /proc/interrupts > interrupts
> 
>   <wait 10 seconds by your wristwatch> 
> 
>   cat /proc/interrupts >> interrupts
> 
> And send the results?

There you are:

           CPU0       CPU1
  0:        353          0   IO-APIC-edge      timer
  1:          0          8   IO-APIC-edge      i8042
  2:          0          0    XT-PIC-XT        cascade
  3:          0          2   IO-APIC-edge
  4:      32796         68   IO-APIC-edge      serial
  8:          1          0   IO-APIC-edge      rtc
 14:     665397      37592   IO-APIC-edge      pata_via
 15:          0          0   IO-APIC-edge      pata_via
 16:   11417314     784937   IO-APIC-fasteoi   ohci_hcd:usb2, aic7xxx, firewire_ohci
 17:   11695442    1165240   IO-APIC-fasteoi   ohci_hcd:usb3, eth1
 18:   14967468    1533627   IO-APIC-fasteoi   ehci_hcd:usb1, eth0
 19:    1526542     363432   IO-APIC-fasteoi   uhci_hcd:usb4, eth2
NMI:          0          0   Non-maskable interrupts
LOC:  546305845   33155722   Local timer interrupts
RES:    4502087    5460357   Rescheduling interrupts
CAL:     816244    3856944   function call interrupts
TLB:     604097    1266758   TLB shootdowns
TRM:          0          0   Thermal event interrupts
SPU:          0          0   Spurious interrupts
ERR:          0
MIS:          0

Roughly 10 seconds later:

           CPU0       CPU1
  0:        353          0   IO-APIC-edge      timer
  1:          0          8   IO-APIC-edge      i8042
  2:          0          0    XT-PIC-XT        cascade
  3:          0          2   IO-APIC-edge
  4:      32796         68   IO-APIC-edge      serial
  8:          1          0   IO-APIC-edge      rtc
 14:     665481      37592   IO-APIC-edge      pata_via
 15:          0          0   IO-APIC-edge      pata_via
 16:   11417335     784937   IO-APIC-fasteoi   ohci_hcd:usb2, aic7xxx, firewire_ohci
 17:   11695614    1165240   IO-APIC-fasteoi   ohci_hcd:usb3, eth1
 18:   14967672    1533627   IO-APIC-fasteoi   ehci_hcd:usb1, eth0
 19:    1526542     363432   IO-APIC-fasteoi   uhci_hcd:usb4, eth2
NMI:          0          0   Non-maskable interrupts
LOC:  546361653   33156517   Local timer interrupts
RES:    4502100    5460379   Rescheduling interrupts
CAL:     816244    3856944   function call interrupts
TLB:     604097    1266758   TLB shootdowns
TRM:          0          0   Thermal event interrupts
SPU:          0          0   Spurious interrupts
ERR:          0
MIS:          0
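
To make the two snapshots easier to compare, here is a rough sketch
that diffs them and prints per-second rates.  The function names are
made up for illustration, the sample text is a subset of the two
snapshots in this message, and the 10-second interval is only the
approximate wristwatch figure, so the rates are approximate too.

```python
def parse_interrupts(text):
    """Map each interrupt label (e.g. 'LOC', '16') to its per-CPU counts."""
    lines = text.strip().splitlines()
    ncpus = len(lines[0].split())  # header row: CPU0 CPU1 ...
    counts = {}
    for line in lines[1:]:
        label, sep, rest = line.partition(":")
        if not sep:
            continue  # continuation of a wrapped device list
        fields = rest.split()[:ncpus]
        if not fields or not all(f.isdigit() for f in fields):
            continue
        counts[label.strip()] = [int(f) for f in fields]
    return counts


def rate_per_second(before, after, seconds):
    """Per-CPU interrupt rate for every label present in both snapshots."""
    rates = {}
    for label, new in after.items():
        old = before.get(label)
        if old is None or len(old) != len(new):
            continue
        rates[label] = [(n - o) / seconds for o, n in zip(old, new)]
    return rates


# Subset of the two snapshots in this message.
BEFORE = """\
           CPU0       CPU1
  0:        353          0   IO-APIC-edge      timer
LOC:  546305845   33155722   Local timer interrupts
"""
AFTER = """\
           CPU0       CPU1
  0:        353          0   IO-APIC-edge      timer
LOC:  546361653   33156517   Local timer interrupts
"""

rates = rate_per_second(parse_interrupts(BEFORE), parse_interrupts(AFTER), 10.0)
print(rates["LOC"])  # [5580.8, 79.5]
print(rates["0"])    # [0.0, 0.0]
```

With these numbers, LOC on CPU0 advanced by 55808 over the interval
(about 5581 per second) and by 795 on CPU1 (about 80 per second),
while IRQ 0 did not move at all.  The configured HZ of this kernel is
not stated in this thread, so the sketch does not try to derive a
speed-up ratio from these rates.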
 
> Could you also try booting with noapic to see if that changes anything?

Sure.  That means rebooting and losing the "wedged" system, though.
Is there anything else I should check on it before I lose the broken
state?

Also keep in mind that the symptoms take a while to manifest
(typically a few days).

Phil.