Message-ID: <3463581.AQ3EkyvJ7g@vostro.rjw.lan>
Date:	Thu, 30 May 2013 16:34:51 +0200
From:	"Rafael J. Wysocki" <rjw@...k.pl>
To:	Yinghai Lu <yinghai@...nel.org>
Cc:	Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
	ACPI Devel Maling List <linux-acpi@...r.kernel.org>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Toshi Kani <toshi.kani@...com>
Subject: Re: System slow down from udev

[Adding CC to Toshi Kani just in case he has an idea.]

On Wednesday, May 29, 2013 06:55:33 PM Yinghai Lu wrote:
> On Wed, May 29, 2013 at 4:55 PM, Rafael J. Wysocki <rjw@...k.pl> wrote:
> > On Wednesday, May 29, 2013 03:49:38 PM Yinghai Lu wrote:
> >> On Wed, May 29, 2013 at 2:34 PM, Rafael J. Wysocki <rjw@...k.pl> wrote:
> >> > On Wednesday, May 29, 2013 01:13:46 PM Yinghai Lu wrote:
> >> >> On Wed, May 29, 2013 at 4:29 AM, Rafael J. Wysocki <rjw@...k.pl> wrote:
> >> >> > On your systems the processor driver is built-in.  Any chance to build it as
> >> >> > a module and see if that helps?
> >> >>
> >> >> if CONFIG_ACPI_PROCESSOR is not set in the config,
> >> >> the boot gets back to normal speed.
> >> >
> >> > Well, if it is not set at all, there won't be problems with it. :-)
> >> >
> >> > I've tested my linux-next branch on OpenSUSE 11.3 both with the processor
> >> > driver built in and modular and I'm not able to reproduce the issue you're
> >> > seeing.
> >> >
> >> > Moreover, I'm not sure if user space is involved here at all, because the
> >> > problem triggers for you when all of the relevant kernel code is non-modular.
> >> >
> >> > With the processor driver enabled, when the slowdown happens, are the systems
> >> > usable enough to get some debug info out of them?
> >>
> >> please check the bootchart data.
> >>
> >> looks like it takes 200s with no acpi_processor ...
> >> otherwise it will take 800s or more.
> >
> > Well, something's fishy for sure.
> >
> > To my eyes it looks like we're getting lots of notifications related to the
> > processor driver and that generates a lot of workqueue load.
> >
> > Can you please get /proc/interrupts from both cases and the output of
> > "find /sys/firmware/acpi/interrupts/ -print -exec cat {} \;"?

Thanks for the info!

> sca05-0a818ce5:~/g5_acpi_driver # find /sys/firmware/acpi/interrupts/ -print -exec cat {} \;
> /sys/firmware/acpi/interrupts/
> cat: /sys/firmware/acpi/interrupts/: Is a directory
> /sys/firmware/acpi/interrupts/sci
>        0
> /sys/firmware/acpi/interrupts/error
>        0
> /sys/firmware/acpi/interrupts/gpe00
>        0   invalid
> /sys/firmware/acpi/interrupts/gpe01
>        0   invalid
> /sys/firmware/acpi/interrupts/gpe02
>        0   invalid
> /sys/firmware/acpi/interrupts/gpe03
>        0   disabled
> /sys/firmware/acpi/interrupts/gpe04
>        0   disabled
> /sys/firmware/acpi/interrupts/gpe05
>        0   disabled
> /sys/firmware/acpi/interrupts/gpe06
>        0   invalid
> /sys/firmware/acpi/interrupts/gpe07
>        0   invalid
> /sys/firmware/acpi/interrupts/gpe08
>        0   invalid
> /sys/firmware/acpi/interrupts/gpe09
>        0   disabled
> /sys/firmware/acpi/interrupts/gpe10
>        0   invalid
> /sys/firmware/acpi/interrupts/gpe11
>        0   invalid
> /sys/firmware/acpi/interrupts/gpe12
>        0   invalid
> /sys/firmware/acpi/interrupts/gpe13
>        0   invalid
> /sys/firmware/acpi/interrupts/gpe14
>        0   invalid
> /sys/firmware/acpi/interrupts/gpe15
>        0   invalid
> /sys/firmware/acpi/interrupts/gpe16
>        0   invalid
> /sys/firmware/acpi/interrupts/gpe0A
>        0   enabled
> /sys/firmware/acpi/interrupts/gpe17
>        0   invalid
> /sys/firmware/acpi/interrupts/gpe0B
>        0   disabled
> /sys/firmware/acpi/interrupts/gpe18
>        0   invalid
> /sys/firmware/acpi/interrupts/gpe0C
>        0   disabled
> /sys/firmware/acpi/interrupts/gpe19
>        0   invalid
> /sys/firmware/acpi/interrupts/gpe0D
>        0   disabled
> /sys/firmware/acpi/interrupts/gpe0E
>        0   disabled
> /sys/firmware/acpi/interrupts/gpe20
>        0   disabled
> /sys/firmware/acpi/interrupts/gpe0F
>        0   invalid
> /sys/firmware/acpi/interrupts/gpe21
>        0   invalid
> /sys/firmware/acpi/interrupts/gpe22
>        0   invalid
> /sys/firmware/acpi/interrupts/gpe23
>        0   invalid
> /sys/firmware/acpi/interrupts/gpe24
>        0   invalid
> /sys/firmware/acpi/interrupts/gpe25
>        0   invalid
> /sys/firmware/acpi/interrupts/gpe26
>        0   invalid
> /sys/firmware/acpi/interrupts/gpe1A
>        0   invalid
> /sys/firmware/acpi/interrupts/gpe27
>        0   invalid
> /sys/firmware/acpi/interrupts/gpe1B
>        0   disabled
> /sys/firmware/acpi/interrupts/gpe28
>        0   invalid
> /sys/firmware/acpi/interrupts/gpe1C
>        0   invalid
> /sys/firmware/acpi/interrupts/gpe29
>        0   invalid
> /sys/firmware/acpi/interrupts/gpe1D
>        0   invalid
> /sys/firmware/acpi/interrupts/gpe1E
>        0   invalid
> /sys/firmware/acpi/interrupts/gpe30
>        0   invalid
> /sys/firmware/acpi/interrupts/gpe1F
>        0   invalid
> /sys/firmware/acpi/interrupts/gpe31
>        0   invalid
> /sys/firmware/acpi/interrupts/gpe32
>        0   invalid
> /sys/firmware/acpi/interrupts/gpe33
>        0   invalid
> /sys/firmware/acpi/interrupts/gpe34
>        0   invalid
> /sys/firmware/acpi/interrupts/gpe35
>        0   invalid
> /sys/firmware/acpi/interrupts/gpe36
>        0   invalid
> /sys/firmware/acpi/interrupts/gpe2A
>        0   invalid
> /sys/firmware/acpi/interrupts/gpe37
>        0   invalid
> /sys/firmware/acpi/interrupts/gpe2B
>        0   invalid
> /sys/firmware/acpi/interrupts/gpe38
>        0   invalid
> /sys/firmware/acpi/interrupts/gpe2C
>        0   invalid
> /sys/firmware/acpi/interrupts/gpe39
>        0   invalid
> /sys/firmware/acpi/interrupts/gpe2D
>        0   invalid
> /sys/firmware/acpi/interrupts/gpe2E
>        0   invalid
> /sys/firmware/acpi/interrupts/gpe2F
>        0   invalid
> /sys/firmware/acpi/interrupts/gpe3A
>        0   invalid
> /sys/firmware/acpi/interrupts/gpe3B
>        0   invalid
> /sys/firmware/acpi/interrupts/gpe3C
>        0   invalid
> /sys/firmware/acpi/interrupts/gpe3D
>        0   invalid
> /sys/firmware/acpi/interrupts/gpe3E
>        0   invalid
> /sys/firmware/acpi/interrupts/gpe3F
>        0   invalid
> /sys/firmware/acpi/interrupts/sci_not
>        0
> /sys/firmware/acpi/interrupts/ff_pmtimer
>        0   invalid
> /sys/firmware/acpi/interrupts/ff_rt_clk
>        0   disabled
> /sys/firmware/acpi/interrupts/gpe_all
>        0
> /sys/firmware/acpi/interrupts/ff_gbl_lock
>        0   enabled
> /sys/firmware/acpi/interrupts/ff_pwr_btn
>        0   enabled
> /sys/firmware/acpi/interrupts/ff_slp_btn
>        0   invalid

OK, no GPEs.  Interesting.
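
For reference, the same check can be scripted so that only counters that have
actually fired are reported.  This is a minimal sketch, not anything from the
kernel tree: the function names read_acpi_counters() and fired() are made up
for illustration, and it assumes each file holds a count optionally followed
by a status word, as in the output above.

```python
import os

ACPI_IRQ_DIR = "/sys/firmware/acpi/interrupts"


def read_acpi_counters(root=ACPI_IRQ_DIR):
    """Return {name: (count, status)} for each counter file under root.

    Assumes the layout seen in the quoted output: an integer count,
    optionally followed by status text such as "enabled" or "invalid".
    """
    counters = {}
    for name in sorted(os.listdir(root)):
        path = os.path.join(root, name)
        if not os.path.isfile(path):
            continue
        try:
            with open(path) as f:
                fields = f.read().split()
            count = int(fields[0])
        except (OSError, ValueError, IndexError):
            # Unreadable or unexpectedly formatted entry: skip it.
            continue
        counters[name] = (count, " ".join(fields[1:]))
    return counters


def fired(counters):
    """Keep only the counters whose count is greater than zero."""
    return {name: value for name, value in counters.items() if value[0] > 0}


if __name__ == "__main__":
    if os.path.isdir(ACPI_IRQ_DIR):
        hits = fired(read_acpi_counters())
        for name, (count, status) in sorted(hits.items()):
            print(f"{name:12s} {count:8d} {status}")
        if not hits:
            print("no SCI, GPE or fixed events counted")
```

On a system like the one above, where every counter reads 0, fired() returns
an empty dictionary.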

> > Also please send the output of "ls -l /sys/devices/system/cpu/cpu*" with the
> > processor driver present.
> 
> sca05-0a818ce5:~/g5_acpi_driver # ls -l /sys/devices/system/cpu/cpu*
> /sys/devices/system/cpu/cpu0:
> total 0
> drwxr-xr-x 6 root root    0 May 30 20:09 cache
> drwxr-xr-x 5 root root    0 May 30 20:09 cpuidle
> -r-------- 1 root root 4096 May 30 20:09 crash_notes
> -r-------- 1 root root 4096 May 30 20:09 crash_notes_size
> lrwxrwxrwx 1 root root    0 May 30 20:09 driver -> ../../../../bus/cpu/drivers/processor
> lrwxrwxrwx 1 root root    0 May 30 20:09 firmware_node -> ../../../LNXSYSTM:00/LNXCPU:00
> lrwxrwxrwx 1 root root    0 May 30 20:09 node0 -> ../../node/node0
> drwxr-xr-x 2 root root    0 May 30 20:09 power
> lrwxrwxrwx 1 root root    0 May 30 20:03 subsystem -> ../../../../bus/cpu
> drwxr-xr-x 2 root root    0 May 30 20:09 thermal_throttle
> drwxr-xr-x 2 root root    0 May 30 20:09 topology
> -rw-r--r-- 1 root root 4096 May 30 20:03 uevent
> 
> /sys/devices/system/cpu/cpu1:
> total 0
> drwxr-xr-x 6 root root    0 May 30 20:09 cache
> drwxr-xr-x 5 root root    0 May 30 20:09 cpuidle
> -r-------- 1 root root 4096 May 30 20:09 crash_notes
> -r-------- 1 root root 4096 May 30 20:09 crash_notes_size
> lrwxrwxrwx 1 root root    0 May 30 20:09 driver -> ../../../../bus/cpu/drivers/processor
> lrwxrwxrwx 1 root root    0 May 30 20:09 firmware_node -> ../../../LNXSYSTM:00/LNXCPU:01
> lrwxrwxrwx 1 root root    0 May 30 20:09 node0 -> ../../node/node0
> -rw-r--r-- 1 root root 4096 May 30 20:09 online
> drwxr-xr-x 2 root root    0 May 30 20:09 power
> lrwxrwxrwx 1 root root    0 May 30 20:03 subsystem -> ../../../../bus/cpu
> drwxr-xr-x 2 root root    0 May 30 20:09 thermal_throttle
> drwxr-xr-x 2 root root    0 May 30 20:09 topology
> -rw-r--r-- 1 root root 4096 May 30 20:03 uevent

[Skipping a number of analogous items.]

Well, it seems to initialize correctly at least.

> /sys/devices/system/cpu/cpuidle:
> total 0
> -r--r--r-- 1 root root 4096 May 30 20:10 current_driver
> -r--r--r-- 1 root root 4096 May 30 20:10 current_governor_ro

Well, this shows that my previous suspicion regarding notifications wasn't
justified, as apparently there are none.

Also the CPUs' directory structures in sysfs look correct to me.  The
driver binds to the devices it is supposed to bind to and acpi_bind_one()
works as expected.  Hmm.

Let's check whether thermal throttling is going on.  Please send the output of:
$ find /sys/devices/system/cpu/ -name core_throttle_count -print -exec cat {} \;
$ find /sys/devices/system/cpu/ -name package_throttle_count -print -exec cat {} \;

from the affected systems.
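
If it is easier to collect on many CPUs, the two find commands can also be
approximated by a small script.  This is only a sketch under assumptions: the
names throttle_counts() and throttled() are hypothetical, and each
*_throttle_count file is assumed to contain a single integer.

```python
import os

THROTTLE_FILES = ("core_throttle_count", "package_throttle_count")


def throttle_counts(root="/sys/devices/system/cpu"):
    """Return {relative_path: count} for every throttle counter under root.

    os.walk() does not follow symlinks by default, which matches plain
    "find" and avoids looping through the sysfs symlinks.
    """
    counts = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for fname in filenames:
            if fname not in THROTTLE_FILES:
                continue
            path = os.path.join(dirpath, fname)
            try:
                with open(path) as f:
                    counts[os.path.relpath(path, root)] = int(f.read().strip())
            except (OSError, ValueError):
                # Unreadable or non-numeric file: leave it out.
                continue
    return counts


def throttled(counts):
    """Keep only the counters that are greater than zero."""
    return {path: n for path, n in counts.items() if n > 0}


if __name__ == "__main__":
    counts = throttle_counts()
    for path, n in sorted(counts.items()):
        print(f"{path}: {n}")
    if not throttled(counts):
        print("no throttle events recorded")
```

Any nonzero entry would point at thermal throttling as a candidate for the
slowdown; all zeros would rule it out as far as these counters go.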

I'll try to dig deeper locally in the meantime.

Thanks,
Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
