lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20161020100801.GV1722@lahna.fi.intel.com>
Date:   Thu, 20 Oct 2016 13:08:01 +0300
From:   Mika Westerberg <mika.westerberg@...ux.intel.com>
To:     Christian Steiner <christian.steiner@...look.de>
Cc:     "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        Linus Walleij <linus.walleij@...aro.org>
Subject: Re: pinctrl_cherryview: massive error messages (handle_bad_irq)

+LinusW

On Wed, Oct 19, 2016 at 08:40:25PM +0000, Christian Steiner wrote:
> On 19.10.2016 21:22, Mika Westerberg wrote:
> > On Wed, Oct 19, 2016 at 06:32:57PM +0000, Christian Steiner wrote:
> >> Hello,
> >>
> >> since Kernel 4.8 I see a lot of the following error messages after
> >> resuming from suspend to RAM:
> >>
> >>> irq 117, desc: ffff88017a61e600, depth: 1, count: 0, unhandled: 0
> >>> ->handle_irq():  ffffffff8109b613, handle_bad_irq+0x0/0x1e0
> >>> ->irq_data.chip(): ffffffffa0020180, chv_pinctrl_exit+0x2d84/0x12 [pinctrl_cherryview]
> >>> ->action():           (null)
> >>>    IRQ_NOPROBE set
> >>
> >> The messages appear a few thousand times and mess up the log. Despite
> >> the messages everything seems to work fine.
> >> I see the messages with Kernel 4.8.2 and 4.9-rc1. The do not appear with
> >> Kernel 4.7.8 or earlier.
> > 
> > We changed the driver not to mask all IRQs when it starts. See commit
> > bcb48cca23ec ("pinctrl: cherryview: Do not mask all interrupts in
> > probe").
> > 
> >> I have an Acer TravelMate B116-M laptop with an Intel Pentium N3700
> >> processor.
> >>
> >> Here are the full log messages of a suspend to RAM and resume cycle with
> >> Kernel 4.7.8:
> >>
> >>> [   22.605418] PM: Syncing filesystems ... done.
> >>> [   22.613640] PM: Preparing system for sleep (mem)
> >>> [   22.615108] Freezing user space processes ... (elapsed 0.001 seconds) done.
> >>> [   22.616930] Double checking all user space processes after OOM killer disable... (elapsed 0.000 seconds)
> >>> [   22.616956] Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
> >>> [   22.618611] PM: Suspending system (mem)
> >>> [   22.618640] Suspending console(s) (use no_console_suspend to debug)
> >>> [   22.619773] sd 0:0:0:0: [sda] Synchronizing SCSI cache
> >>> [   22.626173] sd 0:0:0:0: [sda] Stopping disk
> >>> [   23.670076] PM: suspend of devices complete after 1051.008 msecs
> >>> [   23.691058] PM: late suspend of devices complete after 20.971 msecs
> >>> [   23.692104] r8169 0000:03:00.0: System wakeup enabled by ACPI
> >>> [   23.693087] xhci_hcd 0000:00:14.0: System wakeup enabled by ACPI
> >>> [   23.711687] PM: noirq suspend of devices complete after 20.624 msecs
> >>> [   23.711726] ACPI: Preparing to enter system sleep state S3
> >>> [   23.712295] ACPI : EC: EC stopped
> >>> [   23.712297] PM: Saving platform NVS memory
> >>> [   23.712306] Disabling non-boot CPUs ...
> >>> [   23.712590] Cannot set affinity for irq 191
> >>> [   23.713647] smpboot: CPU 1 is now offline
> >>> [   23.752643] smpboot: CPU 2 is now offline
> >>> [   23.802930] smpboot: CPU 3 is now offline
> >>> [   23.831827] ACPI: Low-level resume complete
> >>> [   23.831938] ACPI : EC: EC started
> >>> [   23.831939] PM: Restoring platform NVS memory
> >>> [   23.832600] Enabling non-boot CPUs ...
> >>> [   23.882819] x86: Booting SMP configuration:
> >>> [   23.882821] smpboot: Booting Node 0 Processor 1 APIC 0x2
> >>> [   23.885308]  cache: parent cpu1 should not be sleeping
> >>> [   23.885757] CPU1 is up
> >>> [   23.932612] smpboot: Booting Node 0 Processor 2 APIC 0x4
> >>> [   23.935171]  cache: parent cpu2 should not be sleeping
> >>> [   23.935674] CPU2 is up
> >>> [   23.982749] smpboot: Booting Node 0 Processor 3 APIC 0x6
> >>> [   23.985324]  cache: parent cpu3 should not be sleeping
> >>> [   23.985840] CPU3 is up
> >>> [   23.986370] ACPI: Waking up from system sleep state S3
> >>> [   24.012307] xhci_hcd 0000:00:14.0: System wakeup disabled by ACPI
> >>> [   24.023417] PM: noirq resume of devices complete after 20.911 msecs
> >>> [   25.052523] PM: early resume of devices complete after 1029.024 msecs
> >>> [   25.053082] r8169 0000:03:00.0: System wakeup disabled by ACPI
> >>> [   25.053124] iwlwifi 0000:02:00.0: RF_KILL bit toggled to disable radio.
> >>> [   25.054222] rtc_cmos 00:04: System wakeup disabled by ACPI
> >>> [   25.055645] sd 0:0:0:0: [sda] Starting disk
> >>> [   25.075031] r8169 0000:03:00.0 eth0: link down
> >>> [   25.374179] ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
> >>> [   25.377616] ata1.00: configured for UDMA/133
> >>> [   26.025736] PM: resume of devices complete after 973.238 msecs
> >>> [   26.026109] PM: Finishing wakeup.
> >>> [   26.026111] Restarting tasks ... done.
> >>> [   27.767611] r8169 0000:03:00.0 eth0: link up
> >>
> >> And this is how it looks with Kernel 4.8.2:
> >>
> >>> [   46.928231] PM: Syncing filesystems ... done.
> >>> [   46.930391] PM: Preparing system for sleep (mem)
> >>> [   46.932002] Freezing user space processes ... (elapsed 0.001 seconds) done.
> >>> [   46.933721] Double checking all user space processes after OOM killer disable... (elapsed 0.000 seconds)
> >>> [   46.933749] Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
> >>> [   46.935457] PM: Suspending system (mem)
> >>> [   46.935486] Suspending console(s) (use no_console_suspend to debug)
> >>> [   46.936492] sd 0:0:0:0: [sda] Synchronizing SCSI cache
> >>> [   46.936598] sd 0:0:0:0: [sda] Stopping disk
> >>> [   48.026353] PM: suspend of devices complete after 1090.448 msecs
> >>> [   48.057844] PM: late suspend of devices complete after 31.480 msecs
> >>> [   48.059086] r8169 0000:03:00.0: System wakeup enabled by ACPI
> >>> [   48.059240] xhci_hcd 0000:00:14.0: System wakeup enabled by ACPI
> >>> [   48.117234] PM: noirq suspend of devices complete after 59.386 msecs
> >>> [   48.117269] ACPI: Preparing to enter system sleep state S3
> >>> [   48.117836] ACPI : EC: EC stopped
> >>> [   48.117837] PM: Saving platform NVS memory
> >>> [   48.117846] Disabling non-boot CPUs ...
> >>> [   48.118130] Cannot set affinity for irq 191
> >>> [   48.119186] smpboot: CPU 1 is now offline
> >>> [   48.179126] smpboot: CPU 2 is now offline
> >>> [   48.229450] smpboot: CPU 3 is now offline
> >>> [   48.278338] ACPI: Low-level resume complete
> >>> [   48.278415] ACPI : EC: EC started
> >>> [   48.278416] PM: Restoring platform NVS memory
> >>> [   48.279036] irq 117, desc: ffff88017a61e600, depth: 1, count: 0, unhandled: 0
> >>> [   48.279044] ->handle_irq():  ffffffff8109b613, handle_bad_irq+0x0/0x1e0
> >>> [   48.279052] ->irq_data.chip(): ffffffffa0020180, chv_pinctrl_exit+0x2d84/0x12 [pinctrl_cherryview]
> >>> [   48.279054] ->action():           (null)
> >>> [   48.279055]    IRQ_NOPROBE set
> >>> [   48.279057] unexpected IRQ trap at vector 75
> >>>
> >> [...]
> >>>
> >>> [   48.659033] irq 117, desc: ffff88017a61e600, depth: 1, count: 0, unhandled: 0
> >>> [   48.659037] ->handle_irq():  ffffffff8109b613, handle_bad_irq+0x0/0x1e0
> >>> [   48.659041] ->irq_data.chip(): ffffffffa0020180, chv_pinctrl_exit+0x2d84/0x12 [pinctrl_cherryview]
> >>> [   48.659042] ->action():           (null)
> >>> [   48.659043]    IRQ_NOPROBE set
> >>> [   49.698698] PM: early resume of devices complete after 1089.068 msecs
> >>> [   49.699601] r8169 0000:03:00.0: System wakeup disabled by ACPI
> >>> [   49.699633] iwlwifi 0000:02:00.0: RF_KILL bit toggled to disable radio.
> >>> [   49.708346] sd 0:0:0:0: [sda] Starting disk
> >>> [   49.708581] rtc_cmos 00:04: System wakeup disabled by ACPI
> >>> [   49.731492] r8169 0000:03:00.0 eth0: link down
> >>> [   50.039718] ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
> >>> [   50.041544] ata1.00: configured for UDMA/133
> >>> [   50.381833] PM: resume of devices complete after 683.151 msecs
> >>> [   50.385539] PM: Finishing wakeup.
> >>> [   50.385541] Restarting tasks ... done.
> >>
> >> Are the messages just informational or is this something serious that
> >> should be fixed?
> > 
> > Basically it means the machine has an interrupt line active an nothing
> > is handling it. It is not serious but rather annoying.
> 
> Thank you. Good to hear it is not critical. I would ignore the messages,
> but it is annoying that almost 4MB of log messages are generated on
> every resume after standby. Maybe it is possible to define a limit for
> the maximum amount of the same messages?

The kernel will disable that line after 100000 spuriuos interrupts but I
guess it is still rather annoying for the user.

> > I've seen this happening on proto hardware but it was fixed with
> > firmware update.
> 
> Currently I am using BIOS v1.22. Recently Acer released v1.23. I could
> test if the BIOS update solves the problem.

Please try that first.

If that does not work, then I we may be able to mask all interrupts up
to nirqs - 1 but I first would like to check if the BIOS upgrade helps.

> >> If you need more information or if I should test a patch, please let me
> >> know.
> > 
> > Can you send me output of /proc/interrupts and
> > /sys/kernel/debug/pinctrl/INT33FF:*/pins.
> 
> Of course:
> 
> > $ cat /proc/interrupts
> >             CPU0       CPU1       CPU2       CPU3
> >    0:       1388          0          0          0   IO-APIC    2-edge      timer
> >    1:         17          3          4          3   IO-APIC    1-edge      i8042
> >    8:          1          0          0          0   IO-APIC    8-edge      rtc0
> >    9:         28         23         22         23   IO-APIC    9-fasteoi   acpi
> >   12:         76         36         36         36   IO-APIC   12-edge      i8042
> >   18:          0          0          0          0   IO-APIC   18-fasteoi   i801_smbus
> >   45:         13         17          9         13   IO-APIC   45-fasteoi   mmc0
> >   47:          0          0          0          0   IO-APIC   47-fasteoi   mmc1
> >  115:       7816       7632       7694       7549   PCI-MSI 311296-edge      ahci[0000:00:13.0]
> >  116:      43793      44661      43670      44402   PCI-MSI 1572864-edge      eth0
> >  117:      10759          0          0          0  chv-gpio    0
> >  191:          0          0          0          0  chv-gpio   50  80860F14:01 cd
> >  196:         35         37         40         41   PCI-MSI 1048576-edge      iwlwifi
> >  197:      24359      22517      23054      22852   PCI-MSI 327680-edge      xhci_hcd
> >  198:     174515     168035     165519     167706   PCI-MSI 32768-edge      i915
> >  199:         20         18         18         16   PCI-MSI 442368-edge      snd_hda_intel:card0
> >  NMI:          0          0          0          0   Non-maskable interrupts
> >  LOC:     409748     405619     425239     440549   Local timer interrupts
> >  SPU:          0          0          0          0   Spurious interrupts
> >  PMI:          0          0          0          0   Performance monitoring interrupts
> >  IWI:          0          0          0          0   IRQ work interrupts
> >  RTR:          0          0          0          0   APIC ICR read retries
> >  RES:      74598      69435      68734      60007   Rescheduling interrupts
> >  CAL:      23842      28885      29873      26859   Function call interrupts
> >  TLB:      17407      23763      25060      21008   TLB shootdowns
> >  TRM:          0          0          0          0   Thermal event interrupts
> >  THR:          0          0          0          0   Threshold APIC interrupts
> >  MCE:          0          0          0          0   Machine check exceptions
> >  MCP:         45         44         44         44   Machine check polls
> >  ERR:          0
> >  MIS:          0
> >  PIN:          0          0          0          0   Posted-interrupt notification event
> >  PIW:          0          0          0          0   Posted-interrupt wakeup event
> 
> > $ cat /sys/kernel/debug/pinctrl/INT33FF:*/pin
> > registered pins: 24
> > pin 0 (PMU_SLP_S3_B) mode 1 ctrl0 0x00010001 ctrl1 0x04c00000
> > pin 1 (PMU_BATLOW_B) mode 1 ctrl0 0x00010001 ctrl1 0x04c00000
> > pin 2 (SUS_STAT_B) mode 1 ctrl0 0x00010001 ctrl1 0x04c00000
> > pin 3 (PMU_SLP_S0IX_B) mode 1 ctrl0 0x00010001 ctrl1 0x04c00000
> > pin 4 (PMU_AC_PRESENT) mode 1 ctrl0 0x00010001 ctrl1 0x04c00000
> > pin 5 (PMU_PLTRST_B) mode 1 ctrl0 0x00010001 ctrl1 0x04c00000
> > pin 6 (PMU_SUSCLK) mode 1 ctrl0 0x00010000 ctrl1 0x04c00000
> > pin 7 (PMU_SLP_LAN_B) mode 1 ctrl0 0x00010000 ctrl1 0x04c00000
> > pin 8 (PMU_PWRBTN_B) mode 1 ctrl0 0x00010001 ctrl1 0x04c00000
> > pin 9 (PMU_SLP_S4_B) mode 1 ctrl0 0x00010001 ctrl1 0x04c00000
> > pin 10 (PMU_WAKE_B) mode 1 ctrl0 0x00c10001 ctrl1 0x04c00000
> > pin 11 (PMU_WAKE_LAN_B) GPIO ctrl0 0x00018300 ctrl1 0x04c00000
> > pin 15 (MF_ISH_GPIO_3) mode 1 ctrl0 0x00010000 ctrl1 0x04c00000
> > pin 16 (MF_ISH_GPIO_7) GPIO ctrl0 0x00018102 ctrl1 0x04c00000
> > pin 17 (MF_ISH_I2C1_SCL) GPIO ctrl0 0x00018102 ctrl1 0x04c00000
> > pin 18 (MF_ISH_GPIO_1) mode 1 ctrl0 0x00010000 ctrl1 0x04c00000
> > pin 19 (MF_ISH_GPIO_5) GPIO ctrl0 0x10008201 ctrl1 0x04c00004

It most probably is this one. Active high and the line is high so this
is generating interrupts all the time if not masked.

> > pin 20 (MF_ISH_GPIO_9) GPIO ctrl0 0x00018102 ctrl1 0x04c00000
> > pin 21 (MF_ISH_GPIO_0) mode 1 ctrl0 0x00010000 ctrl1 0x04c00000
> > pin 22 (MF_ISH_GPIO_4) GPIO ctrl0 0x00018102 ctrl1 0x04c00000
> > pin 23 (MF_ISH_GPIO_8) GPIO ctrl0 0x00018102 ctrl1 0x04c00000
> > pin 24 (MF_ISH_GPIO_2) mode 1 ctrl0 0x00010000 ctrl1 0x04c00000
> > pin 25 (MF_ISH_GPIO_6) GPIO ctrl0 0x00008201 ctrl1 0x04c00002
> > pin 26 (MF_ISH_I2C1_SDA) GPIO ctrl0 0x00018102 ctrl1 0x04c00000
> > registered pins: 55
> > pin 0 (MF_PLT_CLK0) GPIO ctrl0 0x00008200 ctrl1 0x04c00000
> > pin 1 (PWM1) GPIO ctrl0 0x00018102 ctrl1 0x04c00000
> > pin 2 (MF_PLT_CLK1) GPIO ctrl0 0x00008200 ctrl1 0x04c00000
> > pin 3 (MF_PLT_CLK4) GPIO ctrl0 0x00008200 ctrl1 0x04c00000
> > pin 4 (MF_PLT_CLK3) GPIO ctrl0 0x00008200 ctrl1 0x04c00000
> > pin 5 (PWM0) GPIO ctrl0 0x00018102 ctrl1 0x04c00000
> > pin 6 (MF_PLT_CLK5) GPIO ctrl0 0x00008200 ctrl1 0x04c00000
> > pin 7 (MF_PLT_CLK2) GPIO ctrl0 0x00008201 ctrl1 0x04c00000
> > pin 15 (SDMMC2_D3_CD_B) mode 1 ctrl0 0x00010081 ctrl1 0x04c00000
> > pin 16 (SDMMC1_CLK) mode 1 ctrl0 0x00110080 ctrl1 0x04c00000
> > pin 17 (SDMMC1_D0) mode 1 ctrl0 0x00910081 ctrl1 0x04c00000
> > pin 18 (SDMMC2_D1) mode 1 ctrl0 0x00010081 ctrl1 0x04c00000
> > pin 19 (SDMMC2_CLK) mode 1 ctrl0 0x00010080 ctrl1 0x04c00000
> > pin 20 (SDMMC1_D2) mode 1 ctrl0 0x00910081 ctrl1 0x04c00000
> > pin 21 (SDMMC2_D2) mode 1 ctrl0 0x00010081 ctrl1 0x04c00000
> > pin 22 (SDMMC2_CMD) mode 1 ctrl0 0x00010081 ctrl1 0x04c00000
> > pin 23 (SDMMC1_CMD) mode 1 ctrl0 0x00910081 ctrl1 0x04c00000
> > pin 24 (SDMMC1_D1) mode 1 ctrl0 0x00910081 ctrl1 0x04c00000
> > pin 25 (SDMMC2_D0) mode 1 ctrl0 0x00010081 ctrl1 0x04c00000
> > pin 26 (SDMMC1_D3_CD_B) mode 1 ctrl0 0x00910081 ctrl1 0x04c00000
> > pin 30 (SDMMC3_D1) mode 1 ctrl0 0x00910080 ctrl1 0x04c00000
> > pin 31 (SDMMC3_CLK) mode 1 ctrl0 0x00010080 ctrl1 0x04c00000
> > pin 32 (SDMMC3_D3) mode 1 ctrl0 0x00910080 ctrl1 0x04c00000
> > pin 33 (SDMMC3_D2) mode 1 ctrl0 0x00910080 ctrl1 0x04c00000
> > pin 34 (SDMMC3_CMD) mode 1 ctrl0 0x00910080 ctrl1 0x04c00000
> > pin 35 (SDMMC3_D0) mode 1 ctrl0 0x00910080 ctrl1 0x04c00000
> > pin 45 (MF_LPC_AD2) mode 1 ctrl0 0x00c10001 ctrl1 0x04c00000
> > pin 46 (LPC_CLKRUNB) mode 1 ctrl0 0x00010001 ctrl1 0x04c00000
> > pin 47 (MF_LPC_AD0) mode 1 ctrl0 0x00c10001 ctrl1 0x04c00000
> > pin 48 (LPC_FRAMEB) mode 1 ctrl0 0x00010001 ctrl1 0x04c00000
> > pin 49 (MF_LPC_CLKOUT1) mode 1 ctrl0 0x00010001 ctrl1 0x04c00000
> > pin 50 (MF_LPC_AD3) mode 1 ctrl0 0x00c10001 ctrl1 0x04c00000
> > pin 51 (MF_LPC_CLKOUT0) mode 1 ctrl0 0x00010000 ctrl1 0x04c00000
> > pin 52 (MF_LPC_AD1) mode 1 ctrl0 0x00c10001 ctrl1 0x04c00000
> > pin 60 (SPI1_MISO) mode 1 ctrl0 0x00910301 ctrl1 0x05c00000
> > pin 61 (SPI1_CSO_B) GPIO ctrl0 0x00018100 ctrl1 0x04c00000
> > pin 62 (SPI1_CLK) GPIO ctrl0 0x00018100 ctrl1 0x04c00000
> > pin 63 (MMC1_D6) mode 1 ctrl0 0x00910081 ctrl1 0x04c00000
> > pin 64 (SPI1_MOSI) GPIO ctrl0 0x00018102 ctrl1 0x04c00000
> > pin 65 (MMC1_D5) mode 1 ctrl0 0x00910081 ctrl1 0x04c00000
> > pin 66 (SPI1_CS1_B) mode 1 ctrl0 0x00010001 ctrl1 0x04c00000
> > pin 67 (MMC1_D4_SD_WE) mode 1 ctrl0 0x00910081 ctrl1 0x04c00000
> > pin 68 (MMC1_D7) mode 1 ctrl0 0x00910081 ctrl1 0x04c00000
> > pin 69 (MMC1_RCLK) mode 1 ctrl0 0x00010081 ctrl1 0x04c00000
> > pin 75 (USB_OC1_B) mode 1 ctrl0 0x00010001 ctrl1 0x04c00000
> > pin 76 (PMU_RESETBUTTON_B) mode 1 ctrl0 0x00010001 ctrl1 0x04c00000
> > pin 77 (GPIO_ALERT) GPIO ctrl0 0x00018102 ctrl1 0x04c00000
> > pin 78 (SDMMC3_PWR_EN_B) mode 1 ctrl0 0x00110001 ctrl1 0x04c00000
> > pin 79 (ILB_SERIRQ) mode 1 ctrl0 0x00010001 ctrl1 0x04c00000
> > pin 80 (USB_OC0_B) mode 1 ctrl0 0x00010001 ctrl1 0x04c00000
> > pin 81 (SDMMC3_CD_B) GPIO ctrl0 0x00918201 ctrl1 0x04c00003
> > pin 82 (SPKR) mode 1 ctrl0 0x00910300 ctrl1 0x05c00000
> > pin 83 (SUSPWRDNACK) mode 1 ctrl0 0x00010000 ctrl1 0x04c00000
> > pin 84 (SPARE_PIN) mode 1 ctrl0 0x00110300 ctrl1 0x05c00000
> > pin 85 (SDMMC3_1P8_EN) mode 1 ctrl0 0x00010000 ctrl1 0x04c00000
> 
> Thanks,
> Christian
> 
> PS: Sorry for double-posting on the previous post. It was a mistake.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ