lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20171211100251.mzbynloyykoksbeq@phenom.ffwll.local>
Date:   Mon, 11 Dec 2017 11:02:51 +0100
From:   Daniel Vetter <daniel@...ll.ch>
To:     Maarten Lankhorst <maarten.lankhorst@...ux.intel.com>
Cc:     Thomas Gleixner <tglx@...utronix.de>,
        Peter Zijlstra <peterz@...radead.org>,
        Intel Graphics Development <intel-gfx@...ts.freedesktop.org>,
        LKML <linux-kernel@...r.kernel.org>,
        Borislav Petkov <bp@...en8.de>,
        "H. Peter Anvin" <hpa@...or.com>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Ingo Molnar <mingo@...nel.org>
Subject: Re: [Intel-gfx] [GIT pull] x86 APIC updates for 4.15

On Wed, Dec 06, 2017 at 12:13:27PM +0100, Maarten Lankhorst wrote:
> Hey,
> 
> Op 30-11-17 om 23:47 schreef Thomas Gleixner:
> > On Thu, 30 Nov 2017, Maarten Lankhorst wrote:
> >> Op 30-11-17 om 10:18 schreef Thomas Gleixner:
> >> # cat /sys/kernel/debug/irq/irqs/28 
> >> handler:  handle_edge_irq
> >> device:   0000:00:02.0
> >> status:   0x00000000
> >> istate:   0x00000000
> >> ddepth:   0
> >> wdepth:   0
> >> dstate:   0x01401200
> >>             IRQD_ACTIVATED
> >>             IRQD_IRQ_STARTED
> >>             IRQD_SINGLE_TARGET
> >>             IRQD_AFFINITY_SET
> >> node:     0
> >> affinity: 4
> >> effectiv: 4
> >> pending:  
> >> domain:  PCI-MSI-2
> >>  hwirq:   0x8000
> >>  chip:    PCI-MSI
> >>   flags:   0x10
> >>              IRQCHIP_SKIP_SET_WAKE
> >>  parent:
> >>     domain:  VECTOR
> >>      hwirq:   0x1c
> >>      chip:    APIC
> >>       flags:   0x0
> > Ok. Fine. At that point the interrupt is assigned to CPU4
> >
> >> # cat /sys/kernel/debug/irq/irqs/28 
> >> handler:  handle_edge_irq
> >> device:   0000:00:02.0
> >> status:   0x00000400
> >>             _IRQ_NOPROBE
> >> istate:   0x00000000
> >> ddepth:   0
> >> wdepth:   0
> >> dstate:   0x01401300
> >>             IRQD_ACTIVATED
> >>             IRQD_IRQ_STARTED
> >>             IRQD_SINGLE_TARGET
> >>             IRQD_AFFINITY_SET
> >>             IRQD_SETAFFINITY_PENDING
> >> node:     0
> >> affinity: 0,5-7
> >> effectiv: 0
> >> pending:  4
> >> domain:  PCI-MSI-2
> >>  hwirq:   0x8000
> >>  chip:    PCI-MSI
> >>   flags:   0x10
> >>              IRQCHIP_SKIP_SET_WAKE
> >>  parent:
> >>     domain:  VECTOR
> >>      hwirq:   0x1c
> >>      chip:    APIC
> >>       flags:   0x0
> > Now after suspend/resume it's affine to CPU 0 and a request to move it back
> > to CPU4 is pending.
> >
> > That looks halfways sane, but:
> >
> >> affinity: 0,5-7
> >> effectiv: 0
> >> pending:  4
> > does not make sense because 4 is not in 0,5-7
> >
> > Can you please enable the tracepoints:
> >
> > /sys/kernel/debug/tracing/events/irq_vectors/vector*
> >
> > and
> >
> > /sys/kernel/debug/tracing/events/irq_matrix/*
> >
> > and collect the trace from
> >
> > /sys/kernel/debug/trace
> >
> > right after resume or maybe after the timeout hit.
> >
> > That might not give us all required info, but at least it will allow me to
> > come up with the necessary extras required. Right now I'm tapping in the
> > dark
> 
> The problem happens during hibernate, but seems might happen during s/r too, just not necessarily dying..
> Maybe a irq at the right moment is causing the unexpected "No irq handler for vector" ?
> 
> echo core > /sys/power/pm_test
> echo mem > /sys/power/state
> 
> cat /sys/kernel/debug/irq/irqs/28:
> affinity: 0,5-7
> effectiv: 0
> pending:  4
> 
> Same issue, just didn't cause the death yet.. interrupts still working as expected..
> 
> cat /sys/kernel/debug
>           <idle>-0     [000] d.h3   862.035000: irq_matrix_alloc: bit=33 cpu=1 online=1 avl=202 alloc=1 managed=0 online_maps=8 global_avl=1611, global_rsvd=15, total_alloc=13
>           <idle>-0     [000] d.h3   862.035003: vector_update: irq=30 vector=33 cpu=1 prev_vector=43 prev_cpu=0
>           <idle>-0     [000] d.h3   862.035004: vector_alloc: irq=30 vector=33 reserved=1 ret=0
>           <idle>-0     [000] d.h3   862.035005: vector_config: irq=30 vector=33 cpu=1 apicdest=0x00000002
>           <idle>-0     [000] d.h2   862.035065: vector_free_moved: irq=30 cpu=0 vector=43 is_managed=0
>           <idle>-0     [000] d.h2   862.035066: irq_matrix_free: bit=43 cpu=0 online=1 avl=196 alloc=7 managed=0 online_maps=8 global_avl=1612, global_rsvd=15, total_alloc=12
>    kworker/u16:2-149   [000] d..1   862.043955: vector_deactivate: irq=27 is_managed=0 can_reserve=1 early=0
>    kworker/u16:2-149   [000] d..2   862.043962: vector_clear: irq=27 vector=40 cpu=0 prev_vector=0 prev_cpu=0
>    kworker/u16:2-149   [000] d..2   862.043963: irq_matrix_free: bit=40 cpu=0 online=1 avl=197 alloc=6 managed=0 online_maps=8 global_avl=1613, global_rsvd=15, total_alloc=11
>    kworker/u16:2-149   [000] d..2   862.043964: irq_matrix_reserve: online_maps=8 global_avl=1613, global_rsvd=16, total_alloc=11
>    kworker/u16:2-149   [000] d..2   862.043964: vector_reserve: irq=27 ret=0
>    kworker/u16:2-149   [000] d..2   862.043965: vector_config: irq=27 vector=239 cpu=0 apicdest=0x00000001
>    kworker/u16:2-149   [000] d..1   862.044428: vector_teardown: irq=27 is_managed=0 has_reserved=1
>    kworker/u16:2-149   [000] d..1   862.044429: irq_matrix_remove_reserved: online_maps=8 global_avl=1613, global_rsvd=15, total_alloc=11
>    kworker/u16:5-1922  [001] d..1   862.049683: vector_deactivate: irq=30 is_managed=0 can_reserve=1 early=0
>    kworker/u16:5-1922  [001] d..2   862.049684: vector_clear: irq=30 vector=33 cpu=1 prev_vector=0 prev_cpu=0
>    kworker/u16:5-1922  [001] d..2   862.049684: irq_matrix_free: bit=33 cpu=1 online=1 avl=203 alloc=0 managed=0 online_maps=8 global_avl=1614, global_rsvd=15, total_alloc=10
>    kworker/u16:5-1922  [001] d..2   862.049685: irq_matrix_reserve: online_maps=8 global_avl=1614, global_rsvd=16, total_alloc=10
>    kworker/u16:5-1922  [001] d..2   862.049685: vector_reserve: irq=30 ret=0
>    kworker/u16:5-1922  [001] d..2   862.049686: vector_config: irq=30 vector=239 cpu=0 apicdest=0x00000001
>    kworker/u16:5-1922  [001] d..1   862.049711: vector_teardown: irq=30 is_managed=0 has_reserved=1
>    kworker/u16:5-1922  [001] d..1   862.049711: irq_matrix_remove_reserved: online_maps=8 global_avl=1614, global_rsvd=15, total_alloc=10
>           <idle>-0     [000] d.h3   862.094308: irq_matrix_alloc: bit=33 cpu=1 online=1 avl=202 alloc=1 managed=0 online_maps=8 global_avl=1613, global_rsvd=15, total_alloc=11
>           <idle>-0     [000] d.h3   862.094310: vector_update: irq=16 vector=33 cpu=1 prev_vector=37 prev_cpu=0
>           <idle>-0     [000] d.h3   862.094310: vector_alloc: irq=16 vector=33 reserved=1 ret=0
>           <idle>-0     [000] d.h3   862.094311: vector_config: irq=16 vector=33 cpu=1 apicdest=0x00000002
>           <idle>-0     [000] d.h2   862.094647: vector_free_moved: irq=16 cpu=0 vector=37 is_managed=0
>           <idle>-0     [000] d.h2   862.094648: irq_matrix_free: bit=37 cpu=0 online=1 avl=198 alloc=5 managed=0 online_maps=8 global_avl=1614, global_rsvd=15, total_alloc=10
>           <idle>-0     [001] d..2   867.272092: irq_matrix_online: online_maps=2 global_avl=396, global_rsvd=15, total_alloc=17
>           <idle>-0     [002] d..2   867.277213: irq_matrix_online: online_maps=3 global_avl=599, global_rsvd=15, total_alloc=17
>           <idle>-0     [003] d..2   867.281533: irq_matrix_online: online_maps=4 global_avl=802, global_rsvd=15, total_alloc=17
>           <idle>-0     [004] d..2   867.286371: irq_matrix_online: online_maps=5 global_avl=1005, global_rsvd=15, total_alloc=17
>           <idle>-0     [005] d..2   867.291078: irq_matrix_online: online_maps=6 global_avl=1208, global_rsvd=15, total_alloc=17
>           <idle>-0     [006] d..2   867.295910: irq_matrix_online: online_maps=7 global_avl=1411, global_rsvd=15, total_alloc=17
>           <idle>-0     [007] d..2   867.301120: irq_matrix_online: online_maps=8 global_avl=1614, global_rsvd=15, total_alloc=17
>   kworker/u16:24-1941  [006] d..1   867.335259: irq_matrix_reserve: online_maps=8 global_avl=1614, global_rsvd=16, total_alloc=17
>   kworker/u16:24-1941  [006] d..1   867.335262: vector_reserve: irq=27 ret=0
>   kworker/u16:24-1941  [006] d..1   867.335263: vector_config: irq=27 vector=239 cpu=0 apicdest=0x00000001
>   kworker/u16:24-1941  [006] ....   867.335265: vector_setup: irq=27 is_legacy=0 ret=0
>   kworker/u16:24-1941  [006] ....   867.335310: vector_activate: irq=27 is_managed=0 can_reserve=1 early=1
>   kworker/u16:24-1941  [006] d..1   867.335330: vector_config: irq=27 vector=239 cpu=0 apicdest=0x00000001
>   kworker/u16:76-1994  [004] d..1   867.335372: irq_matrix_reserve: online_maps=8 global_avl=1614, global_rsvd=17, total_alloc=17
>   kworker/u16:76-1994  [004] d..1   867.335373: vector_reserve: irq=30 ret=0
>   kworker/u16:76-1994  [004] d..1   867.335374: vector_config: irq=30 vector=239 cpu=0 apicdest=0x00000001
>   kworker/u16:76-1994  [004] ....   867.335375: vector_setup: irq=30 is_legacy=0 ret=0
>   kworker/u16:76-1994  [004] ....   867.335398: vector_activate: irq=30 is_managed=0 can_reserve=1 early=1
>   kworker/u16:76-1994  [004] d..1   867.335399: vector_config: irq=30 vector=239 cpu=0 apicdest=0x00000001
>   kworker/u16:76-1994  [004] d..1   867.335499: vector_activate: irq=30 is_managed=0 can_reserve=1 early=0
>   kworker/u16:76-1994  [004] d..2   867.335501: irq_matrix_alloc: bit=42 cpu=0 online=1 avl=192 alloc=11 managed=0 online_maps=8 global_avl=1613, global_rsvd=16, total_alloc=18
>   kworker/u16:76-1994  [004] d..2   867.335502: vector_update: irq=30 vector=42 cpu=0 prev_vector=0 prev_cpu=0
>   kworker/u16:76-1994  [004] d..2   867.335502: vector_alloc: irq=30 vector=42 reserved=1 ret=0
>   kworker/u16:76-1994  [004] d..2   867.335503: vector_config: irq=30 vector=42 cpu=0 apicdest=0x00000001
>   kworker/u16:76-1994  [004] d..3   867.335670: vector_config: irq=30 vector=42 cpu=0 apicdest=0x00000001
>   kworker/u16:24-1941  [006] d..1   867.335845: vector_activate: irq=27 is_managed=0 can_reserve=1 early=0
>   kworker/u16:24-1941  [006] d..2   867.335846: irq_matrix_alloc: bit=43 cpu=0 online=1 avl=191 alloc=12 managed=0 online_maps=8 global_avl=1612, global_rsvd=15, total_alloc=19
>   kworker/u16:24-1941  [006] d..2   867.335847: vector_update: irq=27 vector=43 cpu=0 prev_vector=0 prev_cpu=0
>   kworker/u16:24-1941  [006] d..2   867.335847: vector_alloc: irq=27 vector=43 reserved=1 ret=0
>   kworker/u16:24-1941  [006] d..2   867.335848: vector_config: irq=27 vector=43 cpu=0 apicdest=0x00000001
>   kworker/u16:24-1941  [006] d..3   867.335861: vector_config: irq=27 vector=43 cpu=0 apicdest=0x00000001
> 
> Hopefully this is enough?

Anything else we can do to move this? I just had to resolve a small
conflict when moving forward to -rc3. Carrying a revert for the entire
apic pull (too many deps to just revert the bisected patch) is a bit
annoying.

Thanks, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ