Date:	Mon, 8 Aug 2016 21:25:25 -0700
From:	John Stultz <john.stultz@...aro.org>
To:	Jon Hunter <jonathanh@...dia.com>
Cc:	Marc Zyngier <marc.zyngier@....com>,
	Thomas Gleixner <tglx@...utronix.de>,
	lkml <linux-kernel@...r.kernel.org>,
	Bjorn Andersson <bjorn.andersson@...aro.org>
Subject: Re: [Regression] "irqdomain: Don't set type when mapping an IRQ"
 breaks nexus7 gpio buttons

On Mon, Aug 8, 2016 at 2:31 AM, Jon Hunter <jonathanh@...dia.com> wrote:
>
> On 06/08/16 00:45, John Stultz wrote:
>> On Mon, Aug 1, 2016 at 3:26 AM, Jon Hunter <jonathanh@...dia.com> wrote:
>>> Hi John,
>>>
>>> On 30/07/16 05:39, John Stultz wrote:
>>>> Hey Jon,
>>>>   So after rebasing my nexus7 patch stack onto pre-4.8-rc1 tree, I
>>>> noticed the power/volume buttons stopped working.
>>>>
>>>> I did a manual rebased bisection and chased it down to your commit
>>>> 1e2a7d78499e ("irqdomain: Don't set type when mapping an IRQ").
>>>>
>>>> Reverting that patch makes things work again, so I wanted to see if
>>>> there was any debugging info I could provide to try to help narrow
>>>> down the problem here. (Sorry, I'd tinker myself with it some and try
>>>> to debug the issue, but after burning my friday night on this, I'm
>>>> eager to get away from the keyboard for the weekend).
>>>
>>> Before this commit, bad IRQ type settings in device-tree were not getting
>>> reported and so failures to set the IRQ type were going unnoticed. It's
>>> most likely a bad IRQ type setting somewhere.
>>>
>>> As Thomas mentioned hopefully dmesg will shed a bit more light.
>>>
>>> Otherwise it can be worth looking at the ->irq_set_type() function for
>>> the irqchips in the path of the interrupt requested to see if any are
>>> failing. Looking at the nexus7 (assuming qcom variant), it looks like
>>> there are 3 irqchips in the path (pm8921 --> apq8064-pinctrl --> gic).
>>> The pm8xxx_irq_set_type() could return a failure when setting up the IRQ
>>> type and could be worth checking. It does not look like the set_type for
>>> the apq8064-pinctrl should ever fail (apart from calling BUG() which
>>> would be obvious). The gic can also return a failure for setting the
>>> type, but I did not see anything at first glance that looks incorrect in
>>> the dts.
>>>
>>> If we can narrow down irqchip, then hopefully it will be clearer.
>>
>> The pm8xxx_irq_set_type() doesn't seem to be failing as far as I can see.
>>
>> Looking at the patch that seems to cause the trouble, I narrowed it
>> down to just the following chunk:
>>
>> @@ -614,7 +615,11 @@ unsigned int irq_create_fwspec_mapping(struct irq_fwspec *fwspec)
>>                  * it now and return the interrupt number.
>>                  */
>>                 if (irq_get_trigger_type(virq) == IRQ_TYPE_NONE) {
>> -                       irq_set_irq_type(virq, type);
>> +                       irq_data = irq_get_irq_data(virq);
>> +                       if (!irq_data)
>> +                               return 0;
>> +
>> +                       irqd_set_trigger_type(irq_data, type);
>>                         return virq;
>>                 }
>>
>> If I revert just that, it works again.
>>
>> I was worried we were hitting an early failure from !irq_data, but it
>> seems there's some subtle difference between irqd_set_trigger_type() and
>> irq_set_irq_type() that makes the former break for me.
>
> Thanks this is good info and at the same time odd.
>
> I am guessing that it is failing above because the irq_data is not found
> for the irq?

So actually no. The !irq_data check never triggers; we do reach
irqd_set_trigger_type(), but something still doesn't work.

Interestingly, just adding irq_set_irq_type(virq, type); to the top of
that block (leaving the rest of the code) also works.
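
To illustrate what the difference between the two calls could be, here is a
user-space sketch, not kernel code. All mock_* names and both struct layouts
are made up for illustration. It assumes that irqd_set_trigger_type() only
records the type in the irq_data state, while irq_set_irq_type() also calls
the irqchip's ->irq_set_type() callback. Whether that is what breaks the
buttons here has not been confirmed.

```c
#include <assert.h>
#include <stddef.h>

/* Trigger type values, matching include/linux/irq.h. */
#define IRQ_TYPE_NONE        0x0
#define IRQ_TYPE_EDGE_RISING 0x1

/* Mock stand-ins for kernel structures; NOT the real definitions. */
struct mock_irq_data;

struct mock_irq_chip {
	int (*irq_set_type)(struct mock_irq_data *d, unsigned int type);
};

struct mock_irq_data {
	unsigned int cached_type; /* what irq_get_trigger_type() would report */
	unsigned int hw_type;     /* what the controller is programmed with */
	struct mock_irq_chip *chip;
};

/* Models irqd_set_trigger_type(): records the type, never calls the chip. */
static void mock_irqd_set_trigger_type(struct mock_irq_data *d,
				       unsigned int type)
{
	assert(d != NULL);
	d->cached_type = type;
}

/* Models irq_set_irq_type(): asks the chip to program the hardware first,
 * and records the type only if the chip callback succeeds. */
static int mock_irq_set_irq_type(struct mock_irq_data *d, unsigned int type)
{
	int ret = 0;

	assert(d != NULL);
	if (d->chip && d->chip->irq_set_type)
		ret = d->chip->irq_set_type(d, type);
	if (ret == 0)
		d->cached_type = type;
	return ret;
}

/* Hypothetical chip callback that "programs" the controller. */
static int mock_chip_set_type(struct mock_irq_data *d, unsigned int type)
{
	d->hw_type = type;
	return 0;
}
```

In this model, after mock_irqd_set_trigger_type() the cached type is set but
hw_type is still IRQ_TYPE_NONE, whereas mock_irq_set_irq_type() updates both.
If the real chips behave the same way, a later check of the cached type would
see the type as already set and never ask the chip to program it.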

> What is odd, is that the above sequence is only executed if a irq
> mapping exists and so really, AFAICT this should not happen. Ie. the irq
> descriptor should have been allocated for the mapping to exist. We
> should probably warn if this happens.
>
> Without reverting the above, can you add a print to show the
> domain->name, hwirq and virq information if !irq_data? That will confirm
> the domain for us.

So I put some printk info in (printed unconditionally, since I'm never
seeing the !irq_data case happen):

[    1.514217] JDB: virq: 93  hwirq: 74 domain name: msmgpio
[    1.838342] JDB: virq: 25  hwirq: 6 domain name: msmgpio

Which is odd, looking at:

shell@flo:/ $ cat /proc/interrupts
           CPU0       CPU1       CPU2       CPU3
 16:       1159       1138       1332       1574     GIC-0  18 Edge      gp_timer
 25:          0          0          0          0   msmgpio   6 Edge      ekth3500
111:          6          0          0          0     GIC-0  51 Edge      qcom_rpm_ack
112:          0          0          0          0     GIC-0  53 Edge      qcom_rpm_err
113:          0          0          0          0     GIC-0  54 Edge      qcom_rpm_wakeup
114:         48          0          0          0     GIC-0 132 Edge      msm_otg, ci_hdrc_msm
115:        796          0          0          0     GIC-0 130 Level     bam_dma
116:          0          0          0          0     GIC-0 128 Level     bam_dma
117:          0          0          0          0     GIC-0 127 Level     bam_dma
118:       2627          0          0          0     GIC-0 136 Level     mmci-pl18x (cmd)
119:         54          0          0          0     GIC-0 226 Level     i2c_qup
120:         21          0          0          0     GIC-0 183 Level     i2c_qup
122:          0          0          0          0     GIC-0 189 Level     i2c_qup
123:        202          0          0          0     GIC-0 190 Level     msm_serial0
124:          0          0          0          0     GIC-0  70 Edge      smsm
125:          0          0          0          0     GIC-0 121 Edge      smsm
126:          0          0          0          0     GIC-0 236 Edge      smsm
127:          0          0          0          0     GIC-0 169 Edge      smsm
131:          0          0          0          0    pm8xxx 195 Edge      Volume Up
165:          0          0          0          0    pm8xxx 229 Edge      Volume Down
184:          0          0          0          0    pm8xxx  39 Edge      pm8xxx_rtc_alarm
185:          0          0          0          0    pm8xxx  50 Edge      pmic8xxx_pwrkey_release
186:          0          0          0          0    pm8xxx  51 Edge      pmic8xxx_pwrkey_press
IPI0:          0          1          1          1  CPU wakeup interrupts
IPI1:          0          0          0          0  Timer broadcast interrupts
IPI2:        944        539       1015        529  Rescheduling interrupts
IPI3:          1          4          6          4  Function call interrupts
IPI4:          0          0          0          0  CPU stop interrupts
IPI5:          0          0          0          0  IRQ work interrupts
IPI6:          0          0          0          0  completion interrupts
Err:          0

That's odd because virq 25 maps to the ekth3500 (touch panel, which is
still working fine) and virq 93 / hwirq 74 doesn't seem to map to
anything, while the problematic irqs are the volume keys (pm8xxx hwirq
195/229) and power keys (pm8xxx hwirq 50/51).

thanks
-john
