[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <496565D9.6040102@shaw.ca>
Date: Wed, 07 Jan 2009 20:32:57 -0600
From: Robert Hancock <hancockr@...w.ca>
To: linux-kernel@...r.kernel.org
Cc: thomas.dahlmann@....com, linux-kernel@...r.kernel.org
Subject: Re: amd5536udc interrupts bug
Vadim Lobanov wrote:
> Hello,
>
> From perusing the code and playing with the module, it seems to me that
> the amd5536udc driver's handling of interrupts is currently "bustificated".
>
> The long story:
>
> During the amd5536udc initialization sequence within udc_pci_probe(),
> the code attempts to request a shared irq for the device thusly:
> request_irq(pdev->irq, udc_irq, IRQF_SHARED, name, dev)
> where 'dev' is the internal state structure. By the time this call is made,
> the 'dev' structure is still not fully initialized and contains blank/zero
> data for many of the fields, in particular both 'dev->lock' and 'dev->regs'
> which are both clearly used within the udc_irq() handler. Those get
> initialized a bit later, namely inside the udc_probe() call at the bottom of
> udc_pci_probe().
>
> It is my understanding that a handler for a shared interrupt can be
> invoked at any time after the corresponding request_irq() call is made,
> simply because some other device on the same interrupt may already
> be active. This leaves us with a very noticeable race condition, where
> udc_irq() can be invoked with uninitialized 'dev' data.
Yes, your analysis appears correct.
>
> In particular, this effect is very noticeable when I try modprobing the
> amd5536udc driver on a kernel that is built with CONFIG_DEBUG_SHIRQ
> set. Given that the debug config option forces an invocation of the irq
> handler from within the request_irq() function, the immediate effect is a
> masterful kernel NULL-dereference OOPS within udc_irq().
>
> The simple fix may be to say that amd5536udc does not support shared
> irqs, and to remove the IRQF_SHARED flag from the request_irq() call. A
That will bust any other hardware that tries to share the interrupt. If
a driver requests the interrupt without IRQF_SHARED, nothing else can
request that interrupt line.
> more complicated fix may be to try to shuffle all the code around to
> make sure that 'dev' is fully initialized before we request the irq, but I
> don't understand the code well enough (yet) to comfortably do this.
Yeah, that's the proper fix.
>
> Comments? Thoughts?
>
> On a side note, it occurs to me that the CONFIG_DEBUG_SHIRQ option
> went into the kernel a bit less than two years ago, and that it exposes a
> very immediate and reproducible OOPS in this driver. Does this mean
> that noone uses the 5536 UDC functionality with any recent kernels?
> Should I be worried? :)
Presumably nobody uses it with CONFIG_DEBUG_SHIRQ, that option wouldn't
normally be used on non-debug kernels..
>
> -- Vadim Lobanov
>
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists