[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <2893ebc0-325f-46e2-8694-7e6263e2a66b@redhat.com>
Date: Fri, 26 Sep 2025 16:12:39 -0400
From: David Arcari <darcari@...hat.com>
To: alex@....works
Cc: linux-kernel@...r.kernel.org, Jerry Hoemann <jerry.hoemann@....com>,
alexander.shishkin@...ux.intel.com
Subject: Re: [PATCH] intel_th: core: fix null pointer dereference in
intel_th_irq
Hi Alex,
On 9/26/25 12:19 PM, alex@....works wrote:
> On 2025-08-25 20:45, David Arcari wrote:
>> In certain cases intel_th_irq can reference a null entry in
>> the th->thdev array. This results in the splat shown below.
>> The problem is that intel_th_output_enable() can modify the
>> thdev[] array at the same time intel_th_irq is referencing
>> the same array. This can be fixed by disabling interrupts
>> during the call to intel_th_output_enable().
>
> Hi David,
>
> Thank you for the bug report and rootcausing! Can you please also
> detail the sequence of actions by which this is reproduced, so
> that I can test my fix and not bother you with a back-and-forth
> over-email debugging and also add it to our regression testing?
> Doesn't have to be a shell script (although I wouldn't say no
> to that), plain english would work in a pinch. If you have the
> time, I'm also curious about your use case for intel_th.
Unfortunately, I don't have a great reproducer. I have a system which
is afflicted in ~ 1 out of every 100 reboots. Adding debug code made
the problem easier to reproduce.
>
> This has eluded our testing for about 10 years, so I'm very
> interested in the reproducer.
>
>> BUG: kernel NULL pointer dereference, address: 0000000000000304
>> Oops: Oops: 0000 [#1] SMP NOPTI
>> RIP: 0010:intel_th_irq+0x26/0x70 [intel_th]
>
> Yes, this is absolutely a bug.
>
>> @@ -715,7 +715,9 @@ intel_th_subdevice_alloc(struct intel_th *th,
>> int intel_th_output_enable(struct intel_th *th, unsigned int otype)
>> {
>> struct intel_th_device *thdev;
>> - int src = 0, dst = 0;
>> + int src = 0, dst = 0, ret = 0;
>> +
>> + disable_irq(th->irq);
>>
>> for (src = 0, dst = 0; dst <= th->num_thdevs; src++, dst++) {
>> for (; src < ARRAY_SIZE(intel_th_subdevices); src++) {
>
> [...]
>
>> @@ -750,16 +752,19 @@ int intel_th_output_enable(struct intel_th *th,
>> unsigned int otype)
>> goto found;
>> }
>>
>> +nodev:
>> + enable_irq(th->irq);
>> return -ENODEV;
>>
>> found:
>> thdev = intel_th_subdevice_alloc(th, &intel_th_subdevices[src]);
>> if (IS_ERR(thdev))
>> - return PTR_ERR(thdev);
>> -
>> - th->thdev[th->num_thdevs++] = thdev;
>> + ret = PTR_ERR(thdev);
>> + else
>> + th->thdev[th->num_thdevs++] = thdev;
>>
>> - return 0;
>> + enable_irq(th->irq);
>> + return ret;
>> }
>> EXPORT_SYMBOL_GPL(intel_th_output_enable);
>
> This is indeed a possible fix, but I believe a little bit of
> serialization can be employed here.
I was thinking there was a better approach. Given the situation if
you'd like me to test a fix, I can do so,
>
> Lastly, my apologies for tardiness.
No worries.
Best,
-DA
>
> Thanks!
> --
> Alex
>
Powered by blists - more mailing lists