Message-ID: <f4626354-b466-4fb7-9555-646877fd88d6@linux.intel.com>
Date: Mon, 24 Feb 2025 10:53:54 +0800
From: Ethan Zhao <haifeng.zhao@...ux.intel.com>
To: yunhui cui <cuiyunhui@...edance.com>
Cc: dwmw2@...radead.org, baolu.lu@...ux.intel.com, joro@...tes.org,
 will@...nel.org, robin.murphy@....com, iommu@...ts.linux.dev,
 linux-kernel@...r.kernel.org
Subject: Re: [External] Re: [PATCH] iommu/vt-d: fix system hang on reboot -f

On 2025/2/21 17:46, yunhui cui wrote:
> Hi Ethan,
>
> On Fri, Feb 21, 2025 at 4:40 PM Ethan Zhao <haifeng.zhao@...ux.intel.com> wrote:
>>
>> On 2025/2/20 18:15, Yunhui Cui wrote:
>>> When entering intel_iommu_shutdown, system interrupts are disabled,
>> System interrupts are disabled? If you mean that all interrupts are
>> disabled when entering intel_iommu_shutdown(), that is perhaps not true,
>> at least for the latest upstream code.
>>
>>> and the reboot process might be scheduled out by down_write(). If the
>>> scheduled process does not yield (e.g., while(1)), the system will hang.
>> Doesn't the NMI lockup watchdog fire here?
> Steps to reproduce:
>
> 1. Make sure the early return below is not taken:
> if (no_iommu || dmar_disabled)
>      return;
>
> 2. Write a.out with while(1).
>
> 3. ./a.out & reboot -f
>
> 4. Observe. Send NMI via BIOS to check system response.
>
> 5. Add console=ttyS0,115200 to cmdline to increase reproduction chance.
>
> Let's continue discussing based on the above.

I will try these steps to reproduce.

Per the latest upstream code, the local processor's interrupt mask is
cleared, so the processor can accept and handle interrupts. Legacy
interrupts should also be restored for the later boot if there is a legacy
device, and as for NMI, nothing can block it. In short, perhaps on your
hardware configuration no interrupt event arrives to kick the scheduler
once a.out (while(1)) has been scheduled in, but that is not because all
system interrupts are disabled.


Thanks,
Ethan

>> Thanks,
>> Ethan
>>
>>> Signed-off-by: Yunhui Cui <cuiyunhui@...edance.com>
>>> ---
>>>    drivers/iommu/intel/iommu.c | 3 ++-
>>>    1 file changed, 2 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
>>> index cc46098f875b..76a1d83b46bf 100644
>>> --- a/drivers/iommu/intel/iommu.c
>>> +++ b/drivers/iommu/intel/iommu.c
>>> @@ -2871,7 +2871,8 @@ void intel_iommu_shutdown(void)
>>>        if (no_iommu || dmar_disabled)
>>>                return;
>>>
>>> -     down_write(&dmar_global_lock);
>>> +     if (!down_write_trylock(&dmar_global_lock))
>>> +             return;
>>>
>>>        /* Disable PMRs explicitly here. */
>>>        for_each_iommu(iommu, drhd)
>> --
>> "firm, enduring, strong, and long-lived"
>>
> Thanks,
> Yunhui
>
-- 
"firm, enduring, strong, and long-lived"

