[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <4c24f2d2-03fd-a6cb-f950-391f3f7837cb@intel.com>
Date: Tue, 17 Dec 2019 11:19:28 -0800
From: "Chen, Yian" <yian.chen@...el.com>
To: Barret Rhoden <brho@...gle.com>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
"H. Peter Anvin" <hpa@...or.com>,
David Woodhouse <dwmw2@...radead.org>,
Joerg Roedel <joro@...tes.org>,
Sohil Mehta <sohil.mehta@...el.com>
Cc: linux-kernel@...r.kernel.org, iommu@...ts.linux-foundation.org,
x86@...nel.org
Subject: Re: [PATCH 1/3] iommu/vt-d: skip RMRR entries that fail the sanity
check
On 12/16/2019 11:35 AM, Barret Rhoden wrote:
> On 12/16/19 2:07 PM, Chen, Yian wrote:
>>
>>
>> On 12/11/2019 11:46 AM, Barret Rhoden wrote:
>>> RMRR entries describe memory regions that are DMA targets for devices
>>> outside the kernel's control.
>>>
>>> RMRR entries that fail the sanity check are pointing to regions of
>>> memory that the firmware did not tell the kernel are reserved or
>>> otherwise should not be used.
>>>
>>> Instead of aborting DMAR processing, this commit skips these RMRR
>>> entries. They will not be mapped into the IOMMU, but the IOMMU can
>>> still be utilized. If anything, when the IOMMU is on, those devices
>>> will not be able to clobber RAM that the kernel has allocated from
>>> those
>>> regions.
>>>
>>> Signed-off-by: Barret Rhoden <brho@...gle.com>
>>> ---
>>> drivers/iommu/intel-iommu.c | 2 +-
>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
>>> index f168cd8ee570..f7e09244c9e4 100644
>>> --- a/drivers/iommu/intel-iommu.c
>>> +++ b/drivers/iommu/intel-iommu.c
>>> @@ -4316,7 +4316,7 @@ int __init dmar_parse_one_rmrr(struct
>>> acpi_dmar_header *header, void *arg)
>>> rmrr = (struct acpi_dmar_reserved_memory *)header;
>>> ret = arch_rmrr_sanity_check(rmrr);
>>> if (ret)
>>> - return ret;
>>> + return 0;
>>> rmrru = kzalloc(sizeof(*rmrru), GFP_KERNEL);
>>> if (!rmrru)
>> Parsing rmrr function should report the error to caller. The behavior
>> to response the error can be
>> chose by the caller in the calling stack, for example,
>> dmar_walk_remapping_entries().
>> A concern is that ignoring a detected firmware bug might have a
>> potential side impact though
>> it seemed safe for your case.
>
> That's a little difficult given the current code. Once we are in
> dmar_walk_remapping_entries(), the specific function (parse_one_rmrr)
> is called via callback:
>
> ret = cb->cb[iter->type](iter, cb->arg[iter->type]);
> if (ret)
> return ret;
>
> If there's an error of any sort, it aborts the walk. Handling the
> specific errors here is difficult, since we don't know what the errors
> mean to the specific callback. Is there some errno we can use that
> means "there was a problem, but it's not so bad that you have to
> abort, but I figured you ought to know"? Not that I think that's a
> good idea.
>
> The knowledge of whether or not a specific error is worth aborting all
> DMAR functionality is best known inside the specific callback. The
> only handling to do is print a warning and either skip it or abort.
>
> I think skipping the entry for a bad RMRR is better than aborting
> completely, though I understand if people don't like that. It's
> debatable. By aborting, we lose the ability to use the IOMMU at all,
> but we are still in a situation where the devices using the RMRR
> regions might be clobbering kernel memory, right? Using the IOMMU
> (with no mappings for the bad RMRRs) would stop those devices from
> clobbering memory.
>
> Regardless, I have two other patches in this series that could resolve
> the problem for me and probably other people. I'd just like at least
> one of the three patches to get merged so that my machine boots when
> the original commit f036c7fa0ab6 ("iommu/vt-d: Check VT-d RMRR region
> in BIOS is reported as reserved") gets released.
>
when a firmware bug appears, the potential problem may beyond the scope
of its visible impacts so that introducing a workaround in official
implementation should be considered very carefully.
If the workaround is really needed at this point, I would recommend
adding a WARN_TAINT with TAINT_FIRMWARE_WORKAROUND, to tell the
workaround is in the place.
Thanks
Yian
> Thanks,
>
> Barret
>
Powered by blists - more mailing lists