Message-ID: <db26b1e3-bd7a-44ee-b458-6cb0fedf6662@linux.intel.com>
Date: Thu, 15 Jan 2026 11:26:50 +0800
From: Baolu Lu <baolu.lu@...ux.intel.com>
To: "Tian, Kevin" <kevin.tian@...el.com>, Joerg Roedel <joro@...tes.org>,
Will Deacon <will@...nel.org>, Robin Murphy <robin.murphy@....com>,
Jason Gunthorpe <jgg@...dia.com>
Cc: Dmytro Maluka <dmaluka@...omium.org>,
Samiullah Khawaja <skhawaja@...gle.com>,
"iommu@...ts.linux.dev" <iommu@...ts.linux.dev>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 1/3] iommu/vt-d: Use 128-bit atomic updates for context
entries
On 1/14/26 15:54, Tian, Kevin wrote:
>> From: Lu Baolu <baolu.lu@...ux.intel.com>
>> Sent: Tuesday, January 13, 2026 11:01 AM
>>
>> On Intel IOMMU, device context entries are accessed by hardware in
>> 128-bit chunks. Currently, the driver updates these entries by
>> programming the 'lo' and 'hi' 64-bit fields individually.
>>
>> This creates a potential race condition where the IOMMU hardware may
>> fetch
>> a context entry while the CPU has only completed one of the two 64-bit
>> writes. This "torn" entry — consisting of half-old and half-new data —
>> could lead to unpredictable hardware behavior, especially when
>> transitioning the 'Present' bit or changing translation types.
>
> this is not accurate. A context entry is 128 bits only. A scalable-mode
> context entry is 256 bits, but only the lower 128 bits are defined, so
> hardware always fetches the context entry atomically. If software then
> ensures the right order of updates (clear Present first, then the other
> bits), the hardware won't look at a partial entry after seeing Present=0.
>
> But, as Dmytro reported, there is currently no barrier in place, so the
> two 64-bit updates to the context entry might be reordered and hardware
> could fetch an entry with the old lower half (Present=1) and the new
> higher half.
>
> A 128-bit atomic operation avoids this ordering concern.
You're right. I will update the commit message to be more precise. Since
the hardware fetches the 128-bit context entry atomically, the issue is
essentially a software ordering problem.
We considered three approaches to solve this:
- Memory barriers (to enforce that the Present bit is cleared before the
  other fields are updated)
- WRITE_ONCE() (to prevent compiler reordering of the two 64-bit stores)
- 128-bit atomic updates
This patch uses the atomic update approach.
>
>> @@ -1170,19 +1170,19 @@ static int domain_context_mapping_one(struct
>> dmar_domain *domain,
>> goto out_unlock;
>>
>> copied_context_tear_down(iommu, context, bus, devfn);
>> - context_clear_entry(context);
>> - context_set_domain_id(context, did);
>> + context_set_domain_id(&new, did);
>
> I wonder whether it's necessary to use an atomic in the attach path,
> from a fix point of view.
>
> The assumption is that the context should have been cleared already
> before calling this function (and following ones). Does it make more
> sense to check the present bit, warning if set, then fail the operation?
> We could refactor them to do atomic updates, but then it's a cleanup
> instead of being part of a fix.
Yes. For the attach path, this is a cleanup rather than a fix.
>
> Then this may be split into three patches:
>
> - change context_clear_entry() to be atomic, to fix the teardown path
> - add a Present bit check in the other functions in this patch, to
>   scrutinize the attach path
> - change those functions to be atomic, as a cleanup
Perhaps this also paves the way for enabling hitless replace in the
attach_dev path?
> Does it make sense?
Yes, it does.
Thanks,
baolu