Open Source and information security mailing list archives
Message-ID: <d3cd4427-58a3-417b-a409-81d31110faeb@linux.intel.com>
Date: Tue, 29 Jul 2025 10:08:53 +0800
From: Baolu Lu <baolu.lu@...ux.intel.com>
To: Yu Zhang <zhangyu1@...ux.microsoft.com>
Cc: Dave Hansen <dave.hansen@...el.com>, Jason Gunthorpe <jgg@...dia.com>,
 Joerg Roedel <joro@...tes.org>, Will Deacon <will@...nel.org>,
 Robin Murphy <robin.murphy@....com>, Kevin Tian <kevin.tian@...el.com>,
 Jann Horn <jannh@...gle.com>, Vasant Hegde <vasant.hegde@....com>,
 Alistair Popple <apopple@...dia.com>, Peter Zijlstra <peterz@...radead.org>,
 Uladzislau Rezki <urezki@...il.com>,
 Jean-Philippe Brucker <jean-philippe@...aro.org>,
 Andy Lutomirski <luto@...nel.org>, "Tested-by : Yi Lai" <yi1.lai@...el.com>,
 iommu@...ts.linux.dev, security@...nel.org, linux-kernel@...r.kernel.org,
 stable@...r.kernel.org
Subject: Re: [PATCH v2 1/1] iommu/sva: Invalidate KVA range on kernel TLB
 flush

On 7/29/25 01:36, Yu Zhang wrote:
> On Thu, Jul 24, 2025 at 11:01:12AM +0800, Baolu Lu wrote:
>> On 7/11/25 16:17, Yu Zhang wrote:
>>> On Thu, Jul 10, 2025 at 08:26:06AM -0700, Dave Hansen wrote:
>>>> On 7/10/25 06:22, Jason Gunthorpe wrote:
>>>>>> Why does this matter? We flush the CPU TLB in a bunch of different ways,
>>>>>> _especially_ when it's being done for kernel mappings. For example,
>>>>>> __flush_tlb_all() is a non-ranged kernel flush which has a completely
>>>>>> parallel implementation with flush_tlb_kernel_range(). Call sites that
>>>>>> use _it_ are unaffected by the patch here.
>>>>>>
>>>>>> Basically, if we're only worried about vmalloc/vfree freeing page
>>>>>> tables, then this patch is OK. If the problem is bigger than that, then
>>>>>> we need a more comprehensive patch.
>>>>> I think we are worried about any place that frees page tables.
>>>> The two places that come to mind are the remove_memory() code and
>>>> __change_page_attr().
>>>>
>>>> The remove_memory() gunk is in arch/x86/mm/init_64.c. It has a few sites
>>>> that do flush_tlb_all(). Now that I'm looking at it, there look to be
>>>> some races between freeing page table pages and flushing the TLB. But,
>>>> basically, if you stick to the sites in there that do flush_tlb_all()
>>>> after free_pagetable(), you should be good.
>>>>
>>>> As for the __change_page_attr() code, I think the only spot you need to
>>>> hit is cpa_collapse_large_pages() and maybe the one in
>>>> __split_large_page() as well.
>>>>
>>>> This is all disturbingly ad-hoc, though. The remove_memory() code needs
>>>> fixing and I'll probably go try to bring some order to the chaos in the
>>>> process of fixing it up. But that's a separate problem from this IOMMU fun.
>>>>
>>> Could we consider splitting flush_tlb_kernel_range() into 2 different
>>> versions:
>>> - one which only flushes the CPU TLB
>>> - one which flushes the CPU paging structure cache and then notifies the
>>>     IOMMU to do the same (e.g., in pud_free_pmd_page()/pmd_free_pte_page())?
>>  From the perspective of an IOMMU, there is no need to split. IOMMU SVA
>> only allows the device to access user-space memory with user
>> permission. Access to kernel address space with privileged permission
>> is not allowed. Therefore, the IOMMU subsystem only needs a callback to
>> invalidate the paging structure cache.
> Thanks Baolu.
> 
> Indeed. That's why I was wondering if we could split flush_tlb_kernel_range()
> into 2 versions - one used only after a kernel virtual address range is
> unmapped, and another one used after a kernel paging structure is freed.
> Only the 2nd one needs to notify the IOMMU subsystem.

Yeah! That sounds better.

Thanks,
baolu
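
For readers following the thread, the split proposed above can be pictured with
a small userspace model. This is only a sketch under stated assumptions, not
kernel code and not the eventual patch: every name below
(model_cpu_flush_range, model_iommu_invalidate_range,
flush_kernel_range_unmapped, flush_kernel_range_pgtable_freed) is hypothetical,
and the counters merely stand in for real CPU and IOMMU invalidations.

```c
/* Userspace model only: counters stand in for hardware invalidations. */
static unsigned int cpu_tlb_flushes;
static unsigned int iommu_invalidations;

/* Hypothetical stand-in for the ranged CPU-side kernel TLB flush. */
static void model_cpu_flush_range(unsigned long start, unsigned long end)
{
	(void)start;
	(void)end;
	cpu_tlb_flushes++;
}

/* Hypothetical stand-in for the IOMMU paging-structure-cache callback. */
static void model_iommu_invalidate_range(unsigned long start, unsigned long end)
{
	(void)start;
	(void)end;
	iommu_invalidations++;
}

/*
 * Hypothetical variant 1: a kernel virtual address range was unmapped,
 * but no page-table page was freed. Only the CPU side is flushed.
 */
static void flush_kernel_range_unmapped(unsigned long start, unsigned long end)
{
	model_cpu_flush_range(start, end);
}

/*
 * Hypothetical variant 2: a kernel paging structure was freed.
 * The CPU side is flushed and the IOMMU subsystem is notified as well.
 */
static void flush_kernel_range_pgtable_freed(unsigned long start,
					     unsigned long end)
{
	model_cpu_flush_range(start, end);
	model_iommu_invalidate_range(start, end);
}
```

In this model, a caller that only unmaps a range would use the first variant
and leave the IOMMU counter untouched, while a caller that frees a page-table
page would use the second, which bumps both counters. Which real call sites
fall into which group is exactly what the thread is still discussing.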
