linux-kernel - Re: [PATCH 2/4] swiotlb: Add a new cc-swiotlb implementation for Confidential VMs

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <ccc21265-07aa-cd82-f679-4fee9c51df47@linux.alibaba.com>
Date:   Mon, 30 Jan 2023 21:45:22 +0800
From:   Guorui Yu <GuoRui.Yu@...ux.alibaba.com>
To:     Andi Kleen <ak@...ux.intel.com>, linux-kernel@...r.kernel.org,
        iommu@...ts.linux-foundation.org, konrad.wilk@...cle.com,
        linux-coco@...ts.linux.dev
Cc:     robin.murphy@....com
Subject: Re: [PATCH 2/4] swiotlb: Add a new cc-swiotlb implementation for
 Confidential VMs

Hi Andi,

在 2023/1/30 14:46, Andi Kleen 写道:
> 
>> I try to solve this problem by creating a new kernel thread, "kccd", 
>> to populate the TLB buffer in the backgroud.
>>
>> Specifically,
>> 1. A new kernel thread is created with the help of "arch_initcall", 
>> and this kthread is responsible for memory allocation and setting 
>> memory attributes (private or shared);
>> 2. The "swiotlb_tbl_map_single" routine only use the spin_lock 
>> protected TLB buffers pre-allocated by the kthread;
>>   a) which actually includes ONE memory allocation brought by xarray 
>> insertion "__xa_insert__".
> 
> That already seems dangerous with all the usual problems of memory 
> allocations in IO paths. Normally code at least uses a mempool to avoid 
> the worst dead lock potential.
> 

The __xa_insert__ is called with GFP_NOWAIT (GFP_ATOMIC & ~__GFP_HIGH), 
and I will try to dig into this to check if there is any chance to have 
the deadlock.

I also tried my best to test this piece of code, and no issue have been 
found in the case of a maximum of 700,000 IOPS.

Thanks for your advices from this point, since I have not notice such 
possibility.

>> 3. After each allocation, the water level of TLB resources will be 
>> checked. If the current TLB resources are found to be lower than the 
>> preset value (half of the watermark), the kthread will be awakened to 
>> fill them.
>> 4. The TLB buffer allocation in the kthread is batched to 
>> "(MAX_ORDER_NR_PAGES << PAGE_SHIFT)" to reduce the holding time of 
>> spin_lock and number of calls to set_memory_decrypted().
> 
> Okay, but does this guarantee that it will never run out of memory?
>
> It seems difficult to make such guarantees. What happens for example if 
> the background thread gets starved by something higher priority?
> 
No, this cannot guarantee we always have sufficient TLB caches, so we 
can also have a "No memory for cc-swiotlb buffer" warning.

But I want to emphasize that in this case, the current implementation is 
no worse than the legacy implementation. Moreover, dynamic TLB 
allocation is more suitable for situations where more disks/network 
devices will be hotplugged, in which case you cannot pre-set a 
reasonable value.

> Or if the allocators have such high bandwidth that they can overwhelm 
> any reasonable background thread.
> 
> -Andi
> 

Sincerely,
Guorui