Message-ID: <bdeef865-a326-75ce-a1d0-b5d0c5a44e14@linux.alibaba.com>
Date: Wed, 1 Feb 2023 10:08:48 +0800
From: Guorui Yu <GuoRui.Yu@...ux.alibaba.com>
To: Andi Kleen <ak@...ux.intel.com>, linux-kernel@...r.kernel.org,
iommu@...ts.linux-foundation.org, konrad.wilk@...cle.com,
linux-coco@...ts.linux.dev
Cc: robin.murphy@....com
Subject: Re: [PATCH 2/4] swiotlb: Add a new cc-swiotlb implementation for
Confidential VMs
On 2023/2/1 01:16, Andi Kleen wrote:
>> No, this cannot guarantee we always have sufficient TLB caches, so we
>> can also have a "No memory for cc-swiotlb buffer" warning.
>
> It's not just a warning, it will be IO errors, right?
>
Yes, they are IO errors, but in my limited testing so far such IO errors
are not fatal, and the system survives them. Legacy swiotlb also
occasionally suffers from TLB starvation. However, if dynamic allocation
of TLB buffers is not allowed at all, the system is more likely to be
overwhelmed by a large burst of IOs and become unresponsive. Such
problems are generally transient, so they are difficult to reproduce and
debug in a production environment. Users can only set an unreasonably
large fixed size and REBOOT to mitigate the problem as much as possible.
>>
>> But I want to emphasize that in this case, the current implementation
>> is no worse than the legacy implementation. Moreover, dynamic TLB
>> allocation is more suitable for situations where more disks/network
>> devices will be hotplugged, in which case you cannot pre-set a
>> reasonable value.
>
> That's a reasonable standpoint, but you have to emphasize that it is
> "probabilistic" in all the descriptions and comments.
>
Agreed, but one point to add is that the user can adjust the water-level
setting to reduce the possibility of TLB allocation failures in
interrupt context.
Under the current design, the kthread is woken to allocate new TLBs when
the free pool drops below half of the water level, so more headroom can
be left by increasing the water level.
> I assume you did some stress testing (E.g. all cores submitting at full
> bandwidth) to validate that it works for you?
>
> -Andi
>
Yes, I tested with fio using different block sizes, iodepths, and job
counts on my testbed.
I noticed some "IO errors" of `No memory for cc-swiotlb buffer` at the
beginning of the test, but they eventually disappeared as long as there
was enough free memory.
Thanks for your time,
Guorui