Date: Tue, 10 Aug 2021 11:02:14 +0800
From: Jason Wang <jasowang@...hat.com>
To: Yongji Xie <xieyongji@...edance.com>, Robin Murphy <robin.murphy@....com>
Cc: kvm <kvm@...r.kernel.org>, "Michael S. Tsirkin" <mst@...hat.com>,
	virtualization <virtualization@...ts.linux-foundation.org>,
	Christian Brauner <christian.brauner@...onical.com>,
	Jonathan Corbet <corbet@....net>,
	Matthew Wilcox <willy@...radead.org>,
	Christoph Hellwig <hch@...radead.org>,
	Dan Carpenter <dan.carpenter@...cle.com>,
	Stefano Garzarella <sgarzare@...hat.com>,
	Liu Xiaodong <xiaodong.liu@...el.com>,
	Joe Perches <joe@...ches.com>,
	Al Viro <viro@...iv.linux.org.uk>,
	Stefan Hajnoczi <stefanha@...hat.com>,
	songmuchun@...edance.com,
	Jens Axboe <axboe@...nel.dk>,
	He Zhe <zhe.he@...driver.com>,
	Greg KH <gregkh@...uxfoundation.org>,
	Randy Dunlap <rdunlap@...radead.org>,
	linux-kernel <linux-kernel@...r.kernel.org>,
	iommu@...ts.linux-foundation.org, bcrl@...ck.org,
	netdev@...r.kernel.org, linux-fsdevel@...r.kernel.org,
	Mika Penttilä <mika.penttila@...tfour.com>
Subject: Re: [PATCH v10 01/17] iova: Export alloc_iova_fast() and free_iova_fast()

On 2021/8/9 1:56 PM, Yongji Xie wrote:
> On Thu, Aug 5, 2021 at 9:31 PM Jason Wang <jasowang@...hat.com> wrote:
>>
>> On 2021/8/5 8:34 PM, Yongji Xie wrote:
>>>> My main point, though, is that if you've already got something else
>>>> keeping track of the actual addresses, then the way you're using an
>>>> iova_domain appears to be something you could do with a trivial bitmap
>>>> allocator. That's why I don't buy the efficiency argument. The main
>>>> design points of the IOVA allocator are to manage large address spaces
>>>> while trying to maximise spatial locality to minimise the underlying
>>>> pagetable usage, and allocating with a flexible limit to support
>>>> multiple devices with different addressing capabilities in the same
>>>> address space. If none of those aspects are relevant to the use-case -
>>>> which AFAICS appears to be true here - then as a general-purpose
>>>> resource allocator it's rubbish and has an unreasonably massive memory
>>>> overhead and there are many, many better choices.
>>>>
>>> OK, I get your point. Actually, we used the genpool allocator in an
>>> early version. Maybe we can fall back to using it.
>>
>> I think maybe you can share some perf numbers to see how much
>> alloc_iova_fast() can help.
>>
> I did some fio tests[1] with a ram-backed vduse block device[2].
>
> Here are the performance numbers:
>
>                    numjobs=1   numjobs=2   numjobs=4   numjobs=8
>  iova_alloc_fast   145k iops   265k iops   514k iops   758k iops
>  iova_alloc        137k iops   170k iops   128k iops   113k iops
>  gen_pool_alloc    143k iops   270k iops   458k iops   521k iops
>
> iova_alloc_fast() has the best performance since we always hit the
> per-cpu cache. Even leaving the per-cpu cache aside, the genpool
> allocator should be better than the iova allocator.

I think we see convincing numbers in favor of iova_alloc_fast() over
gen_pool_alloc() (a 45% improvement at numjobs=8).

Thanks

>
> [1] fio jobfile:
>
> [global]
> rw=randread
> direct=1
> ioengine=libaio
> iodepth=16
> time_based=1
> runtime=60s
> group_reporting
> bs=4k
> filename=/dev/vda
> [job]
> numjobs=..
>
> [2] $ qemu-storage-daemon \
>       --chardev socket,id=charmonitor,path=/tmp/qmp.sock,server,nowait \
>       --monitor chardev=charmonitor \
>       --blockdev driver=host_device,cache.direct=on,aio=native,filename=/dev/nullb0,node-name=disk0 \
>       --export type=vduse-blk,id=test,node-name=disk0,writable=on,name=vduse-null,num-queues=16,queue-size=128
>
> The qemu-storage-daemon can be built from this repo:
> https://github.com/bytedance/qemu/tree/vduse-test
>
> Thanks,
> Yongji
>
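For readers comparing the two APIs discussed in this thread, below is a
minimal sketch of how each allocation path is typically driven. This is a
hypothetical fragment, not code from the VDUSE series: the DEMO_* constants
and demo_* names are invented for illustration, and the limit-pfn convention
follows what the iommu dma layer does with alloc_iova_fast().

/*
 * Hypothetical comparison of the two allocators discussed above.
 * All names and sizes here are illustrative, not from the patchset.
 */
#include <linux/genalloc.h>
#include <linux/iova.h>
#include <linux/mm.h>

#define DEMO_BASE	0x10000000UL		/* arbitrary window start */
#define DEMO_SIZE	(64UL << 20)		/* arbitrary 64 MiB window */

static struct gen_pool *demo_pool;
static struct iova_domain demo_iovad;

static int demo_setup(void)
{
	/* genpool path: one contiguous range, page-sized minimum unit */
	demo_pool = gen_pool_create(PAGE_SHIFT, -1);
	if (!demo_pool)
		return -ENOMEM;
	if (gen_pool_add(demo_pool, DEMO_BASE, DEMO_SIZE, -1)) {
		gen_pool_destroy(demo_pool);
		return -ENOMEM;
	}

	/*
	 * iova path: domain with a page-sized granule; on kernels of this
	 * era, init_iova_domain() also sets up the per-cpu rcaches.
	 */
	init_iova_domain(&demo_iovad, PAGE_SIZE, DEMO_BASE >> PAGE_SHIFT);
	return 0;
}

static void demo_map_one_page(void)
{
	unsigned long last_pfn = (DEMO_BASE + DEMO_SIZE) >> PAGE_SHIFT;
	unsigned long addr, pfn;

	/* genpool: allocates from a shared bitmap, returns an address */
	addr = gen_pool_alloc(demo_pool, PAGE_SIZE);
	if (addr)
		gen_pool_free(demo_pool, addr, PAGE_SIZE);

	/*
	 * iova fast path: tries the per-cpu rcache before the rbtree.
	 * The size is in pages and the return value is a pfn, not an
	 * address; 0 means allocation failure.
	 */
	pfn = alloc_iova_fast(&demo_iovad, 1, last_pfn - 1, true);
	if (pfn)
		free_iova_fast(&demo_iovad, pfn, 1);
}

The per-cpu rcache is what keeps iova_alloc_fast() scaling in the numjobs
table above: most allocations and frees never touch shared state, whereas
gen_pool_alloc() always operates on a bitmap shared by all CPUs.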
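For contrast, the "trivial bitmap allocator" Robin alludes to could look
roughly like the sketch below. Again purely hypothetical: the demo_*
structure and helpers are invented to illustrate the idea of one bit per
page guarded by a single lock, not taken from any posted patch.

/*
 * Hypothetical sketch of a trivial bitmap allocator: one bit per page
 * of a fixed region, protected by a single spinlock.
 */
#include <linux/bitmap.h>
#include <linux/spinlock.h>
#include <linux/slab.h>

struct demo_bitmap_alloc {
	spinlock_t lock;
	unsigned long *map;	/* one bit per page */
	unsigned long npages;
	unsigned long base_pfn;
};

static int demo_bitmap_init(struct demo_bitmap_alloc *a,
			    unsigned long base_pfn, unsigned long npages)
{
	a->map = bitmap_zalloc(npages, GFP_KERNEL);
	if (!a->map)
		return -ENOMEM;
	spin_lock_init(&a->lock);
	a->base_pfn = base_pfn;
	a->npages = npages;
	return 0;
}

static unsigned long demo_bitmap_alloc_pages(struct demo_bitmap_alloc *a,
					     unsigned long nr)
{
	unsigned long start;

	spin_lock(&a->lock);
	start = bitmap_find_next_zero_area(a->map, a->npages, 0, nr, 0);
	if (start >= a->npages) {
		spin_unlock(&a->lock);
		return 0;	/* 0 == failure; pfn 0 is never handed out */
	}
	bitmap_set(a->map, start, nr);
	spin_unlock(&a->lock);

	return a->base_pfn + start;
}

static void demo_bitmap_free_pages(struct demo_bitmap_alloc *a,
				   unsigned long pfn, unsigned long nr)
{
	spin_lock(&a->lock);
	bitmap_clear(a->map, pfn - a->base_pfn, nr);
	spin_unlock(&a->lock);
}

The single lock is the obvious scalability limit of this approach, but per
Robin's argument its memory overhead is tiny next to an iova_domain when
none of the iova allocator's range-management features are actually needed.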