Message-ID: <4E3646E0.9050400@gmail.com>
Date: Mon, 01 Aug 2011 14:25:36 +0800
From: Liu Yuan <namei.unix@...il.com>
To: "Michael S. Tsirkin" <mst@...hat.com>
CC: Rusty Russell <rusty@...tcorp.com.au>, Avi Kivity <avi@...hat.com>,
kvm@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCH] vhost-blk: An in-kernel accelerator for virtio-blk
On 07/28/2011 11:22 PM, Michael S. Tsirkin wrote:
>
> It would be nicer to reuse the worker infrastructure
> from vhost.c. In particular this one ignores cgroups that
> the owner belongs to if any.
> Does this one do anything that vhost.c doesn't?
>
The main idea behind using a separate thread to handle completion is to
decouple request handling from request completion signalling. This might
allow better scalability in an IO-intensive scenario, since I noted that
the virtio driver can submit as many as 128 requests in one go.

Hmm, I have tried turning the signalling thread into a function that is
called as a vhost worker's work item. I didn't see a regression on my
laptop with iodepth equal to 1, 2 or 3, but request handling and
completion signalling may race at a high request submission rate.
Anyway, I'll adapt it to run as a vhost worker function in v2.
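
Concretely, what I have in mind for v2 is roughly the following (untested
sketch against the drivers/vhost/vhost.h poll/work API; struct vhost_blk,
its fields and the reap helper are placeholders, not final code):

/* Hook the aio completion eventfd into vhost's poll/work machinery so
 * that completion signalling runs on the vhost worker thread and thus
 * inherits the owner's cgroups. Placeholder fields, untested. */
static void handle_completion(struct vhost_work *work)
{
	struct vhost_blk *blk = container_of(work, struct vhost_blk,
					     completion_poll.work);

	/* reap finished io_events and fill the used ring (placeholder) */
	vhost_blk_reap_completions(blk);
	/* one guest notification per batch instead of per request */
	vhost_signal(&blk->dev, &blk->vq);
}

/* setup: the completion eventfd now kicks the vhost worker */
vhost_poll_init(&blk->completion_poll, handle_completion, POLLIN, &blk->dev);
vhost_poll_start(&blk->completion_poll, blk->completion_file);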
>
>> + switch (ioctl) {
>> + case VHOST_NET_SET_BACKEND:
>> + if(copy_from_user(&backend, (void __user *)arg, sizeof backend))
>> + break;
>> + ret = blk_set_backend(blk, &backend);
>> + break;
> Please create your own ioctl for this one.
>
How about changing VHOST_NET_SET_BACKEND to VHOST_SET_BACKEND?
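
If a separate ioctl is preferred instead, I could add something along
these lines (the name and the 0x50 slot below are only placeholders I
picked for illustration, not anything already reserved):

/* sketch for include/linux/vhost.h; VHOST_VIRTIO is the existing 0xAF
 * vhost ioctl magic, the name and number here are placeholders */
#define VHOST_BLK_SET_BACKEND _IOW(VHOST_VIRTIO, 0x50, struct vhost_vring_file)

and the switch above would then match on VHOST_BLK_SET_BACKEND instead.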
>>
>> static int io_submit_one(struct kioctx *ctx, struct iocb __user *user_iocb,
>> struct iocb *iocb, bool compat)
>> diff --git a/fs/eventfd.c b/fs/eventfd.c
>> index d9a5917..6343bc9 100644
>> --- a/fs/eventfd.c
>> +++ b/fs/eventfd.c
>> @@ -406,6 +406,7 @@ struct file *eventfd_file_create(unsigned int count, int flags)
>>
>> return file;
>> }
>> +EXPORT_SYMBOL_GPL(eventfd_file_create);
> You can avoid the need for this export if you pass
> the eventfd in from userspace.
>
Since the eventfd used by the completion code is internal, hiding it from
hw/vhost_blk.c simplifies the configuration. So I think the export is
necessary, and it lets us get rid of unnecessary FD management in
vhost-blk.c.
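
For reference, the in-kernel setup that the export enables looks roughly
like this (untested sketch; blk->completion_file and blk->completion_ctx
are placeholder fields, not the exact RFC code):

/* vhost-blk creates the completion eventfd itself, so userspace never
 * sees or has to manage this fd. Placeholder fields, untested. */
blk->completion_file = eventfd_file_create(0, 0);
if (IS_ERR(blk->completion_file))
	return PTR_ERR(blk->completion_file);

/* the context the aio completion path signals via eventfd_signal() */
blk->completion_ctx = eventfd_ctx_fileget(blk->completion_file);
if (IS_ERR(blk->completion_ctx)) {
	fput(blk->completion_file);
	return PTR_ERR(blk->completion_ctx);
}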
>>
>> SYSCALL_DEFINE2(eventfd2, unsigned int, count, int, flags)
>> {
>> diff --git a/include/linux/aio.h b/include/linux/aio.h
>> index 7a8db41..d63bc04 100644
>> --- a/include/linux/aio.h
>> +++ b/include/linux/aio.h
>> @@ -214,6 +214,37 @@ struct mm_struct;
>> extern void exit_aio(struct mm_struct *mm);
>> extern long do_io_submit(aio_context_t ctx_id, long nr,
>> struct iocb __user *__user *iocbpp, bool compat);
>> +extern void __put_ioctx(struct kioctx *ctx);
>> +extern struct kioctx *ioctx_alloc(unsigned nr_events);
>> +extern struct kiocb *aio_get_req(struct kioctx *ctx);
>> +extern ssize_t aio_run_iocb(struct kiocb *iocb);
>> +extern int __aio_run_iocbs(struct kioctx *ctx);
>> +extern int read_events(struct kioctx *ctx,
>> + long min_nr, long nr,
>> + struct io_event __user *event,
>> + struct timespec __user *timeout);
>> +extern void io_destroy(struct kioctx *ioctx);
>> +extern ssize_t aio_setup_iocb(struct kiocb *kiocb, bool compat);
>> +extern void __put_ioctx(struct kioctx *ctx);
>> +
>> +static inline void get_ioctx(struct kioctx *kioctx)
>> +{
>> + BUG_ON(atomic_read(&kioctx->users) <= 0);
>> + atomic_inc(&kioctx->users);
>> +}
>> +
>> +static inline int try_get_ioctx(struct kioctx *kioctx)
>> +{
>> + return atomic_inc_not_zero(&kioctx->users);
>> +}
>> +
>> +static inline void put_ioctx(struct kioctx *kioctx)
>> +{
>> + BUG_ON(atomic_read(&kioctx->users) <= 0);
>> + if (unlikely(atomic_dec_and_test(&kioctx->users)))
>> + __put_ioctx(kioctx);
>> +}
>> +
>> #else
>> static inline ssize_t wait_on_sync_kiocb(struct kiocb *iocb) { return 0; }
>> static inline int aio_put_req(struct kiocb *iocb) { return 0; }
>> --
>> 1.7.5.1
Other comments will be addressed in V2. Thanks
Yuan