Message-ID: <4E366A1A.2090004@gmail.com>
Date:	Mon, 01 Aug 2011 16:55:54 +0800
From:	Liu Yuan <namei.unix@...il.com>
To:	"Michael S. Tsirkin" <mst@...hat.com>
CC:	Rusty Russell <rusty@...tcorp.com.au>, Avi Kivity <avi@...hat.com>,
	kvm@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCH] vhost-blk: An in-kernel accelerator for virtio-blk

On 08/01/2011 04:12 PM, Michael S. Tsirkin wrote:
> On Mon, Aug 01, 2011 at 02:25:36PM +0800, Liu Yuan wrote:
>> On 07/28/2011 11:22 PM, Michael S. Tsirkin wrote:
>>> It would be nicer to reuse the worker infrastructure
>>> from vhost.c. In particular this one ignores cgroups that
>>> the owner belongs to if any.
>>> Does this one do anything that vhost.c doesn't?
>>>
>> The main idea behind using a separate thread to handle completion is to
>> decouple request handling from request completion signalling. This might
>> allow better scalability in an IO-intensive scenario,
> The code seems to hold the vq mutex though, isn't that right?
> If so, it can't execute in parallel, so it's a bit
> hard to see how this would help scalability.
>

Nope, the V1 completion thread doesn't hold any mutex, so it can run in
parallel with the vhost worker.
Anyway, I'll adapt the completion code to the vhost worker, since it deals
with other stuff like cgroups.
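
Roughly, something like this sketch, assuming the vhost_work_init()/
vhost_work_queue() API in drivers/vhost/vhost.c is usable from the new
file; struct vhost_blk and the handle_completion() name are placeholders:

struct vhost_blk {
	struct vhost_dev dev;
	struct vhost_virtqueue vq;
	struct vhost_work completion_work;	/* runs on the vhost worker */
};

/* Runs in the per-owner vhost worker thread, which already respects the
 * owner's cgroups. */
static void handle_completion(struct vhost_work *work)
{
	struct vhost_blk *blk =
		container_of(work, struct vhost_blk, completion_work);
	/* ... reap finished aio requests and signal the guest ... */
}

static void vhost_blk_setup_completion(struct vhost_blk *blk)
{
	/* Bind the handler once at setup time. */
	vhost_work_init(&blk->completion_work, handle_completion);
}

static void vhost_blk_kick_completion(struct vhost_blk *blk)
{
	/* From the aio completion path, just queue onto the worker. */
	vhost_work_queue(&blk->dev, &blk->completion_work);
}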

> I had a feeling other devices might want some other structure
> (not an fd) as a backend. Maybe that was wrong ...
> If so, pls do that, and #define VHOST_NET_SET_BACKEND VHOST_SET_BACKEND
> for compatibility.
>

Okay.
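
I.e., something along these lines (sketch only; 0x30 is the existing
VHOST_NET_SET_BACKEND ioctl number in include/linux/vhost.h):

/* Generic backend ioctl, shared by net and blk. */
#define VHOST_SET_BACKEND _IOW(VHOST_VIRTIO, 0x30, struct vhost_vring_file)
/* Keep the old name as an alias so existing userspace keeps working. */
#define VHOST_NET_SET_BACKEND VHOST_SET_BACKEND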

>>>>   static int io_submit_one(struct kioctx *ctx, struct iocb __user *user_iocb,
>>>>   			 struct iocb *iocb, bool compat)
>>>> diff --git a/fs/eventfd.c b/fs/eventfd.c
>>>> index d9a5917..6343bc9 100644
>>>> --- a/fs/eventfd.c
>>>> +++ b/fs/eventfd.c
>>>> @@ -406,6 +406,7 @@ struct file *eventfd_file_create(unsigned int count, int flags)
>>>>
>>>>   	return file;
>>>>   }
>>>> +EXPORT_SYMBOL_GPL(eventfd_file_create);
>>> You can avoid the need for this export if you pass
>>> the eventfd in from userspace.
>>>
>> Since the eventfd used by the completion code is internal, hiding it from
>> hw/vhost_blk.c would simplify the configuration. So I think this export is
>> necessary; it gets rid of unnecessary FD management in vhost-blk.c.
> Well this is a new kernel interface duplicating the functionality of the
> old one.  You'll have a hard time selling this idea upstream, I suspect.
> And I doubt it simplifies the code significantly.
> Further, you have a single vq for block, but net has two and
> we do want the flexibility of using a single eventfd for both,
> I think.
>
Point taken.
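
For the record, the userspace-side alternative is the pattern vhost
already uses for the call/kick eventfds; a minimal sketch, with vhost_fd
assumed to be an open vhost device fd:

#include <sys/eventfd.h>
#include <sys/ioctl.h>
#include <linux/vhost.h>

static int setup_call_eventfd(int vhost_fd)
{
	/* Userspace creates the completion eventfd itself ... */
	int call_fd = eventfd(0, EFD_NONBLOCK | EFD_CLOEXEC);
	struct vhost_vring_file file = { .index = 0, .fd = call_fd };

	/* ... and hands it to the kernel for vq 0, so no in-kernel
	 * eventfd_file_create() export is needed. */
	return ioctl(vhost_fd, VHOST_SET_VRING_CALL, &file);
}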

>>>>   SYSCALL_DEFINE2(eventfd2, unsigned int, count, int, flags)
>>>>   {
>>>> diff --git a/include/linux/aio.h b/include/linux/aio.h
>>>> index 7a8db41..d63bc04 100644
>>>> --- a/include/linux/aio.h
>>>> +++ b/include/linux/aio.h
>>>> @@ -214,6 +214,37 @@ struct mm_struct;
>>>>   extern void exit_aio(struct mm_struct *mm);
>>>>   extern long do_io_submit(aio_context_t ctx_id, long nr,
>>>>   			 struct iocb __user *__user *iocbpp, bool compat);
>>>> +extern void __put_ioctx(struct kioctx *ctx);
>>>> +extern struct kioctx *ioctx_alloc(unsigned nr_events);
>>>> +extern struct kiocb *aio_get_req(struct kioctx *ctx);
>>>> +extern ssize_t aio_run_iocb(struct kiocb *iocb);
>>>> +extern int __aio_run_iocbs(struct kioctx *ctx);
>>>> +extern int read_events(struct kioctx *ctx,
>>>> +                        long min_nr, long nr,
>>>> +                        struct io_event __user *event,
>>>> +                        struct timespec __user *timeout);
>>>> +extern void io_destroy(struct kioctx *ioctx);
>>>> +extern ssize_t aio_setup_iocb(struct kiocb *kiocb, bool compat);
>>>> +extern void __put_ioctx(struct kioctx *ctx);
>>>> +
>>>> +static inline void get_ioctx(struct kioctx *kioctx)
>>>> +{
>>>> +        BUG_ON(atomic_read(&kioctx->users) <= 0);
>>>> +        atomic_inc(&kioctx->users);
>>>> +}
>>>> +
>>>> +static inline int try_get_ioctx(struct kioctx *kioctx)
>>>> +{
>>>> +        return atomic_inc_not_zero(&kioctx->users);
>>>> +}
>>>> +
>>>> +static inline void put_ioctx(struct kioctx *kioctx)
>>>> +{
>>>> +        BUG_ON(atomic_read(&kioctx->users) <= 0);
>>>> +        if (unlikely(atomic_dec_and_test(&kioctx->users)))
>>>> +                __put_ioctx(kioctx);
>>>> +}
>>>> +
>>>>   #else
>>>>   static inline ssize_t wait_on_sync_kiocb(struct kiocb *iocb) { return 0; }
>>>>   static inline int aio_put_req(struct kiocb *iocb) { return 0; }
>>>> -- 
>>>> 1.7.5.1
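
For illustration only (not part of the patch): how a hypothetical
kernel-side caller would pair the refcount helpers quoted above:

static int use_ctx(struct kioctx *ctx)
{
	if (!try_get_ioctx(ctx))	/* fails once the ctx is going away */
		return -ESHUTDOWN;
	/* ... safe to use ctx: run iocbs, reap events via read_events() ... */
	put_ioctx(ctx);			/* may free ctx via __put_ioctx() */
	return 0;
}
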
>> Other comments will be addressed in V2. Thanks
>>
>> Yuan
