Message-ID: <68418036-ae16-b58c-71d8-bb177fb30b51@oracle.com>
Date: Mon, 17 Sep 2018 15:08:22 -0400
From: Boris Ostrovsky <boris.ostrovsky@...cle.com>
To: Dongli Zhang <dongli.zhang@...cle.com>,
xen-devel@...ts.xenproject.org, linux-kernel@...r.kernel.org
Cc: jgross@...e.com, paul.durrant@...rix.com, wei.liu2@...rix.com,
konrad.wilk@...cle.com, roger.pau@...rix.com,
srinivas.eeda@...cle.com
Subject: Re: [PATCH 1/6] xenbus: prepare data structures and parameter for
xenwatch multithreading
On 9/16/18 9:20 PM, Dongli Zhang wrote:
> Hi Boris,
>
> On 09/17/2018 04:17 AM, Boris Ostrovsky wrote:
>>
>> On 9/14/18 3:34 AM, Dongli Zhang wrote:
>>
>>> +
>>> +struct mtwatch_info {
>>> +	/*
>>> +	 * The mtwatch_domain is put on both a hash table and a list.
>>> +	 * domain_list is used to optimize xenbus_watch un-registration.
>>> +	 *
>>> +	 * The mtwatch_domain is removed from domain_hash (with state set
>>> +	 * to MTWATCH_DOMAIN_DOWN) when its refcnt is zero. However, it is
>>> +	 * left on domain_list until all events belonging to such a
>>> +	 * mtwatch_domain have been processed in mtwatch_thread().
>>
>> Do we really need to keep mtwatch_domain on both lists? Why is keeping it on,
>> say, only the hash not sufficient?
> In the current xenbus implementation, when a watch is unregistered (e.g., via
> unregister_xenbus_watch()), we need to traverse the 'watch_events' list to
> remove all in-flight/pending events (for that watch) from 'watch_events'.
>
> With this patch set, as each domain has its own event list, we need to
> traverse each domain's list to remove the pending events for the watch being
> unregistered.
>
> E.g.,
> unregister_xenbus_watch()-->unregister_mtwatch()-->unregister_all_mtwatch() in
> [PATCH 2/6] xenbus: implement the xenwatch multithreading framework.
>
> Traversing a hash table is not as efficient as traversing a list. That's why a
> domain is kept on both the hash table and the list.
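
(For reference, the traversal described above would look roughly like the
sketch below. The global mtwatch_info pointer, the mtwatch_domain fields
(list_node, events, events_lock) and the event type are assumed names for
illustration, not the actual PATCH 2/6 code.)

/*
 * Sketch only: assumes <linux/list.h>, <linux/spinlock.h>, <linux/slab.h>
 * and a global struct mtwatch_info *mtwatch_info; field names are guesses.
 */
static void unregister_all_mtwatch(struct xenbus_watch *watch)
{
	struct mtwatch_domain *domain;
	struct mtwatch_event *event, *tmp;

	spin_lock(&mtwatch_info->domain_lock);
	/* A flat domain_list lets us visit every per-domU thread directly. */
	list_for_each_entry(domain, &mtwatch_info->domain_list, list_node) {
		spin_lock(&domain->events_lock);
		/* Drop every pending event that targets the dying watch. */
		list_for_each_entry_safe(event, tmp, &domain->events, list) {
			if (event->handle == watch) {
				list_del(&event->list);
				kfree(event);
			}
		}
		spin_unlock(&domain->events_lock);
	}
	spin_unlock(&mtwatch_info->domain_lock);
}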
Keeping the same object on two lists also has costs. More importantly,
IMO, it increases the chances of introducing a bug when people update one
instance but not the other.
>
>>> +	 *
>>> +	 * While there may exist two mtwatch_domain with the same domid on
>>> +	 * domain_list simultaneously,
>>
>> How is it possible to have two entries with the same domid on the list at the
>> same time? Don't you want to remove it (which IIUIC means moving it to the
>> purge list) when the domain is destroyed?
> Here is one case (suppose the domid/frontend-id is 9):
>
> 1. Suppose the last pv driver device is removed from domid=9, and therefore the
> reference count of the per-domU xenwatch thread for domid=9 (which we call the
> old thread below) drops to zero, so the thread should be removed. We remove it
> from the hash table (it is left on the list).
>
> Here we cannot remove the domain from the list immediately, because there might
> be pending events still being processed by the corresponding per-domU xenwatch
> thread. If we removed it from the list while a related watch is being
> unregistered, as mentioned for the last question, we may hit a page fault when
> processing a watch event.
Don't you need to grab domain->domain_mutex to remove the driver?
Meaning that events for that mtwatch thread cannot be processed?
In any case, I think that having two mtwatch_domains for the same domain
should be avoided. (and if you keep the hash list only then this issue
gets resolved automatically ;-))
-boris
>
> 2. Now the administrator immediately attaches a new pv device to domid=9, and
> therefore the reference count is initially set to 1. The per-domU xenwatch
> thread for domid=9 (called the new thread) is created again. It is inserted
> into both the hash table and the list.
>
> 3. As the old thread for domid=9 might still be on the list, we would have two
> threads for domid=9 (the old one to be removed and the newly inserted one to be
> used by new pv devices).
>
> Dongli Zhang
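
(Putting steps 1-3 above into a rough sketch of the teardown side; refcnt,
state, hash_node and list_node are guessed field names and the locking is
simplified, so this is illustrative rather than the patch's actual code.)

/* Sketch only: drops the last reference to the "old" per-domU thread. */
static void mtwatch_put_domain(struct mtwatch_domain *domain)
{
	spin_lock(&mtwatch_info->domain_lock);
	if (--domain->refcnt == 0) {
		/* Hide the domain from new lookups... */
		hlist_del_init(&domain->hash_node);
		domain->state = MTWATCH_DOMAIN_DOWN;
		/*
		 * ...but keep it on domain_list: its kthread may still be
		 * draining pending events.  A later attach for the same
		 * domid allocates a fresh mtwatch_domain and adds it to
		 * both structures, so domain_list briefly holds two
		 * entries for domid=9 until purge_work reaps the old one.
		 */
	}
	spin_unlock(&mtwatch_info->domain_lock);
}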
>
>>
>> -boris
>>
>>
>>> +	 * all mtwatch_domain on domain_hash
>>> +	 * should have a unique domid.
>>> +	 */
>>> +	spinlock_t domain_lock;
>>> +	struct hlist_head domain_hash[MTWATCH_HASH_SIZE];
>>> +	struct list_head domain_list;
>>> +
>>> +	/*
>>> +	 * When a per-domU kthread is going to be destroyed, it is put
>>> +	 * on the purge_list, and will be flushed by purge_work later.
>>> +	 */
>>> +	struct work_struct purge_work;
>>> +	spinlock_t purge_lock;
>>> +	struct list_head purge_list;
>>> +};
>>
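
(For completeness, a rough sketch of how purge_work could drain purge_list;
purge_node and the final cleanup steps are assumptions, not the patch itself.)

/* Sketch only: work handler bound to mtwatch_info->purge_work. */
static void mtwatch_purge_work_fn(struct work_struct *work)
{
	struct mtwatch_domain *domain;
	LIST_HEAD(tmp);

	/* Detach the pending entries under the lock... */
	spin_lock(&mtwatch_info->purge_lock);
	list_splice_init(&mtwatch_info->purge_list, &tmp);
	spin_unlock(&mtwatch_info->purge_lock);

	/* ...then free the fully drained per-domU threads outside it. */
	while (!list_empty(&tmp)) {
		domain = list_first_entry(&tmp, struct mtwatch_domain,
					  purge_node);
		list_del(&domain->purge_node);
		kfree(domain);
	}
}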