lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Fri, 24 Apr 2015 08:42:11 +0000
From:	Dexuan Cui <decui@...rosoft.com>
To:	Vitaly Kuznetsov <vkuznets@...hat.com>,
	KY Srinivasan <kys@...rosoft.com>
CC:	Haiyang Zhang <haiyangz@...rosoft.com>,
	"devel@...uxdriverproject.org" <devel@...uxdriverproject.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: RE: [PATCH 6/6] Drivers: hv: vmbus: do a fair round robin when
 selecting an outgoing channel

> -----Original Message-----
> From: Vitaly Kuznetsov [mailto:vkuznets@...hat.com]
> Sent: Tuesday, April 21, 2015 22:28
> To: KY Srinivasan
> Cc: Haiyang Zhang; devel@...uxdriverproject.org; linux-
> kernel@...r.kernel.org; Dexuan Cui
> Subject: [PATCH 6/6] Drivers: hv: vmbus: do a fair round robin when
> selecting an outgoing channel
> 
> vmbus_get_outgoing_channel() implements the following algorithm for
> selecting
> an outgoing channel (despite the comment before the function saying it
> distributes the load equally):

Yeah, I also found the issue. 

> 1) If we have no subchannels return the primary channel;
> 2) If primary->next_oc is grater than primary->num_sc reset the primary-
> >next_oc
>    to 0 and return the primary channel;
> 3) Aim for the primary->next_oc subchannel, increment primary->next_oc;
> 4) Loop through all opened subchannels. If we see a channel which has
>  target_cpu == current_cpu return it. If we reached the primary->next_oc'th
> open subchannel return it;
> 5) Return the primary channel.
> The implementation also skips the subchannel No. 0 unless it matches the
> current
> cpu as we assign i to 1 in the initialization.
> 
> This is not a fair round robin as subchannels in the beginning of the list are
> more likely to be returned and checking for current cpu aslo creates

I suppose the current algorithm is trying to make use of cache locality?
KY may share more information.


> additional
> complexity. Simplify the vmbus_get_outgoing_channel() function, make it
> do what the comment before it says.

Hi Vitaly,
It looks your algorithm also has an issue:
Assuming primary->num_sc == 3 (SC1, SC2, SC3)
1st time: we choose SC1 and primary->next_oc is set to 1.
2nd time: we choose SC2 and primary->next_oc is set to 2.
3rd time: we choose SC3 and primary->next_oc is set to 3.
4th time and later:  since i varies among 1~3 and can't be bigger than 3,
we always choose the primary channel.


BTW, IMO it's not easy to achieve complete fairness because 
vmbus_get_outgoing_channel() can run simultaneously on different
CPUs (so primary->next_oc can be modified at the same time by multiple
CPUs), and we believe it should be lockless.
Maybe atomic_t can help(?)

Thanks,
-- Dexuan

> 
> Signed-off-by: Vitaly Kuznetsov <vkuznets@...hat.com>
> ---
>  drivers/hv/channel_mgmt.c | 27 +++++++++------------------
>  1 file changed, 9 insertions(+), 18 deletions(-)
> 
> diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c
> index daa6417..df82442 100644
> --- a/drivers/hv/channel_mgmt.c
> +++ b/drivers/hv/channel_mgmt.c
> @@ -827,39 +827,30 @@ cleanup:
>  struct vmbus_channel *vmbus_get_outgoing_channel(struct
> vmbus_channel *primary)
>  {
>  	struct list_head *cur, *tmp;
> -	int cur_cpu;
>  	struct vmbus_channel *cur_channel;
> -	struct vmbus_channel *outgoing_channel = primary;
> -	int next_channel;
> -	int i = 1;
> +	int i = 0;
> 
>  	if (list_empty(&primary->sc_list))
> -		return outgoing_channel;
> +		return primary;
> 
> -	next_channel = primary->next_oc++;
> -
> -	if (next_channel > (primary->num_sc)) {
> +	if (primary->next_oc > primary->num_sc) {
>  		primary->next_oc = 0;
> -		return outgoing_channel;
> +		return primary;
>  	}
> 
> -	cur_cpu = hv_context.vp_index[get_cpu()];
> -	put_cpu();
>  	list_for_each_safe(cur, tmp, &primary->sc_list) {
> +		i++;
>  		cur_channel = list_entry(cur, struct vmbus_channel, sc_list);
>  		if (cur_channel->state != CHANNEL_OPENED_STATE)
>  			continue;
> 
> -		if (cur_channel->target_vp == cur_cpu)
> -			return cur_channel;
> -
> -		if (i == next_channel)
> +		if (i > primary->next_oc) {
> +			primary->next_oc = i;
>  			return cur_channel;
> -
> -		i++;
> +		}
>  	}
> 
> -	return outgoing_channel;
> +	return primary;
>  }
>  EXPORT_SYMBOL_GPL(vmbus_get_outgoing_channel);
> 
> --
> 1.9.3

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ