lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Fri, 24 Apr 2015 10:59:16 +0200
From:	Vitaly Kuznetsov <vkuznets@...hat.com>
To:	Dexuan Cui <decui@...rosoft.com>
Cc:	KY Srinivasan <kys@...rosoft.com>,
	Haiyang Zhang <haiyangz@...rosoft.com>,
	"devel\@linuxdriverproject.org" <devel@...uxdriverproject.org>,
	"linux-kernel\@vger.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 6/6] Drivers: hv: vmbus: do a fair round robin when selecting an outgoing channel

Dexuan Cui <decui@...rosoft.com> writes:

>> -----Original Message-----
>> From: Vitaly Kuznetsov [mailto:vkuznets@...hat.com]
>> Sent: Tuesday, April 21, 2015 22:28
>> To: KY Srinivasan
>> Cc: Haiyang Zhang; devel@...uxdriverproject.org; linux-
>> kernel@...r.kernel.org; Dexuan Cui
>> Subject: [PATCH 6/6] Drivers: hv: vmbus: do a fair round robin when
>> selecting an outgoing channel
>> 
>> vmbus_get_outgoing_channel() implements the following algorithm for
>> selecting
>> an outgoing channel (despite the comment before the function saying it
>> distributes the load equally):
>
> Yeah, I also found the issue. 
>
>> 1) If we have no subchannels return the primary channel;
>> 2) If primary->next_oc is grater than primary->num_sc reset the primary-
>> >next_oc
>>    to 0 and return the primary channel;
>> 3) Aim for the primary->next_oc subchannel, increment primary->next_oc;
>> 4) Loop through all opened subchannels. If we see a channel which has
>>  target_cpu == current_cpu return it. If we reached the primary->next_oc'th
>> open subchannel return it;
>> 5) Return the primary channel.
>> The implementation also skips the subchannel No. 0 unless it matches the
>> current
>> cpu as we assign i to 1 in the initialization.
>> 
>> This is not a fair round robin as subchannels in the beginning of the list are
>> more likely to be returned and checking for current cpu aslo creates
>
> I suppose the current algorithm is trying to make use of cache locality?
> KY may share more information.
>
>> additional
>> complexity. Simplify the vmbus_get_outgoing_channel() function, make it
>> do what the comment before it says.
>
> Hi Vitaly,
> It looks your algorithm also has an issue:
> Assuming primary->num_sc == 3 (SC1, SC2, SC3)
> 1st time: we choose SC1 and primary->next_oc is set to 1.
> 2nd time: we choose SC2 and primary->next_oc is set to 2.
> 3rd time: we choose SC3 and primary->next_oc is set to 3.
> 4th time and later:  since i varies among 1~3 and can't be bigger than 3,
> we always choose the primary channel.

You're right, it seems 'if (primary->next_oc > primary->num_sc)' is
off-by-one, it should be >=

>
> BTW, IMO it's not easy to achieve complete fairness because 
> vmbus_get_outgoing_channel() can run simultaneously on different
> CPUs (so primary->next_oc can be modified at the same time by multiple
> CPUs), and we believe it should be lockless.
> Maybe atomic_t can help(?)

This thing bothered me a bit but then I realized that we don't actually
care - doing a mistake once is better than suffering from the slowness
of locks/atomics. I'd suggest we keep it this way (with the fix
mentioned above).

Thanks!

>
> Thanks,
> -- Dexuan
>
>> 
>> Signed-off-by: Vitaly Kuznetsov <vkuznets@...hat.com>
>> ---
>>  drivers/hv/channel_mgmt.c | 27 +++++++++------------------
>>  1 file changed, 9 insertions(+), 18 deletions(-)
>> 
>> diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c
>> index daa6417..df82442 100644
>> --- a/drivers/hv/channel_mgmt.c
>> +++ b/drivers/hv/channel_mgmt.c
>> @@ -827,39 +827,30 @@ cleanup:
>>  struct vmbus_channel *vmbus_get_outgoing_channel(struct
>> vmbus_channel *primary)
>>  {
>>  	struct list_head *cur, *tmp;
>> -	int cur_cpu;
>>  	struct vmbus_channel *cur_channel;
>> -	struct vmbus_channel *outgoing_channel = primary;
>> -	int next_channel;
>> -	int i = 1;
>> +	int i = 0;
>> 
>>  	if (list_empty(&primary->sc_list))
>> -		return outgoing_channel;
>> +		return primary;
>> 
>> -	next_channel = primary->next_oc++;
>> -
>> -	if (next_channel > (primary->num_sc)) {
>> +	if (primary->next_oc > primary->num_sc) {
>>  		primary->next_oc = 0;
>> -		return outgoing_channel;
>> +		return primary;
>>  	}
>> 
>> -	cur_cpu = hv_context.vp_index[get_cpu()];
>> -	put_cpu();
>>  	list_for_each_safe(cur, tmp, &primary->sc_list) {
>> +		i++;
>>  		cur_channel = list_entry(cur, struct vmbus_channel, sc_list);
>>  		if (cur_channel->state != CHANNEL_OPENED_STATE)
>>  			continue;
>> 
>> -		if (cur_channel->target_vp == cur_cpu)
>> -			return cur_channel;
>> -
>> -		if (i == next_channel)
>> +		if (i > primary->next_oc) {
>> +			primary->next_oc = i;
>>  			return cur_channel;
>> -
>> -		i++;
>> +		}
>>  	}
>> 
>> -	return outgoing_channel;
>> +	return primary;
>>  }
>>  EXPORT_SYMBOL_GPL(vmbus_get_outgoing_channel);
>> 
>> --
>> 1.9.3

-- 
  Vitaly
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ