Message-ID: <53E27F94.7000903@citrix.com>
Date:	Wed, 6 Aug 2014 20:18:44 +0100
From:	Zoltan Kiss <zoltan.kiss@...rix.com>
To:	Wei Liu <wei.liu2@...rix.com>
CC:	Ian Campbell <Ian.Campbell@...rix.com>,
	David Vrabel <david.vrabel@...rix.com>,
	<netdev@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
	<xen-devel@...ts.xenproject.org>
Subject: Re: [PATCH net-next 2/2] xen-netback: Turn off the carrier if the
 guest is not able to receive

On 05/08/14 13:45, Wei Liu wrote:
> On Mon, Aug 04, 2014 at 04:20:58PM +0100, Zoltan Kiss wrote:
> [...]
>>   struct xenvif {
>> diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-netback/interface.c
>> index fbdadb3..48a55cd 100644
>> --- a/drivers/net/xen-netback/interface.c
>> +++ b/drivers/net/xen-netback/interface.c
>> @@ -97,7 +101,16 @@ int xenvif_poll(struct napi_struct *napi, int budget)
>>   static irqreturn_t xenvif_rx_interrupt(int irq, void *dev_id)
>>   {
>>   	struct xenvif_queue *queue = dev_id;
>> +	struct netdev_queue *net_queue =
>> +		netdev_get_tx_queue(queue->vif->dev, queue->id);
>>
>> +	/* QUEUE_STATUS_RX_PURGE_EVENT is only set if either QDisc was off OR
>> +	 * the carrier went down and this queue was previously blocked
>> +	 */
>
> Could you change "blocked" to "stalled" so that the comment matches the
> code closely?
Ok

>> @@ -1935,6 +1934,75 @@ static void xenvif_start_queue(struct xenvif_queue *queue)
>>   		xenvif_wake_queue(queue);
>>   }
>>
>> +/* Only called from the queue's thread, it handles the situation when the guest
>> + * doesn't post enough requests on the receiving ring.
>> + * First xenvif_start_xmit disables QDisc and start a timer, and then either the
>> + * timer fires, or the guest send an interrupt after posting new request. If it
>> + * is the timer, the carrier is turned off here.
>> + * */
>
> Please remove that extra "*".
Ok

>> +static void xenvif_rx_purge_event(struct xenvif_queue *queue)
>> +{
>> +	/* Either the last unsuccesful skb or at least 1 slot should fit */
>> +	int needed = queue->rx_last_skb_slots ?
>> +		     queue->rx_last_skb_slots : 1;
>> +
>> +	/* It is assumed that if the guest post new slots after this, the RX
>> +	 * interrupt will set the QUEUE_STATUS_RX_PURGE_EVENT bit and wake up
>> +	 * the thread again
>> +	 */
>
> Basically in this state machine you have a tuple (RX_STALLED bit,
> PURGE_EVENT bit, carrier state). This whole state transaction is very
> scary, any chance you can draw a graph like the xenbus state machine in
> xenbus.c?
>
> I fear that after three month noone can easily understand this code
> unless he / she spends half a day reading the code. And without defining
> what state is legal it's very hard to tell what behavior is expected and
> what is not.
Ok

>
>> +	set_bit(QUEUE_STATUS_RX_STALLED, &queue->status);
>> +	if (!xenvif_rx_ring_slots_available(queue, needed)) {
>> +		rtnl_lock();
>> +		if (netif_carrier_ok(queue->vif->dev)) {
>> +			/* Timer fired and there are still no slots. Turn off
>> +			 * everything except the interrupts
>> +			 */
>> +			netif_carrier_off(queue->vif->dev);
>> +			skb_queue_purge(&queue->rx_queue);
>> +			queue->rx_last_skb_slots = 0;
>> +			if (net_ratelimit())
>> +				netdev_err(queue->vif->dev, "Carrier off due to lack of guest response on queue %d\n", queue->id);
>
> Line too long.
Ok


>> @@ -1944,8 +2012,12 @@ int xenvif_kthread_guest_rx(void *data)
>>   		wait_event_interruptible(queue->wq,
>>   					 rx_work_todo(queue) ||
>>   					 queue->vif->disabled ||
>> +					 test_bit(QUEUE_STATUS_RX_PURGE_EVENT, &queue->status) ||
>
> Line too long.
Ok

>
>>   					 kthread_should_stop());
>>
>> +		if (kthread_should_stop())
>> +			break;
>> +
>>   		/* This frontend is found to be rogue, disable it in
>>   		 * kthread context. Currently this is only set when
>>   		 * netback finds out frontend sends malformed packet,
>> @@ -1955,24 +2027,21 @@ int xenvif_kthread_guest_rx(void *data)
>>   		 */
>>   		if (unlikely(queue->vif->disabled && queue->id == 0))
>>   			xenvif_carrier_off(queue->vif);
>
> I think you also need to check vif->disabled flag in your following code
> so that you don't accidently re-enable a rogue vif in a queue whose id
> != 0.
Yes.
>
> Further more "disabled" can be transformed to a bit in vif->status.
> You can incorporate such change in your previous patch or a separate
> prerequisite patch.
Yes, I've already done that on my non-multiqueue branch.
>
>> -
>> -		if (kthread_should_stop())
>> -			break;
>> -
>> -		if (queue->rx_queue_purge) {
>> +		else if (unlikely(test_and_clear_bit(QUEUE_STATUS_RX_PURGE_EVENT,
>> +						     &queue->status))) {
>> +			xenvif_rx_purge_event(queue);
>> +		} else if (!netif_carrier_ok(queue->vif->dev)) {
>> +			/* Another queue stalled and turned the carrier off, so
>> +			 * purge the internal queue of queues which were not
>> +			 * blocked
>> +			 */
>
> "blocked" -> "stalled"?
Ok
>
> In theory even one queue stalls all other queues can still make
> progress, isn't it?
This patch makes sure that if a queue is stalled, none of the others can 
transmit, even if they would be able to do so. It is documented at the 
definition of QUEUE_STATUS_RX_STALLED.
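
To illustrate that policy, here is a rough toy model in plain C. It is not the driver code: the toy_* names and NUM_QUEUES are made up for this sketch, and it only covers the stall path described above (one queue stalls, the carrier goes off for the whole vif, the other queues purge what they hold and stop transmitting). How the carrier comes back is left out on purpose.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NUM_QUEUES 4 /* hypothetical queue count, for the sketch only */

struct toy_queue {
	bool stalled;         /* models QUEUE_STATUS_RX_STALLED */
	unsigned int pending; /* models packets held on the internal rx_queue */
};

struct toy_vif {
	bool carrier_ok;      /* models what netif_carrier_ok() would report */
	struct toy_queue queues[NUM_QUEUES];
};

static void toy_vif_init(struct toy_vif *vif)
{
	vif->carrier_ok = true;
	for (size_t i = 0; i < NUM_QUEUES; i++) {
		vif->queues[i].stalled = false;
		vif->queues[i].pending = 0;
	}
}

/* The timer fired and the guest still has not posted slots on queue id:
 * mark the queue stalled, drop its packets, turn the shared carrier off.
 */
static void toy_queue_stall(struct toy_vif *vif, size_t id)
{
	assert(id < NUM_QUEUES);
	vif->queues[id].stalled = true;
	vif->queues[id].pending = 0;
	vif->carrier_ok = false;
}

/* What the thread of a queue that did not stall does once it notices
 * that another queue turned the carrier off: purge its internal queue.
 */
static void toy_queue_poll(struct toy_vif *vif, size_t id)
{
	assert(id < NUM_QUEUES);
	if (!vif->carrier_ok && !vif->queues[id].stalled)
		vif->queues[id].pending = 0;
}

/* The carrier is per device, so one stalled queue blocks every queue. */
static bool toy_queue_can_transmit(const struct toy_vif *vif, size_t id)
{
	assert(id < NUM_QUEUES);
	return vif->carrier_ok;
}
```

The point of the model is that the decision to transmit depends only on the shared carrier flag, not on the state of the individual queue being asked.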

>
> Wei.
>
