[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <53E27F94.7000903@citrix.com>
Date: Wed, 6 Aug 2014 20:18:44 +0100
From: Zoltan Kiss <zoltan.kiss@...rix.com>
To: Wei Liu <wei.liu2@...rix.com>
CC: Ian Campbell <Ian.Campbell@...rix.com>,
David Vrabel <david.vrabel@...rix.com>,
<netdev@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
<xen-devel@...ts.xenproject.org>
Subject: Re: [PATCH net-next 2/2] xen-netback: Turn off the carrier if the
guest is not able to receive
On 05/08/14 13:45, Wei Liu wrote:
> On Mon, Aug 04, 2014 at 04:20:58PM +0100, Zoltan Kiss wrote:
> [...]
>> struct xenvif {
>> diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-netback/interface.c
>> index fbdadb3..48a55cd 100644
>> --- a/drivers/net/xen-netback/interface.c
>> +++ b/drivers/net/xen-netback/interface.c
>> @@ -97,7 +101,16 @@ int xenvif_poll(struct napi_struct *napi, int budget)
>> static irqreturn_t xenvif_rx_interrupt(int irq, void *dev_id)
>> {
>> struct xenvif_queue *queue = dev_id;
>> + struct netdev_queue *net_queue =
>> + netdev_get_tx_queue(queue->vif->dev, queue->id);
>>
>> + /* QUEUE_STATUS_RX_PURGE_EVENT is only set if either QDisc was off OR
>> + * the carrier went down and this queue was previously blocked
>> + */
>
> Could you change "blocked" to "stalled" so that the comment matches the
> code closely?
Ok
>> @@ -1935,6 +1934,75 @@ static void xenvif_start_queue(struct xenvif_queue *queue)
>> xenvif_wake_queue(queue);
>> }
>>
>> +/* Only called from the queue's thread, it handles the situation when the guest
>> + * doesn't post enough requests on the receiving ring.
>> + * First xenvif_start_xmit disables QDisc and start a timer, and then either the
>> + * timer fires, or the guest send an interrupt after posting new request. If it
>> + * is the timer, the carrier is turned off here.
>> + * */
>
> Please remove that extra "*".
Ok
>> +static void xenvif_rx_purge_event(struct xenvif_queue *queue)
>> +{
>> + /* Either the last unsuccesful skb or at least 1 slot should fit */
>> + int needed = queue->rx_last_skb_slots ?
>> + queue->rx_last_skb_slots : 1;
>> +
>> + /* It is assumed that if the guest post new slots after this, the RX
>> + * interrupt will set the QUEUE_STATUS_RX_PURGE_EVENT bit and wake up
>> + * the thread again
>> + */
>
> Basically in this state machine you have a tuple (RX_STALLED bit,
> PURGE_EVENT bit, carrier state). This whole state transaction is very
> scary, any chance you can draw a graph like the xenbus state machine in
> xenbus.c?
>
> I fear that after three month noone can easily understand this code
> unless he / she spends half a day reading the code. And without defining
> what state is legal it's very hard to tell what behavior is expected and
> what is not.
Ok
>
>> + set_bit(QUEUE_STATUS_RX_STALLED, &queue->status);
>> + if (!xenvif_rx_ring_slots_available(queue, needed)) {
>> + rtnl_lock();
>> + if (netif_carrier_ok(queue->vif->dev)) {
>> + /* Timer fired and there are still no slots. Turn off
>> + * everything except the interrupts
>> + */
>> + netif_carrier_off(queue->vif->dev);
>> + skb_queue_purge(&queue->rx_queue);
>> + queue->rx_last_skb_slots = 0;
>> + if (net_ratelimit())
>> + netdev_err(queue->vif->dev, "Carrier off due to lack of guest response on queue %d\n", queue->id);
>
> Line too long.
Ok
>> @@ -1944,8 +2012,12 @@ int xenvif_kthread_guest_rx(void *data)
>> wait_event_interruptible(queue->wq,
>> rx_work_todo(queue) ||
>> queue->vif->disabled ||
>> + test_bit(QUEUE_STATUS_RX_PURGE_EVENT, &queue->status) ||
>
> Line too long.
Ok
>
>> kthread_should_stop());
>>
>> + if (kthread_should_stop())
>> + break;
>> +
>> /* This frontend is found to be rogue, disable it in
>> * kthread context. Currently this is only set when
>> * netback finds out frontend sends malformed packet,
>> @@ -1955,24 +2027,21 @@ int xenvif_kthread_guest_rx(void *data)
>> */
>> if (unlikely(queue->vif->disabled && queue->id == 0))
>> xenvif_carrier_off(queue->vif);
>
> I think you also need to check vif->disabled flag in your following code
> so that you don't accidently re-enable a rogue vif in a queue whose id
> != 0.
Yes.
>
> Further more "disabled" can be transformed to a bit in vif->status.
> You can incorporate such change in your previous patch or a separate
> prerequisite patch.
Yes, I've already done that on my non-multiqueue branch.
>
>> -
>> - if (kthread_should_stop())
>> - break;
>> -
>> - if (queue->rx_queue_purge) {
>> + else if (unlikely(test_and_clear_bit(QUEUE_STATUS_RX_PURGE_EVENT,
>> + &queue->status))) {
>> + xenvif_rx_purge_event(queue);
>> + } else if (!netif_carrier_ok(queue->vif->dev)) {
>> + /* Another queue stalled and turned the carrier off, so
>> + * purge the internal queue of queues which were not
>> + * blocked
>> + */
>
> "blocked" -> "stalled"?
Ok
>
> In theory even one queue stalls all other queues can still make
> progress, isn't it?
This patch makes sure that if a queue is stalled, none of the others can
transmit, even if they would be able to do so. It is documented at the
definition of QUEUE_STATUS_RX_STALLED.
>
> Wei.
>
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists