[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <4A4D4BDC.8010304@gmail.com>
Date: Fri, 03 Jul 2009 02:07:56 +0200
From: Eric Dumazet <eric.dumazet@...il.com>
To: Ayaz Abdulla <aabdulla@...dia.com>
CC: David Miller <davem@...emloft.net>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>
Subject: Re: [GIT]: Networking
Ayaz Abdulla a écrit :
>
>
> Eric Dumazet wrote:
>> Ayaz Abdulla a écrit :
>>
>>>
>>> Eric Dumazet wrote:
>>>
>>>> Eric Dumazet a écrit :
>>>>
>>>>
>>>>> Ingo Molnar a écrit :
>>>>>
>>>>>
>>>>>>> The following changes since commit
>>>>>>> 52989765629e7d182b4f146050ebba0abf2cb0b7:
>>>>>>> Linus Torvalds (1):
>>>>>>> Merge git://git.kernel.org/.../davem/net-2.6
>>>>>>>
>>>>>>> are available in the git repository at:
>>>>>>>
>>>>>>> master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6.git master
>>>>>>
>>>>>> Hm, something in this lot quickly wrecked networking here - see the
>>>>>> tx timeout dump below. It starts with:
>>>>>>
>>>>>> [ 351.004596] WARNING: at net/sched/sch_generic.c:246
>>>>>> dev_watchdog+0x10b/0x19c()
>>>>>> [ 351.011815] Hardware name: System Product Name
>>>>>> [ 351.016220] NETDEV WATCHDOG: eth0 (forcedeth): transmit queue 0
>>>>>> timed out
>>>>>>
>>>>>> Config attached. Unfortunately i've got no time to do bisection
>>>>>> today.
>>>>>
>>>>>
>>>>>
>>>>> forcedeth might have a problem, in its netif_wake_queue() logic, but
>>>>> I could not see why a recent patch could make this problem visible
>>>>> now.
>>>>>
>>>>> CPU0/1: AMD Athlon(tm) 64 X2 Dual Core Processor 3800+ stepping 02
>>>>> is not a new cpu either :)
>>>>>
>>>>> forcedeth uses an internal tx_stop without appropriate barrier.
>>>>>
>>>>> Could you try following patch ?
>>>>>
>>>>> (random guess as I dont have much time right now)
>>>>
>>>>
>>>> Oh well this patch was soooo stupid, sorry Ingo.
>>>>
>>>>
>>>> We might have a race in napi_schedule(), leaving interrupts disabled
>>>> forever.
>>>> I cannot test this patch, I dont have the hardware...
>>>>
>>>> Thanks
>>>>
>>>> diff --git a/drivers/net/forcedeth.c b/drivers/net/forcedeth.c
>>>> index 1094d29..3b4e076 100644
>>>> --- a/drivers/net/forcedeth.c
>>>> +++ b/drivers/net/forcedeth.c
>>>> @@ -3514,11 +3514,13 @@ static irqreturn_t nv_nic_irq(int foo, void
>>>> *data)
>>>> nv_msi_workaround(np);
>>>>
>>>> #ifdef CONFIG_FORCEDETH_NAPI
>>>> - napi_schedule(&np->napi);
>>>> -
>>>> - /* Disable furthur irq's
>>>> - (msix not enabled with napi) */
>>>> - writel(0, base + NvRegIrqMask);
>>>> + if (napi_schedule_prep(&np->napi)) {
>>>> + /*
>>>> + * Disable further irq's (msix not enabled with napi)
>>>> + */
>>>> + writel(0, base + NvRegIrqMask);
>>>> + __napi_schedule(&np->napi);
>>>> + }
>>>
>>> Yes, good catch. There is a race condition here with napi poll.
>>>
>>> I would prefer to do the following to keep the code simple and clean.
>>>
>>> writel(0, base + NvRegIrqMask);
>>> napi_schedule(&np->napi);
>>
>>
>>
>> CC trimmed down to network devs only :)
>>
>> It would be racy too ...
>>
>> check drivers/net/amd8111e.c, drivers/net/natsemi.c ...
>>
>> If this cpu inconditionaly calls writel(0, base + NvRegIrqMask); while
>> another cpu just called writel(np->irqmask, base + NvRegIrqMask),
>> we end with disabled interrupts ?
> <BCC linux-kernel>
>
> Yes, but the next instruction is to call napi_schedule() which will
> re-enable interrupts in napi function.
Sorry, I dont get it. You are referring to softirqs, while I speak
of hardware (NIC) interrupts, that can be masked for ever.
With your patch :
CPU0 CPU1
napi_complete();
/* please note there is no smp_wmb()
in __napi_complete() after
clear_bit(NAPI_STATE_SCHED, &n->state);
(It's implied on x86 because of lock prefix)
*/
writel(np->irqmask, base + NvRegIrqMask)
writel(0, base + NvRegIrqMask);
napi_schedule(&np->napi); // might do nothing because this cpu see
// NAPI_STATE_SCHED set to one
NIC cannot deliver further interrupts.
With my patch :
CPU0 CPU1
napi_complete();
/* please note there is no smp_wmb()
in __napi_complete() after
clear_bit(NAPI_STATE_SCHED, &n->state);
*/
writel(np->irqmask, base + NvRegIrqMask)
if (napi_schedule_prep(&np->napi)) // if false, NIC can deliver further interrupts.
// if true, we mask interrupts but re-enable napi. packets will be processed,
// interrupt will be re-enabled later.
My patch has a guarantee that we mask NIC interrupts *only* if napi is re-scheduled.
It probably makes no difference on x86, but it might be better to have same logic in all drivers...
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Powered by blists - more mailing lists