Message-ID: <EA929A9653AAE14F841771FB1DE5A1365FFE1B19CD@rrsmsx501.amr.corp.intel.com>
Date: Fri, 16 Jul 2010 17:23:55 -0600
From: "Tantilov, Emil S" <emil.s.tantilov@...el.com>
To: Maxim Levitsky <maximlevitsky@...il.com>
CC: "Kirsher, Jeffrey T" <jeffrey.t.kirsher@...el.com>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
"Allan, Bruce W" <bruce.w.allan@...el.com>,
"Pieper, Jeffrey E" <jeffrey.e.pieper@...el.com>
Subject: RE: [REGRESSION] e1000e stopped working [MANUALLY BISECTED]
Maxim Levitsky wrote:
> On Thu, 2010-07-15 at 22:09 +0300, Maxim Levitsky wrote:
>> On Thu, 2010-07-15 at 13:02 -0600, Tantilov, Emil S wrote:
>>> Maxim Levitsky wrote:
>>>> On Thu, 2010-07-15 at 02:33 +0300, Maxim Levitsky wrote:
>>>>> On Wed, 2010-07-14 at 16:56 -0600, Tantilov, Emil S wrote:
>>>>>> Maxim Levitsky wrote:
>>>>>>> On Mon, 2010-07-12 at 15:23 -0600, Tantilov, Emil S wrote:
>>>>>>>> Maxim Levitsky wrote:
>>>>>>>>> On Mon, 2010-07-05 at 12:58 +0300, Maxim Levitsky wrote:
>>>>>>>>>> On Mon, 2010-07-05 at 01:13 -0700, Jeff Kirsher wrote:
>>>>>>>>>>> On Sun, Jul 4, 2010 at 15:48, Maxim Levitsky
>>>>>>>>>>> <maximlevitsky@...il.com> wrote:
>>>>>>>>>>>> Did few guesses, and now I see that reverting the below
>>>>>>>>>>>> commit fixes the problem.
>>>>>>>>>>>>
>>>>>>>>>>>> "e1000e: Fix/cleanup PHY reset code for ICHx/PCHx"
>>>>>>>>>>>> e98cac447cc1cc418dff1d610a5c79c4f2bdec7f.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Best regards,
>>>>>>>>>>>> Maxim Levitsky
>>>>>>>>>>>>
>>>>>>>>>>>> --
>>>>>>>>>>>
>>>>>>>>>>> Can you give us till Tuesday to respond? I know that there
>>>>>>>>>>> are some additional e1000e patches in my queue, which may
>>>>>>>>>>> resolve the issue, but this weekend the power is down to do
>>>>>>>>>>> some infrastructure upgrades which prevents us from doing
>>>>>>>>>>> any investigation/debugging until Tuesday.
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Sure.
>>>>>>>>>>
>>>>>>>>>> Best regards,
>>>>>>>>>> Maxim Levitsky
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Updates?
>>>>>>>>
>>>>>>>> We are working on reproducing the issue. So far we have not
>>>>>>>> seen the problem when testing with net-next.
>>>>>>>>
>>>>>>>> I asked in previous email about some additional info from
>>>>>>>> ethtool (-d, -e, -S) and kernel config. That would help us to
>>>>>>>> narrow it down.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Emil
>>>>>>> I did send -e and -d output.
>>>>>>
>>>>>> Sorry, looks like I lost the email with the attachments.
>>>>>>
>>>>>> Could you provide the output of dmesg after the failure occurs?
>>>>>>
>>>>>>> Since you probably want -S output during failure, I need to
>>>>>>> recompile kernel for that. I will do that soon.
>>>>>>>
>>>>>>>
>>>>>>> One question, in two weeks I hope 2.6.35 won't be released?
>>>>>>> If so, I will have enough free time then to narrow down this
>>>>>>> issue.
>>>>>>>
>>>>>>> Other solution, is to revert this commit.
>>>>>>> (I have never seen this problem with it reverted).
>>>>>>
>>>>>> We have been running reboot tests on 2 separate systems with
>>>>>> recent net-next kernels using your config and so far no luck in
>>>>>> reproducing this issue.
>>>>>>
>>>>>> What is the make/model of your system (or MB)?
>>>>>
>>>>> the motherboard is Intel DG965RY.
>>>>>
>>>>> However, I am using vanilla kernel.
>>>>> net-next might contain further fixes.
>>>>>
>>>>> I will see if net-next works here.
>>>>
>>>> Yep, net-next works here.
>>>>
>>>>
>>>> I have the problem on vanilla kernel.
>>>> Last revision of it, I tested is 2.6.35-rc4 exactly
>>>> (815c4163b6c8ebf8152f42b0a5fd015cfdcedc78)
>>>>
>>>>
>>>> Maybe vanilla git master works, I will test it soon too.
>>>
>>> Thanks for the information! Good to know that this issue does not
>>> exist in the latest branch.
>>>
>>> Have you by any chance tested a stable branch (2.6.34.x)?
>>
>> I only did test plain 2.6.34 (v2.6.34)
> And forgot to add, that it did work.
>
>>
>> Also I repeat that revert of e98cac447cc1cc418dff1d610a5c79c4f2bdec7f
>> (e1000e: Fix/cleanup PHY reset code for ICHx/PCHx) fixes the bug on
>> vanilla kernel.
>>
>> Also I just pulled latest vanilla git, and according to diffstat I
>> see no changes in e1000e, so it's likely that the bug remains there.
>> I will test that soon.
> Tested, broken as expected.
That makes sense. Unfortunately, we are still unable to reproduce the problem, even on a recent pull from Linus' tree.
If you want, you can look at the e1000e patches in net-next and start applying them to your tree until the issue is resolved.
I will keep trying here, but none of our systems exhibit the issue you described, so the bug may be exposed by something specific to your system or config.
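As a rough illustration of the patch-by-patch approach (a sketch only, not something tested here), the helpers below build the git commands to list e1000e commits that are in net-next but not in the local tree, then cherry-pick them in order until the problem goes away. The remote/branch name `net-next/master`, the driver path, and all function names are assumptions made for the example. The `still_broken` check stands in for a manual rebuild-and-reboot test.

```python
import subprocess

# Assumed driver location in 2.6.35-era trees; adjust for your checkout.
E1000E_PATH = "drivers/net/e1000e"


def list_candidate_commits_cmd(base="HEAD", target="net-next/master",
                               path=E1000E_PATH):
    """Build the git command that lists commits reachable from `target`
    but not from `base` that touch `path`, oldest first."""
    return ["git", "rev-list", "--reverse", "--no-merges",
            f"{base}..{target}", "--", path]


def parse_commits(output):
    """Turn `git rev-list` output into a list of commit ids."""
    return [line.strip() for line in output.splitlines() if line.strip()]


def cherry_pick_cmd(commit):
    """Build the git command that applies one commit to the current tree."""
    return ["git", "cherry-pick", "-x", commit]


def apply_one_by_one(commits, still_broken, run=subprocess.check_call):
    """Cherry-pick commits in order and stop at the first one after which
    `still_broken()` returns False. Returns that commit id, or None if the
    problem persists after every commit has been applied."""
    for commit in commits:
        run(cherry_pick_cmd(commit))
        if not still_broken():
            return commit
    return None


# Hypothetical usage inside a kernel git checkout with a net-next remote:
#   out = subprocess.check_output(list_candidate_commits_cmd(), text=True)
#   fix = apply_one_by_one(parse_commits(out),
#                          lambda: input("still broken? [y/n] ") == "y")
```

Because each check needs a rebuild and reboot, a real run would span several sessions. The point of the sketch is only the ordering: oldest commit first, one at a time, stopping at the first commit after which the NIC works.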
Thanks,
Emil