Message-ID: <CA+5PVA5ApepOTJGJR0zXF5C_p7cFjShr3_shynAHX_M=ya5o7Q@mail.gmail.com>
Date: Thu, 28 Feb 2013 13:59:25 -0500
From: Josh Boyer <jwboyer@...il.com>
To: Alex Deucher <alexdeucher@...il.com>
Cc: Dave Airlie <airlied@...ux.ie>,
Alex Deucher <alexander.deucher@....com>,
Jerome Glisse <jglisse@...hat.com>,
torvalds@...ux-foundation.org, linux-kernel@...r.kernel.org,
DRI mailing list <dri-devel@...ts.freedesktop.org>
Subject: Re: [git pull] drm merge for 3.9-rc1
On Thu, Feb 28, 2013 at 10:15 AM, Josh Boyer <jwboyer@...il.com> wrote:
> On Thu, Feb 28, 2013 at 10:09 AM, Alex Deucher <alexdeucher@...il.com> wrote:
>> On Thu, Feb 28, 2013 at 8:44 AM, Josh Boyer <jwboyer@...il.com> wrote:
>>> On Thu, Feb 28, 2013 at 8:38 AM, Alex Deucher <alexdeucher@...il.com> wrote:
>>>>>>>> ca57802e521de54341efc8a56f70571f79ffac72 is the first bad commit
>>>>>>>
>>>>>>> So I don't think that's actually the cause of the problem. Or at least
>>>>>>> not that alone. I reverted it on top of Linus' latest tree and I still
>>>>>>> get the lockups.
>>>>>>
>>>>>> Actually, git bisect does seem to have gotten it correct. Once I
>>>>>> actually tested the revert of just that on top of Linus' tree (commit
>>>>>> d895cb1af1), things seem to be working much better. I've rebooted a
>>>>>> dozen times without a lockup. The most I've seen it take on a kernel
>>>>>> with that commit included is 3 reboots, so that's definitely at least an
>>>>>> improvement.
>>>>>
>>>>> I give up. GPU issues are not my thing. 2 reboots after I sent that it
>>>>> gave me pretty rainbow static again. So it might have been an
>>>>> improvement, but reverting it is not a solution.
>>>>>
>>>>> Looking at the rest of the commits, the whole GPU rework might be
>>>>> suspect, but I clearly have no clue.
>>>>
>>>> GPUs are tricky beasts :)
>>>
>>> Understatement ;).
>>>
>>>> ca57802e521de54341efc8a56f70571f79ffac72 most likely wasn't the
>>>> problem anyway since it only affects 6xx/7xx and your card is handled
>>>> by the evergreen code. I'll put together some patches to help narrow
>>>> down the problem.
>>>
>>> Yeah, that's the biggest problem I have, not knowing which functions are
>>> actually being executed for this card. It looks like a combination of
>>> stuff in evergreen.c and ni.c, but I have no idea.
>>>
>>> Patches would be great. If nothing else, I'm really good at building
>>> kernels and rebooting by now.
>>
>> Two possible fixes attached. The first attempts a full reset of all
>> blocks if the MC (memory controller) is hung. That may work better
>> than just resetting the MC. The second just disables MC reset. I'm
>> not sure we can reliably tell if it's busy due to display requests
>> hitting the MC periodically, which would lead to needlessly resetting
>> it, possibly leading to failures like you are seeing.
>
> OK. I'll test them individually. It will probably take a bit because
> I'll want to do numerous reboots if things seem "fixed" with one or the
> other.
>
> I'll let you know how things go.
I applied each individually on top of Linus' tree as of this morning
(commit 2a7d2b96d5), then built, installed, and tested.

0001-drm-radeon-XXX-try-a-full-reset-if-the-MC-is-busy.patch failed in
two reboots.

0001-drm-radeon-XXX-skip-MC-reset-as-it-s-probably-not-hu.patch has gone
21 reboots without a hang/rainbow static. You'll understand if I'm
hesitant to declare success, but resetting the MC does indeed appear to
be the issue. I'll keep rebooting for a while to make sure.

josh