Message-ID: <z2w86802c441004061445q5a05832fmaece2723436f803f@mail.gmail.com>
Date: Tue, 6 Apr 2010 14:45:51 -0700
From: Yinghai Lu <yinghai@...nel.org>
To: Vivek Goyal <vgoyal@...hat.com>
Cc: Joerg Roedel <joerg.roedel@....com>,
Chris Wright <chrisw@...s-sol.org>,
Joerg Roedel <joro@...tes.org>,
"Eric W. Biederman" <ebiederm@...ssion.com>,
Bernhard Walle <bernhard@...lle.de>, nhorman@...hat.com,
nhorman@...driver.com, kexec@...ts.infradead.org,
linux-kernel@...r.kernel.org, hbabu@...ibm.com,
iommu@...ts.linux-foundation.org
Subject: Re: [PATCH 3/4] Revert "x86: disable IOMMUs on kernel crash"

On Tue, Apr 6, 2010 at 2:13 PM, Vivek Goyal <vgoyal@...hat.com> wrote:
> On Tue, Apr 06, 2010 at 04:39:56PM -0400, Vivek Goyal wrote:
>> On Tue, Apr 06, 2010 at 07:51:06PM +0200, Joerg Roedel wrote:
>> > On Tue, Apr 06, 2010 at 10:42:57AM -0700, Chris Wright wrote:
>> > > * Joerg Roedel (joro@...tes.org) wrote:
>> > > > On Sun, Apr 04, 2010 at 02:44:36AM -0700, Eric W. Biederman wrote:
>> > > > > Joerg Roedel <joro@...tes.org> writes:
>> > > > >
>> > > > > > On Sun, Apr 04, 2010 at 09:24:30AM +0200, Bernhard Walle wrote:
>> > > > > >> Am 03.04.10 19:49, schrieb Eric W. Biederman:
>> > > > > >> > Not a problem. We require a lot of things of the kdump kernel,
>> > > > > >> > and it is immediately apparent in a basic sanity test.
>> > > > > >>
>> > > > > >> Also, in most cases (for example: distribution kernels), the kdump
>> > > > > >> kernel nowadays is identical to the running kernel. So, if the running
>> > > > > >> kernel has IOMMU support, the kdump kernel also has.
>> > > > > >
>> > > > > > Yes, I know. But is that a requirement for kexec?
>> > > > >
>> > > > > For normal kexec no. That path is expected to do a clean hardware
>> > > > > shutdown.
>> > > > >
>> > > > > For kexec on panic aka kdump the requirement is that your crash
>> > > > > kernel be able to initialize your hardware from any state it can be
>> > > > > put in.
>> > > >
>> > > > Ok, if you show me where this is documented for everybody then I am
>> > > > probably convinced :-)
>> > > > We should fixup the gart initialization anyway.
>> > >
>> > > So, you planning to pull in all 4 patches then?
>> >
>> > Yes, I will apply them tomorrow and write a fix for the GART issue this
>> > may introduce.
>> >
>>
>> Hi Joerg,
>>
>> Going through the old mail thread, I think the commit you pointed to was
>> primarily introduced to solve kexec + GART issue and not necessarily kdump
>> issue.
>>
>> In fact disabling IOMMU patch was introduced by you.
>>
>> Author: Joerg Roedel <joerg.roedel@....com>
>> Date: Tue Jun 9 17:56:09 2009 +0200
>>
>> x86: disable IOMMUs on kernel crash
>>
>> If the IOMMUs are still enabled when the kexec kernel boots access to
>> the disk is not possible. This is bad for tools like kdump or anything
>> else which wants to use PCI devices.
>>
>> Signed-off-by: Joerg Roedel <joerg.roedel@....com>
>>
>> I am assuming you introduced this patch because you faced issues with
>> amd-iommu and not GART.
>>
>> So basically GART should have been working with kdump even before you
>> introduced disabling iommu patch in kdump path.
>
> Looking at following commit, we were still not shutting down GART and
> fixing issues like second kernel accessing the GART aperture set by first
> kernel.
>
> commit aaf230424204864e2833dcc1da23e2cb0b9f39cd
> Author: Yinghai Lu <Yinghai.Lu@....COM>
> Date: Wed Jan 30 13:33:09 2008 +0100
>
> x86: disable the GART early, 64-bit
>
> For K8 system: 4G RAM with memory hole remapping enabled, or more than
> 4G RAM installed.
>
> So I guess it should be fine to not shutdown GART in crashing kernel and
> then look at the fresh issues which crop up and figure out how to fix
> those.

Not sure if it is related:
for the crashing-kernel case, early_memtest could be used to check
whether some devices are still doing DMA operations.

When I use kexec to start a second kernel with early_memtest enabled
in that second kernel, it finds some RAM pages that are BAD, marks
them, and does not use them. memtest=1 should be good enough.

A fresh restart does not report any BAD RAM on the same system.
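
To illustrate why a pattern test can expose leftover DMA, here is a
minimal userspace sketch. It is not the kernel's memtest code; the
function names and the simulated stray write are assumptions made up
for illustration only. The idea is the same: write a known pattern,
read it back, and treat any word that changed in between as suspect.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: fill a region with a known test pattern. */
void fill_pattern(uint64_t *buf, size_t words, uint64_t pattern)
{
    assert(buf != NULL);
    for (size_t i = 0; i < words; i++)
        buf[i] = pattern;
}

/*
 * Hypothetical stand-in for a device that is still doing DMA:
 * overwrite `len` words starting at `offset` with unrelated data.
 */
void simulate_stray_dma(uint64_t *buf, size_t offset, size_t len)
{
    assert(buf != NULL);
    for (size_t i = 0; i < len; i++)
        buf[offset + i] = 0xdeadbeef00000000ULL | (uint64_t)i;
}

/* Re-read the region and count words that no longer hold the pattern. */
size_t count_mismatches(const uint64_t *buf, size_t words, uint64_t pattern)
{
    size_t bad = 0;

    assert(buf != NULL);
    for (size_t i = 0; i < words; i++)
        if (buf[i] != pattern)
            bad++;
    return bad;
}
```

If nothing writes between fill_pattern() and count_mismatches(), the
count is zero; a call to simulate_stray_dma() in between makes the
overwritten words show up as mismatches, which is what a test pass
would see if a device kept writing into RAM after the kexec.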
YH