Date:	Thu, 24 Jun 2010 11:02:56 -0400
From:	Matt Turner <mattst88@...il.com>
To:	Michael Cree <mcree@...on.net.nz>
Cc:	Dave Airlie <airlied@...il.com>,
	FUJITA Tomonori <fujita.tomonori@....ntt.co.jp>,
	linux-kernel@...r.kernel.org, linux-alpha@...r.kernel.org,
	rth@...ddle.net, ink@...assic.park.msu.ru,
	jbarnes@...tuousgeek.org, linux-pci@...r.kernel.org,
	dri-devel@...ts.freedesktop.org, alexdeucher@...il.com,
	jglisse@...hat.com
Subject: Re: Problems with alpha/pci + radeon/ttm

On Thu, Jun 24, 2010 at 5:51 AM, Michael Cree <mcree@...on.net.nz> wrote:
> On 22/06/10 20:32, Dave Airlie wrote:
>>
>> On Tue, Jun 22, 2010 at 3:59 PM, FUJITA Tomonori
>> <fujita.tomonori@....ntt.co.jp>  wrote:
>>>
>>> On Mon, 21 Jun 2010 17:19:43 -0400
>>> Matt Turner<mattst88@...il.com>  wrote:
>>>
>>>> Michael Cree and I have been debugging FDO bug 26403 [1]. I tried
>>>> booting with `radeon.test=1` and found this, which I think is related:
>
> Note that my radeon card is PCI whereas I think Matt may be using an AGP
> card.

Actually, I'm using a plain Radeon 9100 PCI.

> My logs are very similar to Matt's except I don't see the following line:
>
>>>>> pci_map_single failed: could not allocate dma page tables
>
>
>>> This happens in the latest git, right?
>
> Indeed, testing 2.6.35-rc3 (plus a couple or so extra patches to fix
> unrelated compile errors).
>
>>> Is this a regression (what kernel version worked)?
>>>
>>> Seems that the IOMMU can't find 128 pages. It's likely due to:
>>>
>>> - out of the IOMMU space (possibly someone doesn't free the IOMMU
>>>  space).
>>>
>>> or
>>>
>>> - the mapping parameters (such as align) aren't appropriate so the
>>>  IOMMU can't find space.
>>
>> I don't think KMS drivers have ever worked on alpha, so it's not a
>> regression. They are working fine on x86 + powerpc, and sparc has been
>> run at least once.
>
> KMS on the console boot up has worked since about 2.6.32, but starting up
> the X server has always failed and, in my case, the system becomes unstable
> and eventually OOPs.
>
>> I suspect we are simply hitting the limits of the iommu, how big an
>> address space does it handle? since generally graphics drivers try to
>> bind a lot of things to the GART.
>
> No idea on the address space limit.  I applied the patch of Fujita that logs
> all IOMMU allocations, and also inserted some extra printks in the ttm
> kernel code so that I could see which routines failed and the error code
> returned.  Running the radeon test on boot exhibits the following:
>
> [  238.712768] [drm] Tested GTT->VRAM and VRAM->GTT copy for GTT offset 0x1a312000
> [  239.281127] [drm] Tested GTT->VRAM and VRAM->GTT copy for GTT offset 0x1a412000
> [  239.281127] ttm_tt_bind belched -12
> [  239.282104] ttm_bo_handle_move_mem belched -12
> [  239.282104] ttm_bo_move_buffer belched -12
> [  239.282104] ttm_bo_validate belched -12
> [  239.282104] radeon 0000:01:00.0: object_init failed for (1048576, 0x00000002) err=-12
> [  239.282104] [drm:radeon_test_moves] *ERROR* Failed to create GTT object 419
> [  239.399291] Error while testing BO move.
>
> Note that no IOMMU allocations are printed while radeon_test_moves is
> running, so iommu_arena_alloc doesn't appear to be called.  Also, the error
> code returned up to radeon_test_moves is -12, which is ENOMEM.  So there
> does appear to be some memory limit.

I confirm that we're getting -ENOMEM. I don't know if it's coming from
radeon_gart_bind(), but if it is, there's an interesting comment
immediately after the call to pci_map_page():

if (pci_dma_mapping_error(rdev->pdev, rdev->gart.pages_addr[p])) {
        /* FIXME: failed to map page (return -ENOMEM?) */
        radeon_gart_unbind(rdev, offset, pages);
        return -ENOMEM;
}

>> It might be worth limiting the PCIGART in radeon to 32MB to see if the
>> lower limit helps.
>
> So, how does one do that?

Boot with `radeon.test=1 radeon.gartsize=<size in MB>`.

> Cheers
> Michael.

Thanks,
Matt