lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <c31b6dfb-86be-419e-92af-61d9ad95c12f@amd.com>
Date: Mon, 1 Jul 2024 14:10:19 +0200
From: Christian König <christian.koenig@....com>
To: Jiaxun Yang <jiaxun.yang@...goat.com>, Icenowy Zheng <uwu@...nowy.me>,
 Huang Rui <ray.huang@....com>,
 Maarten Lankhorst <maarten.lankhorst@...ux.intel.com>,
 Maxime Ripard <mripard@...nel.org>, Thomas Zimmermann <tzimmermann@...e.de>,
 David Airlie <airlied@...il.com>, Daniel Vetter <daniel@...ll.ch>
Cc: dri-devel@...ts.freedesktop.org, linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCH 2/2] drm/ttm: downgrade cached to write_combined when
 snooping not available

Am 01.07.24 um 13:52 schrieb Jiaxun Yang:
> 在2024年7月1日七月 下午12:40,Christian König写道:
> [...]
>> Ah, wait a second.
>>
>> Such a thing as non-coherent PCIe implementation doesn't exist. The PCIe
>> specification makes it mandatory for memory access to be cache coherent.
> There are some non-PCIe TTM GPU being hit by this pitfall, we have non-coherent
> Vivante GPU on some devices.

Yeah, but those are perfectly supported.

If you have a non PCIe device which needs uncached or write combined 
system memory allocations you can just specify this at memory allocation 
time.

> Handling it at TTM core makes more sense on reducing per-driver effort on dealing
> platform issues.
>
>> There are a bunch of non-compliant PCIe implementations which have
>> broken cache coherency, but those explicitly violate the specification
>> and because of that are not supported.
> I don't really understand, "doesn't exist" and "bunch of" seems to be contradicting
> with each other.

A compliant non-coherent PCIe implementation doesn't exists, that's made 
mandatory by the PCIe standard.

What does exists are some non compliant non coherent PCIe 
implementations, but as far as I know those are then not supported at 
all by Linux.

We already had tons of problems with platform chips which intentionally 
doesn't correctly implement the PCIe specification because it is cheaper.

At least for AMDs GPU driver we reverted to rejecting patches for 
platform bugs cause by incorrectly wired up PCIe root complexes.

Regards,
Christian.

>
>> Regards,
>> Christian.
>>
>>>> Unfortunately I don't think we can safely ttm_cached to ttm_write_comnined, we've
>>>> had enough drama with write combine behaviour on all different platforms.
>>>>
>>>> See drm_arch_can_wc_memory in drm_cache.h.
>>>>
>>> Yes this really sounds like an issue.
>>>
>>> Maybe the behavior of ttm_write_combined should furtherly be decided
>>> by drm_arch_can_wc_memory() in case of quirks?
> IMO for DMA mappings, use dma_pgprot at mapping makes more sense :-)
>
> Thanks
> - Jiaxun
>>>> Thanks
>>>>
>>>>> +
>>>>>    	return ttm_prot_from_caching(caching, tmp);
>>>>>    }
>>>>>    EXPORT_SYMBOL(ttm_io_prot);
>>>>> diff --git a/drivers/gpu/drm/ttm/ttm_tt.c b/drivers/gpu/drm/ttm/ttm_tt.c
>>>>> index 7b00ddf0ce49f..3335df45fba5e 100644
>>>>> --- a/drivers/gpu/drm/ttm/ttm_tt.c
>>>>> +++ b/drivers/gpu/drm/ttm/ttm_tt.c
>>>>> @@ -152,6 +152,10 @@ static void ttm_tt_init_fields(struct ttm_tt *ttm,
>>>>>    			       enum ttm_caching caching,
>>>>>    			       unsigned long extra_pages)
>>>>>    {
>>>>> +	/* Downgrade cached mapping for non-snooping devices */
>>>>> +	if (!bo->bdev->dma_coherent && caching == ttm_cached)
>>>>> +		caching = ttm_write_combined;
>>>>> +
>>>>>    	ttm->num_pages = (PAGE_ALIGN(bo->base.size) >> PAGE_SHIFT) + extra_pages;
>>>>>    	ttm->page_flags = page_flags;
>>>>>    	ttm->dma_address = NULL;
>>>>> diff --git a/include/drm/ttm/ttm_caching.h b/include/drm/ttm/ttm_caching.h
>>>>> index a18f43e93abab..f92d7911f50e4 100644
>>>>> --- a/include/drm/ttm/ttm_caching.h
>>>>> +++ b/include/drm/ttm/ttm_caching.h
>>>>> @@ -47,7 +47,8 @@ enum ttm_caching {
>>>>>
>>>>>    	/**
>>>>>    	 * @ttm_cached: Fully cached like normal system memory, requires that
>>>>> -	 * devices snoop the CPU cache on accesses.
>>>>> +	 * devices snoop the CPU cache on accesses. Downgraded to
>>>>> +	 * ttm_write_combined when the snooping capaiblity is missing.
>>>>>    	 */
>>>>>    	ttm_cached
>>>>>    };
>>>>> -- 
>>>>> 2.45.2


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ