linux-kernel - Re: [PATCH v1 01/10] powerpc/pseries/iommu: Replace hard-coded page shift

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <aaaf993a-d233-f5be-b809-5911a6a9872d@ozlabs.ru>
Date:   Mon, 31 Aug 2020 10:06:47 +1000
From:   Alexey Kardashevskiy <aik@...abs.ru>
To:     Leonardo Bras <leobras.c@...il.com>,
        Michael Ellerman <mpe@...erman.id.au>,
        Benjamin Herrenschmidt <benh@...nel.crashing.org>,
        Paul Mackerras <paulus@...ba.org>,
        Christophe Leroy <christophe.leroy@....fr>,
        Joel Stanley <joel@....id.au>,
        Thiago Jung Bauermann <bauerman@...ux.ibm.com>,
        Ram Pai <linuxram@...ibm.com>,
        Brian King <brking@...ux.vnet.ibm.com>,
        Murilo Fossa Vicentini <muvic@...ux.ibm.com>,
        David Dai <zdai@...ux.vnet.ibm.com>
Cc:     linuxppc-dev@...ts.ozlabs.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v1 01/10] powerpc/pseries/iommu: Replace hard-coded page
 shift



On 29/08/2020 05:55, Leonardo Bras wrote:
> On Fri, 2020-08-28 at 12:27 +1000, Alexey Kardashevskiy wrote:
>>
>> On 28/08/2020 01:32, Leonardo Bras wrote:
>>> Hello Alexey, thank you for this feedback!
>>>
>>> On Sat, 2020-08-22 at 19:33 +1000, Alexey Kardashevskiy wrote:
>>>>> +#define TCE_RPN_BITS		52		/* Bits 0-51 represent RPN on TCE */
>>>>
>>>> Ditch this one and use MAX_PHYSMEM_BITS instead? I am pretty sure this
>>>> is the actual limit.
>>>
>>> I understand this MAX_PHYSMEM_BITS(51) comes from the maximum physical memory addressable in the machine. IIUC, it means we can access physical address up to (1ul << MAX_PHYSMEM_BITS). 
>>>
>>> This 52 comes from PAPR "Table 9. TCE Definition" which defines bits
>>> 0-51 as the RPN. By looking at code, I understand that it means we may input any address < (1ul << 52) to TCE.
>>>
>>> In practice, MAX_PHYSMEM_BITS should be enough as of today, because I suppose we can't ever pass a physical page address over 
>>> (1ul << 51), and TCE accepts up to (1ul << 52).
>>> But if we ever increase MAX_PHYSMEM_BITS, it doesn't necessarily means that TCE_RPN_BITS will also be increased, so I think they are independent values. 
>>>
>>> Does it make sense? Please let me know if I am missing something.
>>
>> The underlying hardware is PHB3/4 about which the IODA2 Version 2.4
>> 6Apr2012.pdf spec says:
>>
>> "The number of most significant RPN bits implemented in the TCE is
>> dependent on the max size of System Memory to be supported by the platform".
>>
>> IODA3 is the same on this matter.
>>
>> This is MAX_PHYSMEM_BITS and PHB itself does not have any other limits
>> on top of that. So the only real limit comes from MAX_PHYSMEM_BITS and
>> where TCE_RPN_BITS comes from exactly - I have no idea.
> 
> Well, I created this TCE_RPN_BITS = 52 because the previous mask was a
> hardcoded 40-bit mask (0xfffffffffful), for hard-coded 12-bit (4k)
> pagesize, and on PAPR+/LoPAR also defines TCE as having bits 0-51
> described as RPN, as described before.
> 
> IODA3 Revision 3.0_prd1 (OpenPowerFoundation), Figure 3.4 and 3.5.
> shows system memory mapping into a TCE, and the TCE also has bits 0-51
> for the RPN (52 bits). "Table 3.6. TCE Definition" also shows it.
>> In fact, by the looks of those figures, the RPN_MASK should always be a
> 52-bit mask, and RPN = (page >> tceshift) & RPN_MASK.


I suspect the mask is there in the first place for extra protection
against too big addresses going to the TCE table (or/and for virtial vs
physical addresses). Using 52bit mask makes no sense for anything, you
could just drop the mask and let c compiler deal with 64bit "uint" as it
is basically a 4K page address anywhere in the 64bit space. Thanks,


> Maybe that's it?




> 
>>
>>
>>>>
>>>>> +#define TCE_RPN_MASK(ps)	((1ul << (TCE_RPN_BITS - (ps))) - 1)
>>>>>  #define TCE_VALID		0x800		/* TCE valid */
>>>>>  #define TCE_ALLIO		0x400		/* TCE valid for all lpars */
>>>>>  #define TCE_PCI_WRITE		0x2		/* write from PCI allowed */
>>>>> diff --git a/arch/powerpc/platforms/pseries/iommu.c b/arch/powerpc/platforms/pseries/iommu.c
>>>>> index e4198700ed1a..8fe23b7dff3a 100644
>>>>> --- a/arch/powerpc/platforms/pseries/iommu.c
>>>>> +++ b/arch/powerpc/platforms/pseries/iommu.c
>>>>> @@ -107,6 +107,9 @@ static int tce_build_pSeries(struct iommu_table *tbl, long index,
>>>>>  	u64 proto_tce;
>>>>>  	__be64 *tcep;
>>>>>  	u64 rpn;
>>>>> +	const unsigned long tceshift = tbl->it_page_shift;
>>>>> +	const unsigned long pagesize = IOMMU_PAGE_SIZE(tbl);
>>>>> +	const u64 rpn_mask = TCE_RPN_MASK(tceshift);
>>>>
>>>> Using IOMMU_PAGE_SIZE macro for the page size and not using
>>>> IOMMU_PAGE_MASK for the mask - this incosistency makes my small brain
>>>> explode :) I understand the history but maaaaan... Oh well, ok.
>>>>
>>>
>>> Yeah, it feels kind of weird after two IOMMU related consts. :)
>>> But sure IOMMU_PAGE_MASK() would not be useful here :)
>>>
>>> And this kind of let me thinking:
>>>>> +		rpn = __pa(uaddr) >> tceshift;
>>>>> +		*tcep = cpu_to_be64(proto_tce | (rpn & rpn_mask) << tceshift);
>>> Why not:
>>> 	rpn_mask =  TCE_RPN_MASK(tceshift) << tceshift;
>>
>> A mask for a page number (but not the address!) hurts my brain, masks
>> are good against addresses but numbers should already have all bits
>> adjusted imho, may be it is just me :-/
>>
>>
>>> 	
>>> 	rpn = __pa(uaddr) & rpn_mask;
>>> 	*tcep = cpu_to_be64(proto_tce | rpn)
>>>
>>> I am usually afraid of changing stuff like this, but I think it's safe.
>>>
>>>> Good, otherwise. Thanks,
>>>
>>> Thank you for reviewing!
>>>  
>>>
>>>
> 

-- 
Alexey