lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Sat, 15 Aug 2020 11:13:36 +0200
From:   Ingo Molnar <mingo@...nel.org>
To:     Yazen Ghannam <Yazen.Ghannam@....com>
Cc:     linux-edac@...r.kernel.org, linux-kernel@...r.kernel.org,
        bp@...e.de, tony.luck@...el.com, x86@...nel.org,
        Smita.KoralahalliChannabasappa@....com
Subject: Re: [PATCH 2/2] x86/MCE/AMD Support new memory interleaving schemes
 during address translation


* Yazen Ghannam <Yazen.Ghannam@....com> wrote:

> +     /* Read D18F1x208 (System Fabric ID Mask 0). */
> +     if (amd_df_indirect_read(nid, 1, 0x208, umc, &tmp))
> +             goto out_err;
> +
> +     /* Determine if system is a legacy Data Fabric type. */
> +     legacy_df = !(tmp & 0xFF);

1)

I see this pattern in a lot of places in the code, first the magic 
constant 0x208 is explained a comment, then it is *repeated* and used 
it in the code...

How about introducing an obviously named enum for it instead, which 
would then be self-documenting, saving the comment and removing magic 
numbers:

	if (amd_df_indirect_read(nid, 1, AMD_REG_FAB_ID, umc, &reg_fab_id))
		goto out_err;

(The symbolic name should be something better, I just guessed 
something quickly.)

Please clean this up in a separate patch, not part of the already 
large patch that introduces a new feature.

2)

'tmp & 0xFF' is some sort of fabric version ID value, with a value of 
0 denoting legacy (pre-Rome) systems, right?

How about making that explicit:

	df_version = reg_fab_id & 0xFF;

I'm pretty sure such a version ID might come handy later on, should 
there be quirks or new capabilities with the newer systems ...


>  			ret_addr -= hi_addr_offset;
> @@ -728,23 +740,31 @@ int umc_normaddr_to_sysaddr(u64 norm_addr, u16 nid, u8 umc, u64 *sys_addr)
>  	}
>  
>  	lgcy_mmio_hole_en = tmp & BIT(1);
> -	intlv_num_chan	  = (tmp >> 4) & 0xF;
> -	intlv_addr_sel	  = (tmp >> 8) & 0x7;
> -	dram_base_addr	  = (tmp & GENMASK_ULL(31, 12)) << 16;
>  
> -	/* {0, 1, 2, 3} map to address bits {8, 9, 10, 11} respectively */
> -	if (intlv_addr_sel > 3) {
> -		pr_err("%s: Invalid interleave address select %d.\n",
> -			__func__, intlv_addr_sel);
> -		goto out_err;
> +	if (legacy_df) {
> +		intlv_num_chan	  = (tmp >> 4) & 0xF;
> +		intlv_addr_sel	  = (tmp >> 8) & 0x7;
> +	} else {
> +		intlv_num_chan    = (tmp >> 2) & 0xF;
> +		intlv_num_dies	  = (tmp >> 6) & 0x3;
> +		intlv_num_sockets = (tmp >> 8) & 0x1;
> +		intlv_addr_sel	  = (tmp >> 9) & 0x7;
>  	}
>  
> +	dram_base_addr	  = (tmp & GENMASK_ULL(31, 12)) << 16;
> +
>  	/* Read D18F0x114 (DramLimitAddress). */
>  	if (amd_df_indirect_read(nid, 0, 0x114 + (8 * base), umc, &tmp))
>  		goto out_err;
>  
> -	intlv_num_sockets = (tmp >> 8) & 0x1;
> -	intlv_num_dies	  = (tmp >> 10) & 0x3;
> +	if (legacy_df) {
> +		intlv_num_sockets = (tmp >> 8) & 0x1;
> +		intlv_num_dies	  = (tmp >> 10) & 0x3;
> +		dst_fabric_id	  = tmp & 0xFF;
> +	} else {
> +		dst_fabric_id	  = tmp & 0x3FF;
> +	}
> +
>  	dram_limit_addr	  = ((tmp & GENMASK_ULL(31, 12)) << 16) | GENMASK_ULL(27, 0);

Could we please structure this code in a bit more readable fashion?

1)

Such as not using the meaningless 'tmp' variable name to first read 
out DramOffset, then DramLimitAddress?

How about naming them a bit more obviously, and retrieving them in a 
single step:

        if (amd_df_indirect_read(nid, 0, 0x1B4, umc, &reg_dram_off))
                goto out_err;

        /* Remove HiAddrOffset from normalized address, if enabled: */
        if (reg_dram_off & BIT(0)) {
                u64 hi_addr_offset = (tmp & GENMASK_ULL(31, 20)) << 8;

                if (norm_addr >= hi_addr_offset) {
                        ret_addr -= hi_addr_offset;
                        base = 1;
                }
        }

        if (amd_df_indirect_read(nid, 0, 0x114 + (8 * base), umc, &reg_dram_lim))
                goto out_err;

('reg' stands for register value - but 'val' would work too.)

Side note: why is the above code using BIT() and GENMASK_UUL() when 
all the other and new code is using fixed masks? Use one of these 
versions instead of a weird mix ...

2)

Then all the fabric version dependent logic could be consolidated 
instead of being spread out:

	if (df_version) {
		intlv_num_chan    = (reg_dram_off >>  2) & 0xF;
		intlv_num_dies    = (reg_dram_off >>  6) & 0x3;
		intlv_num_sockets = (reg_dram_off >>  8) & 0x1;
		intlv_addr_sel    = (reg_dram_off >>  9) & 0x7;

		dst_fabric_id     = (reg_dram_lim >>  0) & 0x3FF;
	} else {
		intlv_num_chan    = (reg_dram_off >>  4) & 0xF;
		intlv_num_dies    = (reg_dram_lim >> 10) & 0x3;
		intlv_num_sockets = (reg_dram_lim >>  8) & 0x1;
		intlv_addr_sel    = (reg_dram_off >>  8) & 0x7;

		dst_fabric_id     = (reg_dram_lim >>  0) & 0xFF;
	}

Also note a couple of more formatting & ordering edits I did to the 
code, to improve the structure. My copy & paste job is untested 
though.

3)

Notably, note how the new code on current systems is the first branch 
- that's the most interesting code most of the time anyaway, legacy 
systems being legacy.

BTW., please do any such suggested code flow and structure clean-up 
patches first in the series, then introduce the new logic, to make it 
easier to verify.

Thanks,

	Ingo

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ