Message-ID: <78bc10b0-9376-0d21-4d66-0099376666bf@oracle.com>
Date:   Mon, 15 May 2023 16:19:11 -0400
From:   Ross Philipson <ross.philipson@...cle.com>
To:     Thomas Gleixner <tglx@...utronix.de>, linux-kernel@...r.kernel.org,
        x86@...nel.org, linux-integrity@...r.kernel.org,
        linux-doc@...r.kernel.org, linux-crypto@...r.kernel.org,
        iommu@...ts.linux-foundation.org, kexec@...ts.infradead.org,
        linux-efi@...r.kernel.org
Cc:     dpsmith@...rtussolutions.com, mingo@...hat.com, bp@...en8.de,
        hpa@...or.com, ardb@...nel.org, mjg59@...f.ucam.org,
        James.Bottomley@...senpartnership.com, luto@...capital.net,
        nivedita@...m.mit.edu, kanth.ghatraju@...cle.com,
        trenchboot-devel@...glegroups.com,
        Ross Philipson <ross.philipson@...cle.com>
Subject: Re: [PATCH v6 09/14] x86: Secure Launch SMP bringup support

On 5/12/23 14:02, Thomas Gleixner wrote:
> On Thu, May 04 2023 at 14:50, Ross Philipson wrote:
>>   
>> +#ifdef CONFIG_SECURE_LAUNCH
>> +
>> +static atomic_t first_ap_only = {1};
> 
> ATOMIC_INIT(1) if at all.
> 
>> +
>> +/*
>> + * Called to fix the long jump address for the waiting APs to vector to
>> + * the correct startup location in the Secure Launch stub in the rmpiggy.
>> + */
>> +static int
>> +slaunch_fixup_jump_vector(void)
> 
> One line please.
> 
>> +{
>> +	struct sl_ap_wake_info *ap_wake_info;
>> +	u32 *ap_jmp_ptr = NULL;
>> +
>> +	if (!atomic_dec_and_test(&first_ap_only))
>> +		return 0;
> 
> Why does this need an atomic? CPU bringup is fully serialized and even
> with the upcoming parallel bootup work, there is no concurrency on this
> function.
> 
> Aside of that. Why isn't this initialized during boot in a __init function?
> 
>> +	ap_wake_info = slaunch_get_ap_wake_info();
>> +
>> +	ap_jmp_ptr = (u32 *)__va(ap_wake_info->ap_wake_block +
>> +				 ap_wake_info->ap_jmp_offset);
>> +
>> +	*ap_jmp_ptr = real_mode_header->sl_trampoline_start32;
>> +
>> +	pr_debug("TXT AP long jump address updated\n");
>> +
>> +	return 0;
> 
> Why does this need a return code if all return paths return 0?
> 
>> +}
>> +
>> +/*
>> + * TXT AP startup is quite different than normal. The APs cannot have #INIT
>> + * asserted on them or receive SIPIs. The early Secure Launch code has parked
>> + * the APs in a pause loop waiting to receive an NMI. This will wake the APs
>> + * and have them jump to the protected mode code in the rmpiggy where the rest
>> + * of the SMP boot of the AP will proceed normally.
>> + */
>> +static int
>> +slaunch_wakeup_cpu_from_txt(int cpu, int apicid)
>> +{
>> +	unsigned long send_status = 0, accept_status = 0;
>> +
>> +	/* Only done once */
> 
> Yes. But not here.
> 
>> +	if (slaunch_fixup_jump_vector())
>> +		return -1;
>> +
>> +	/* Send NMI IPI to idling AP and wake it up */
>> +	apic_icr_write(APIC_DM_NMI, apicid);
>> +
>> +	if (init_udelay == 0)
>> +		udelay(10);
>> +	else
>> +		udelay(300);
> 
> The wonders of copy & pasta. This condition is pointless because this
> code only runs on systems which force init_udelay to 0.
> 
>> +	send_status = safe_apic_wait_icr_idle();
> 
> Moar copy & pasta. As this is guaranteed to be X2APIC mode, this
> function is a nop and returns 0 unconditionally.
> 
>> +	if (init_udelay == 0)
>> +		udelay(10);
>> +	else
>> +		udelay(300);
>> +
>> +	accept_status = (apic_read(APIC_ESR) & 0xEF);
> 
> The point of this is? Bit 0-3 are Pentium and P6 only.
> 
> Bit 4 Tried to send low prio IPI but not supported
> Bit 5 Illegal Vector sent
> Bit 6 Illegal Vector received
> Bit 7 X2APIC illegal register access
> 
> IOW, there is no accept error here. That would be bit 2, which is never set
> on anything modern.
> 
> But aside of that the read is moot anyway because the CPU has the APIC
> error vector enabled so if this would happen the APIC error interrupt
> would have swallowed and cleared the error condition.
> 
> IOW. Everything except the apic_icr_write() here is completely useless.
> 
>> +#else
>> +
>> +#define slaunch_wakeup_cpu_from_txt(cpu, apicid)	0
> 
> inline stub please.
> 
>> +
>> +#endif  /* !CONFIG_SECURE_LAUNCH */
>> +
>>   /*
>>    * NOTE - on most systems this is a PHYSICAL apic ID, but on multiquad
>>    * (ie clustered apic addressing mode), this is a LOGICAL apic ID.
>> @@ -1132,6 +1210,13 @@ static int do_boot_cpu(int apicid, int cpu, struct task_struct *idle,
>>   	cpumask_clear_cpu(cpu, cpu_initialized_mask);
>>   	smp_mb();
>>   
>> +	/* With Intel TXT, the AP startup is totally different */
>> +	if ((slaunch_get_flags() & (SL_FLAG_ACTIVE|SL_FLAG_ARCH_TXT)) ==
>> +	   (SL_FLAG_ACTIVE|SL_FLAG_ARCH_TXT)) {
> 
> Stick this condition into a helper function please
> 
>> +		boot_error = slaunch_wakeup_cpu_from_txt(cpu, apicid);
>> +		goto txt_wake;
>> +	}
>> +
>>   	/*
>>   	 * Wake up a CPU in difference cases:
>>   	 * - Use a method from the APIC driver if one defined, with wakeup
>> @@ -1147,6 +1232,7 @@ static int do_boot_cpu(int apicid, int cpu, struct task_struct *idle,
>>   		boot_error = wakeup_cpu_via_init_nmi(cpu, start_ip, apicid,
>>   						     cpu0_nmi_registered);
>>   
>> +txt_wake:
> 
> Sorry, but what has this to do with TXT ? And why can't the above just
> be yet another if clause in the existing if/else if maze?
> 
> Now that brings me to another question. How is this supposed to work
> with CPU hotplug post boot?
> 
> It will simply not work at all because once a CPU is offlined it is
> going to sit in an endless loop and wait for INIT/SIPI/SIPI. So it will
> get that NMI and go back to wait.
> 
> So you need a TXT specific cpu_play_dead() implementation, which should
> preferably use monitor/mwait where each "offline" CPU sits and waits
> until a condition becomes true. Then you don't need an NMI for wakeup at
> all. Just writing the condition into that per CPU cache line should be
> enough.
> 
> Thanks,
> 
>          tglx
> 
There is a lot here to think about. It sounds like you are suggesting we 
design all of this differently and we can definitely do that. We need 
time to go over this and your parallel startup series before we can 
really get down to how best to approach this.

I am going on vacation and will be back the first week of June. I will 
get back to you then once I have had time to go over all of this and 
your other patches.

Thank you for all your responses.

Ross
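
For reference, two of the review requests above (move the flag test into a
helper, and replace the #define with an inline stub) could look roughly like
the sketch below. This is not code from the patch series. The helper name
slaunch_is_txt_launch(), the flag bit values, and the sl_flags variable
standing in for the real slaunch_get_flags() are all assumptions, chosen so
the fragment builds in user space.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit values, for illustration only. */
#define SL_FLAG_ACTIVE		0x00000001u
#define SL_FLAG_ARCH_TXT	0x00000004u

static_assert((SL_FLAG_ACTIVE & SL_FLAG_ARCH_TXT) == 0,
	      "the two flags must be distinct bits");

/* Stand-in for the kernel's flag storage so the sketch runs in user space. */
static uint32_t sl_flags;

static inline uint32_t slaunch_get_flags(void)
{
	return sl_flags;
}

/*
 * Hypothetical helper: true only when Secure Launch is active AND the
 * architecture is TXT, i.e. both bits are set at the same time.
 */
static inline bool slaunch_is_txt_launch(void)
{
	const uint32_t mask = SL_FLAG_ACTIVE | SL_FLAG_ARCH_TXT;

	return (slaunch_get_flags() & mask) == mask;
}

#ifdef CONFIG_SECURE_LAUNCH
int slaunch_wakeup_cpu_from_txt(int cpu, int apicid);
#else
/* Inline stub instead of a macro: arguments are still type checked. */
static inline int slaunch_wakeup_cpu_from_txt(int cpu, int apicid)
{
	(void)cpu;
	(void)apicid;
	return 0;
}
#endif
```

With a helper of this shape, the call site in do_boot_cpu() reduces to a
single readable condition, and the !CONFIG_SECURE_LAUNCH build keeps compiler
checking of the arguments that a bare "#define ... 0" would drop.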
