lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAF9A956-5623-4D24-BA3E-AF139C0A7CE6@intel.com>
Date:   Sun, 3 Oct 2021 22:36:54 +0000
From:   "Bae, Chang Seok" <chang.seok.bae@...el.com>
To:     Thomas Gleixner <tglx@...utronix.de>
CC:     "bp@...e.de" <bp@...e.de>, "Lutomirski, Andy" <luto@...nel.org>,
        "mingo@...nel.org" <mingo@...nel.org>,
        "x86@...nel.org" <x86@...nel.org>,
        "Brown, Len" <len.brown@...el.com>,
        "lenb@...nel.org" <lenb@...nel.org>,
        "Hansen, Dave" <dave.hansen@...el.com>,
        "Macieira, Thiago" <thiago.macieira@...el.com>,
        "Liu, Jing2" <jing2.liu@...el.com>,
        "Shankar, Ravi V" <ravi.v.shankar@...el.com>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v10 09/28] x86/fpu/xstate: Introduce helpers to manage the
 XSTATE buffer dynamically

On Oct 1, 2021, at 07:20, Thomas Gleixner <tglx@...utronix.de> wrote:
> On Wed, Aug 25 2021 at 08:53, Chang S. Bae wrote:
>> +/**
>> + * calculate_xstate_buf_size_from_mask - Calculate an xstate buffer size
>> + * @mask:	A bitmap to tell which components to be saved in the buffer.
>> + *
>> + * Available once those arrays for the offset, size, and alignment info are
>> + * set up, by setup_xstate_features().
>> + *
>> + * Returns:	The buffer size
>> + */
>> +unsigned int calculate_xstate_buf_size_from_mask(u64 mask)
>> +{
>> +	unsigned int size = FXSAVE_SIZE + XSAVE_HDR_SIZE;
>> +	int i, nr;
>> +
>> +	if (!mask)
>> +		return 0;
>> +
>> +	/*
>> +	 * The minimum buffer size excludes the dynamic user state. When a
>> +	 * task uses the state, the buffer can grow up to the max size.
>> +	 */
>> +	if (mask == (xfeatures_mask_all & ~xfeatures_mask_user_dynamic))
>> +		return fpu_buf_cfg.min_size;
>> +	else if (mask == xfeatures_mask_all)
>> +		return fpu_buf_cfg.max_size;
>> +
>> +	nr = fls64(mask) - 1;
>> +	if (nr < FIRST_EXTENDED_XFEATURE)
>> +		return size;
>> +
>> +	/*
>> +	 * Each state offset in the non-compacted format is fixed. Take the
>> +	 * size from the last feature 'nr'.
>> +	 */
>> +	if (!cpu_feature_enabled(X86_FEATURE_XSAVES))
>> +		return xstate_offsets[nr] + xstate_sizes[nr];
>> +
>> +	/*
>> +	 * With the given mask, no relevant size is found so far. So,
>> +	 * calculate it by summing up each state size.
>> +	 */
>> +	for (i = FIRST_EXTENDED_XFEATURE; i <= nr; i++) {
>> +		if (!(mask & BIT_ULL(i)))
>> +			continue;
>> +
>> +		if (xstate_64byte_aligned[i])
>> +			size = ALIGN(size, 64);
>> +		size += xstate_sizes[i];
>> +	}
>> +	return size;
>> +}
> 
> So we have yet another slightly different function to calculate the
> buffer size. Why do we still need calculate_xstate_size()?

do_extra_xstate_size_checks() (renamed as calculate_xstate_size() in this
series) does calculate the XSTATE size. But it refers to CPUID as before
setup_xstate_features() which records offset, size, and alignment info. The
above hunk references the stored values at runtime.

Although I could aggressively try to combine them, I thought not that cleanup
is necessary for this series. But perhaps I may follow-up this as cleanup patch.

>> +void free_xstate_buffer(struct fpu *fpu)
>> +{
> 
> Can you please put the check:
> 
>> +	if (fpu->state != &fpu->__default_state)
> 
> into this function? If it is ever called without checking it and state
> points at fpu->__default_state then the explosions are going to be
> interesting.
> 
>> +	vfree(fpu->state);
>> +}

I wrote that originally. But it has been changed like this, as per Boris’:

    >> +void free_xstate_buffer(struct fpu *fpu)
    >> +{
    >> +	/* Free up only the dynamically-allocated memory. */

    >This belongs above the function along with an explanation when it needs
    >to be called.

    >> +	if (fpu->state != &fpu->__default_state)
    >> +		vfree(fpu->state);
    >> +}


It seems that I missed comments on those call sites. But I would like to take
your point.

>> +/**
>> + * realloc_xstate_buffer - Re-alloc a buffer with the size calculated from
>> + *			   @mask.
>> + *
>> + * @fpu:	A struct fpu * pointer
>> + * @mask:	The bitmap tells which components to be reserved in the new
>> + *		buffer.
>> + *
>> + * It deals with enlarging the xstate buffer with dynamic states.
>> + *
>> + * Use vzalloc() simply here. If the task with a vzalloc()-allocated buffer
>> + * tends to terminate quickly, vfree()-induced IPIs may be a concern.
>> + * Caching may be helpful for this. But the task with large state is likely
>> + * to live longer.
>> + *
>> + * Also, this method does not shrink or reclaim the buffer.
>> + *
>> + * Returns 0 on success, -ENOMEM on allocation error.
>> + */
>> +int realloc_xstate_buffer(struct fpu *fpu, u64 mask)
>> +{
>> +	union fpregs_state *state;
>> +	u64 state_mask;
>> +
>> +	state_mask = fpu->state_mask | mask;
>> +	if ((state_mask & fpu->state_mask) == state_mask)
>> +		return 0;
>> +
>> +	state = vzalloc(calculate_xstate_buf_size_from_mask(state_mask));
>> +	if (!state)
>> +		return -ENOMEM;
>> +
>> +	/*
>> +	 * As long as the register state is intact, save the xstate in the
>> +	 * new buffer at the next context switch or ptrace's context
>> +	 * injection.
> 
> What exactly guarantees that current's xstate is valid in the hardware
> registers? This has to be fully preemptible context, otherwise you could
> not invoke vzalloc() which can sleep.
> 
> Which in turn means that the placement of the exception fixup in a later
> patch is broken:
> 
>> DEFINE_IDTENTRY(exc_device_not_available)
>> {
>> 	unsigned long cr0 = read_cr0();
>> 
>> +	if (handle_xfd_event(&current->thread.fpu, regs))
>> +		return;
> 
> And no, we are not going to use an atomic allocation for this.
> 
> For the other call site from xstateregs_set() the FPU register state is
> definitely not live because the task is stopped and the FPU registers
> if live belong to the ptracer.
> 
> So you really want something like this:
> 
> static struct fpregs_state *swap_fpstate(fpu, newstate, mask, size)
> {
>        old_state = fpu->state;
>        fpu->state = newstate;
>        fpu->state_mask = mask;
>        fpu->state_size = size;
>        return old_state != fpu->__default_state ? old_state : NULL;
> }
> 
> int realloc_xstate_buffer(struct task_struct *tsk, u64 mask)
> {
> 	old_state;
>        fpu = tsk->fpu;
> 
>        size = calc_size(state_mask);
>        state = vzalloc(size);
>        if (!state)
>        	return -ENOMEM;
> 
> 	if (cpu_feature_enabled(X86_FEATURE_XSAVES))
> 		fpstate_init_xstate(&state->xsave, state_mask);
> 
>        if (tsk != current) {
>                // PTRACE ....
>        	old_state = swap_fpstate(fpu, state, state_mask, size);
>        } else {
>                fpregs_lock();
>                if (!registers_valid())
>                      copy_state(state, fpu->state);
>                old_state = swap_fpstate(fpu, state, state_mask, size);
>                fpregs_unlock();
>        }
> 
>        vfree(old_state);
>        return 0;
> }
> 
> And the exception fixup has to move into the irq enabled region. I come
> to that patch later.

Okay, IIUC, (1) vzalloc() can sleep, (2) the call site has to be preemptible,
thus (3) lock register states.

Thanks for the code.

>> @@ -1147,6 +1268,8 @@ static int copy_uabi_to_xstate(struct fpu *fpu, const void *kbuf,
>> 	if (validate_user_xstate_header(&hdr))
>> 		return -EINVAL;
>> 
>> +	hdr.xfeatures &= fpu->state_mask;
> 
> This is really the wrong patch doing this. This belongs into the one
> which introduces fpu->state_mask.

The state_mask field was introduced here:

diff --git a/arch/x86/include/asm/fpu/types.h b/arch/x86/include/asm/fpu/types.h
index ad5cbf922e30..0cc9f6c5a10c 100644
--- a/arch/x86/include/asm/fpu/types.h
+++ b/arch/x86/include/asm/fpu/types.h
@@ -336,6 +336,13 @@ struct fpu {
 	 */
 	unsigned long			avx512_timestamp;
 
+	/*
+	 * @state_mask:
+	 *
+	 * The bitmap represents state components to be saved in ->state.
+	 */
+	u64				state_mask;

But as you suggested in the other email, I’m going to create a new patch that
introduces new mask and state fields in struct fpu and add this.

Thanks,
Chang

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ