[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <874ka06d3z.ffs@tglx>
Date: Fri, 01 Oct 2021 16:20:16 +0200
From: Thomas Gleixner <tglx@...utronix.de>
To: "Chang S. Bae" <chang.seok.bae@...el.com>, bp@...e.de,
luto@...nel.org, mingo@...nel.org, x86@...nel.org
Cc: len.brown@...el.com, lenb@...nel.org, dave.hansen@...el.com,
thiago.macieira@...el.com, jing2.liu@...el.com,
ravi.v.shankar@...el.com, linux-kernel@...r.kernel.org,
chang.seok.bae@...el.com
Subject: Re: [PATCH v10 09/28] x86/fpu/xstate: Introduce helpers to manage
the XSTATE buffer dynamically
On Wed, Aug 25 2021 at 08:53, Chang S. Bae wrote:
> +/**
> + * calculate_xstate_buf_size_from_mask - Calculate an xstate buffer size
> + * @mask: A bitmap to tell which components to be saved in the buffer.
> + *
> + * Available once those arrays for the offset, size, and alignment info are
> + * set up, by setup_xstate_features().
> + *
> + * Returns: The buffer size
> + */
> +unsigned int calculate_xstate_buf_size_from_mask(u64 mask)
> +{
> + unsigned int size = FXSAVE_SIZE + XSAVE_HDR_SIZE;
> + int i, nr;
> +
> + if (!mask)
> + return 0;
> +
> + /*
> + * The minimum buffer size excludes the dynamic user state. When a
> + * task uses the state, the buffer can grow up to the max size.
> + */
> + if (mask == (xfeatures_mask_all & ~xfeatures_mask_user_dynamic))
> + return fpu_buf_cfg.min_size;
> + else if (mask == xfeatures_mask_all)
> + return fpu_buf_cfg.max_size;
> +
> + nr = fls64(mask) - 1;
> + if (nr < FIRST_EXTENDED_XFEATURE)
> + return size;
> +
> + /*
> + * Each state offset in the non-compacted format is fixed. Take the
> + * size from the last feature 'nr'.
> + */
> + if (!cpu_feature_enabled(X86_FEATURE_XSAVES))
> + return xstate_offsets[nr] + xstate_sizes[nr];
> +
> + /*
> + * With the given mask, no relevant size is found so far. So,
> + * calculate it by summing up each state size.
> + */
> + for (i = FIRST_EXTENDED_XFEATURE; i <= nr; i++) {
> + if (!(mask & BIT_ULL(i)))
> + continue;
> +
> + if (xstate_64byte_aligned[i])
> + size = ALIGN(size, 64);
> + size += xstate_sizes[i];
> + }
> + return size;
> +}
So we have yet another slightly different function to calculate the
buffer size. Why do we still need calculate_xstate_size()?
> +void free_xstate_buffer(struct fpu *fpu)
> +{
Can you please put the check:
> + if (fpu->state != &fpu->__default_state)
into this function? If it is ever called without checking it and state
points at fpu->__default_state then the explosions are going to be
interesting.
> + vfree(fpu->state);
> +}
> +
> +/**
> + * realloc_xstate_buffer - Re-alloc a buffer with the size calculated from
> + * @mask.
> + *
> + * @fpu: A struct fpu * pointer
> + * @mask: The bitmap tells which components to be reserved in the new
> + * buffer.
> + *
> + * It deals with enlarging the xstate buffer with dynamic states.
> + *
> + * Use vzalloc() simply here. If the task with a vzalloc()-allocated buffer
> + * tends to terminate quickly, vfree()-induced IPIs may be a concern.
> + * Caching may be helpful for this. But the task with large state is likely
> + * to live longer.
> + *
> + * Also, this method does not shrink or reclaim the buffer.
> + *
> + * Returns 0 on success, -ENOMEM on allocation error.
> + */
> +int realloc_xstate_buffer(struct fpu *fpu, u64 mask)
> +{
> + union fpregs_state *state;
> + u64 state_mask;
> +
> + state_mask = fpu->state_mask | mask;
> + if ((state_mask & fpu->state_mask) == state_mask)
> + return 0;
> +
> + state = vzalloc(calculate_xstate_buf_size_from_mask(state_mask));
> + if (!state)
> + return -ENOMEM;
> +
> + /*
> + * As long as the register state is intact, save the xstate in the
> + * new buffer at the next context switch or ptrace's context
> + * injection.
What exactly guarantees that current's xstate is valid in the hardware
registers? This has to be fully preemptible context, otherwise you could
not invoke vzalloc() which can sleep.
Which in turn means that the placement of the exception fixup in a later
patch is broken:
> DEFINE_IDTENTRY(exc_device_not_available)
> {
> unsigned long cr0 = read_cr0();
>
> + if (handle_xfd_event(¤t->thread.fpu, regs))
> + return;
And no, we are not going to use an atomic allocation for this.
For the other call site from xstateregs_set() the FPU register state is
definitely not live because the task is stopped and the FPU registers
if live belong to the ptracer.
So you really want something like this:
static struct fpregs_state *swap_fpstate(fpu, newstate, mask, size)
{
old_state = fpu->state;
fpu->state = newstate;
fpu->state_mask = mask;
fpu->state_size = size;
return old_state != fpu->__default_state ? old_state : NULL;
}
int realloc_xstate_buffer(struct task_struct *tsk, u64 mask)
{
old_state;
fpu = tsk->fpu;
size = calc_size(state_mask);
state = vzalloc(size);
if (!state)
return -ENOMEM;
if (cpu_feature_enabled(X86_FEATURE_XSAVES))
fpstate_init_xstate(&state->xsave, state_mask);
if (tsk != current) {
// PTRACE ....
old_state = swap_fpstate(fpu, state, state_mask, size);
} else {
fpregs_lock();
if (!registers_valid())
copy_state(state, fpu->state);
old_state = swap_fpstate(fpu, state, state_mask, size);
fpregs_unlock();
}
vfree(old_state);
return 0;
}
And the exception fixup has to move into the irq enabled region. I come
to that patch later.
> @@ -1147,6 +1268,8 @@ static int copy_uabi_to_xstate(struct fpu *fpu, const void *kbuf,
> if (validate_user_xstate_header(&hdr))
> return -EINVAL;
>
> + hdr.xfeatures &= fpu->state_mask;
This is really the wrong patch doing this. This belongs into the one
which introduces fpu->state_mask.
Thanks,
tglx
Powered by blists - more mailing lists