[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <2eeef3e6-2f7c-82ed-f02b-acd49a47b527@suse.com>
Date: Tue, 15 Dec 2020 16:07:29 +0100
From: Jürgen Groß <jgross@...e.com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: Juri Lelli <juri.lelli@...hat.com>, linux-hyperv@...r.kernel.org,
Daniel Lezcano <daniel.lezcano@...aro.org>,
Wanpeng Li <wanpengli@...cent.com>, kvm@...r.kernel.org,
"VMware, Inc." <pv-drivers@...are.com>,
virtualization@...ts.linux-foundation.org,
Ben Segall <bsegall@...gle.com>,
"H. Peter Anvin" <hpa@...or.com>,
Boris Ostrovsky <boris.ostrovsky@...cle.com>,
Wei Liu <wei.liu@...nel.org>,
Stefano Stabellini <sstabellini@...nel.org>,
Stephen Hemminger <sthemmin@...rosoft.com>,
Joerg Roedel <joro@...tes.org>, x86@...nel.org,
Ingo Molnar <mingo@...hat.com>, Mel Gorman <mgorman@...e.de>,
xen-devel@...ts.xenproject.org,
Haiyang Zhang <haiyangz@...rosoft.com>,
Steven Rostedt <rostedt@...dmis.org>,
Borislav Petkov <bp@...en8.de>, luto@...nel.org,
Josh Poimboeuf <jpoimboe@...hat.com>,
Vincent Guittot <vincent.guittot@...aro.org>,
Thomas Gleixner <tglx@...utronix.de>,
Dietmar Eggemann <dietmar.eggemann@....com>,
Jim Mattson <jmattson@...gle.com>,
linux-kernel@...r.kernel.org,
Sean Christopherson <sean.j.christopherson@...el.com>,
Paolo Bonzini <pbonzini@...hat.com>,
Daniel Bristot de Oliveira <bristot@...hat.com>
Subject: Re: [PATCH v2 00/12] x86: major paravirt cleanup
On 15.12.20 15:54, Peter Zijlstra wrote:
> On Tue, Dec 15, 2020 at 03:18:34PM +0100, Peter Zijlstra wrote:
>> Ah, I was waiting for Josh to have an opinion (and then sorta forgot
>> about the whole thing again). Let me refresh and provide at least a
>> Changelog.
>
> How's this then?
Thanks, will add it into my series.
Juergen
>
> ---
> Subject: objtool: Alternatives vs ORC, the hard way
> From: Peter Zijlstra <peterz@...radead.org>
> Date: Mon, 23 Nov 2020 14:43:17 +0100
>
> Alternatives pose an interesting problem for unwinders because from
> the unwinders PoV we're just executing instructions, it has no idea
> the text is modified, nor any way of retrieving what with.
>
> Therefore the stance has been that alternatives must not change stack
> state, as encoded by commit: 7117f16bf460 ("objtool: Fix ORC vs
> alternatives"). This obviously guarantees that whatever actual
> instructions end up in the text, the unwind information is correct.
>
> However, there is one additional source of text patching that isn't
> currently visible to objtool: paravirt immediate patching. And it
> turns out one of these violates the rule.
>
> As part of cleaning that up, the unfortunate reality is that objtool
> now has to deal with alternatives modifying unwind state and validate
> the combination is valid and generate ORC data to match.
>
> The problem is that a single instance of unwind information (ORC) must
> capture and correctly unwind all alternatives. Since the trivially
> correct mandate is out, implement the straight forward brute-force
> approach:
>
> 1) generate CFI information for each alternative
>
> 2) unwind every alternative with the merge-sort of the previously
> generated CFI information -- O(n^2)
>
> 3) for any possible conflict: yell.
>
> 4) Generate ORC with merge-sort
>
> Specifically for 3 there are two possible classes of conflicts:
>
> - the merge-sort itself could find conflicting CFI for the same
> offset.
>
> - the unwind can fail with the merged CFI.
>
> In specific, this allows us to deal with:
>
> Alt1 Alt2 Alt3
>
> 0x00 CALL *pv_ops.save_fl CALL xen_save_fl PUSHF
> 0x01 POP %RAX
> 0x02 NOP
> ...
> 0x05 NOP
> ...
> 0x07 <insn>
>
> The unwind information for offset-0x00 is identical for all 3
> alternatives. Similarly offset-0x05 and higher also are identical (and
> the same as 0x00). However offset-0x01 has deviating CFI, but that is
> only relevant for Alt3, neither of the other alternative instruction
> streams will ever hit that offset.
>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@...radead.org>
> ---
> tools/objtool/check.c | 180 ++++++++++++++++++++++++++++++++++++++++++++----
> tools/objtool/check.h | 5 +
> tools/objtool/orc_gen.c | 180 +++++++++++++++++++++++++++++++-----------------
> 3 files changed, 290 insertions(+), 75 deletions(-)
>
> --- a/tools/objtool/check.c
> +++ b/tools/objtool/check.c
> @@ -1096,6 +1096,32 @@ static int handle_group_alt(struct objto
> return -1;
> }
>
> + /*
> + * Add the filler NOP, required for alternative CFI.
> + */
> + if (special_alt->group && special_alt->new_len < special_alt->orig_len) {
> + struct instruction *nop = malloc(sizeof(*nop));
> + if (!nop) {
> + WARN("malloc failed");
> + return -1;
> + }
> + memset(nop, 0, sizeof(*nop));
> + INIT_LIST_HEAD(&nop->alts);
> + INIT_LIST_HEAD(&nop->stack_ops);
> + init_cfi_state(&nop->cfi);
> +
> + nop->sec = last_new_insn->sec;
> + nop->ignore = last_new_insn->ignore;
> + nop->func = last_new_insn->func;
> + nop->alt_group = alt_group;
> + nop->offset = last_new_insn->offset + last_new_insn->len;
> + nop->type = INSN_NOP;
> + nop->len = special_alt->orig_len - special_alt->new_len;
> +
> + list_add(&nop->list, &last_new_insn->list);
> + last_new_insn = nop;
> + }
> +
> if (fake_jump)
> list_add(&fake_jump->list, &last_new_insn->list);
>
> @@ -2237,18 +2263,12 @@ static int handle_insn_ops(struct instru
> struct stack_op *op;
>
> list_for_each_entry(op, &insn->stack_ops, list) {
> - struct cfi_state old_cfi = state->cfi;
> int res;
>
> res = update_cfi_state(insn, &state->cfi, op);
> if (res)
> return res;
>
> - if (insn->alt_group && memcmp(&state->cfi, &old_cfi, sizeof(struct cfi_state))) {
> - WARN_FUNC("alternative modifies stack", insn->sec, insn->offset);
> - return -1;
> - }
> -
> if (op->dest.type == OP_DEST_PUSHF) {
> if (!state->uaccess_stack) {
> state->uaccess_stack = 1;
> @@ -2460,19 +2480,137 @@ static int validate_return(struct symbol
> * unreported (because they're NOPs), such holes would result in CFI_UNDEFINED
> * states which then results in ORC entries, which we just said we didn't want.
> *
> - * Avoid them by copying the CFI entry of the first instruction into the whole
> - * alternative.
> + * Avoid them by copying the CFI entry of the first instruction into the hole.
> */
> -static void fill_alternative_cfi(struct objtool_file *file, struct instruction *insn)
> +static void __fill_alt_cfi(struct objtool_file *file, struct instruction *insn)
> {
> struct instruction *first_insn = insn;
> int alt_group = insn->alt_group;
>
> - sec_for_each_insn_continue(file, insn) {
> + sec_for_each_insn_from(file, insn) {
> if (insn->alt_group != alt_group)
> break;
> - insn->cfi = first_insn->cfi;
> +
> + if (!insn->visited)
> + insn->cfi = first_insn->cfi;
> + }
> +}
> +
> +static void fill_alt_cfi(struct objtool_file *file, struct instruction *alt_insn)
> +{
> + struct alternative *alt;
> +
> + __fill_alt_cfi(file, alt_insn);
> +
> + list_for_each_entry(alt, &alt_insn->alts, list)
> + __fill_alt_cfi(file, alt->insn);
> +}
> +
> +static struct instruction *
> +__find_unwind(struct objtool_file *file,
> + struct instruction *insn, unsigned long offset)
> +{
> + int alt_group = insn->alt_group;
> + struct instruction *next;
> + unsigned long off = 0;
> +
> + while ((off + insn->len) <= offset) {
> + next = next_insn_same_sec(file, insn);
> + if (next && next->alt_group != alt_group)
> + next = NULL;
> +
> + if (!next)
> + break;
> +
> + off += insn->len;
> + insn = next;
> }
> +
> + return insn;
> +}
> +
> +struct instruction *
> +find_alt_unwind(struct objtool_file *file,
> + struct instruction *alt_insn, unsigned long offset)
> +{
> + struct instruction *fit;
> + struct alternative *alt;
> + unsigned long fit_off;
> +
> + fit = __find_unwind(file, alt_insn, offset);
> + fit_off = (fit->offset - alt_insn->offset);
> +
> + list_for_each_entry(alt, &alt_insn->alts, list) {
> + struct instruction *x;
> + unsigned long x_off;
> +
> + x = __find_unwind(file, alt->insn, offset);
> + x_off = (x->offset - alt->insn->offset);
> +
> + if (fit_off < x_off) {
> + fit = x;
> + fit_off = x_off;
> +
> + } else if (fit_off == x_off &&
> + memcmp(&fit->cfi, &x->cfi, sizeof(struct cfi_state))) {
> +
> + char *_str1 = offstr(fit->sec, fit->offset);
> + char *_str2 = offstr(x->sec, x->offset);
> + WARN("%s: equal-offset incompatible alternative: %s\n", _str1, _str2);
> + free(_str1);
> + free(_str2);
> + return fit;
> + }
> + }
> +
> + return fit;
> +}
> +
> +static int __validate_unwind(struct objtool_file *file,
> + struct instruction *alt_insn,
> + struct instruction *insn)
> +{
> + int alt_group = insn->alt_group;
> + struct instruction *unwind;
> + unsigned long offset = 0;
> +
> + sec_for_each_insn_from(file, insn) {
> + if (insn->alt_group != alt_group)
> + break;
> +
> + unwind = find_alt_unwind(file, alt_insn, offset);
> +
> + if (memcmp(&insn->cfi, &unwind->cfi, sizeof(struct cfi_state))) {
> +
> + char *_str1 = offstr(insn->sec, insn->offset);
> + char *_str2 = offstr(unwind->sec, unwind->offset);
> + WARN("%s: unwind incompatible alternative: %s (%ld)\n",
> + _str1, _str2, offset);
> + free(_str1);
> + free(_str2);
> + return 1;
> + }
> +
> + offset += insn->len;
> + }
> +
> + return 0;
> +}
> +
> +static int validate_alt_unwind(struct objtool_file *file,
> + struct instruction *alt_insn)
> +{
> + struct alternative *alt;
> +
> + if (__validate_unwind(file, alt_insn, alt_insn))
> + return 1;
> +
> + list_for_each_entry(alt, &alt_insn->alts, list) {
> + if (__validate_unwind(file, alt_insn, alt->insn))
> + return 1;
> + }
> +
> + return 0;
> }
>
> /*
> @@ -2484,9 +2622,10 @@ static void fill_alternative_cfi(struct
> static int validate_branch(struct objtool_file *file, struct symbol *func,
> struct instruction *insn, struct insn_state state)
> {
> + struct instruction *next_insn, *alt_insn = NULL;
> struct alternative *alt;
> - struct instruction *next_insn;
> struct section *sec;
> + int alt_group = 0;
> u8 visited;
> int ret;
>
> @@ -2541,8 +2680,10 @@ static int validate_branch(struct objtoo
> }
> }
>
> - if (insn->alt_group)
> - fill_alternative_cfi(file, insn);
> + if (insn->alt_group) {
> + alt_insn = insn;
> + alt_group = insn->alt_group;
> + }
>
> if (skip_orig)
> return 0;
> @@ -2697,6 +2838,17 @@ static int validate_branch(struct objtoo
> }
>
> insn = next_insn;
> +
> + if (alt_insn && insn->alt_group != alt_group) {
> + alt_insn->alt_end = insn;
> +
> + fill_alt_cfi(file, alt_insn);
> +
> + if (validate_alt_unwind(file, alt_insn))
> + return 1;
> +
> + alt_insn = NULL;
> + }
> }
>
> return 0;
> --- a/tools/objtool/check.h
> +++ b/tools/objtool/check.h
> @@ -41,6 +41,7 @@ struct instruction {
> struct instruction *first_jump_src;
> struct reloc *jump_table;
> struct list_head alts;
> + struct instruction *alt_end;
> struct symbol *func;
> struct list_head stack_ops;
> struct cfi_state cfi;
> @@ -55,6 +56,10 @@ static inline bool is_static_jump(struct
> insn->type == INSN_JUMP_UNCONDITIONAL;
> }
>
> +struct instruction *
> +find_alt_unwind(struct objtool_file *file,
> + struct instruction *alt_insn, unsigned long offset);
> +
> struct instruction *find_insn(struct objtool_file *file,
> struct section *sec, unsigned long offset);
>
> --- a/tools/objtool/orc_gen.c
> +++ b/tools/objtool/orc_gen.c
> @@ -12,75 +12,86 @@
> #include "check.h"
> #include "warn.h"
>
> -int create_orc(struct objtool_file *file)
> +static int create_orc_insn(struct objtool_file *file, struct instruction *insn)
> {
> - struct instruction *insn;
> + struct orc_entry *orc = &insn->orc;
> + struct cfi_reg *cfa = &insn->cfi.cfa;
> + struct cfi_reg *bp = &insn->cfi.regs[CFI_BP];
> +
> + orc->end = insn->cfi.end;
> +
> + if (cfa->base == CFI_UNDEFINED) {
> + orc->sp_reg = ORC_REG_UNDEFINED;
> + return 0;
> + }
>
> - for_each_insn(file, insn) {
> - struct orc_entry *orc = &insn->orc;
> - struct cfi_reg *cfa = &insn->cfi.cfa;
> - struct cfi_reg *bp = &insn->cfi.regs[CFI_BP];
> + switch (cfa->base) {
> + case CFI_SP:
> + orc->sp_reg = ORC_REG_SP;
> + break;
> + case CFI_SP_INDIRECT:
> + orc->sp_reg = ORC_REG_SP_INDIRECT;
> + break;
> + case CFI_BP:
> + orc->sp_reg = ORC_REG_BP;
> + break;
> + case CFI_BP_INDIRECT:
> + orc->sp_reg = ORC_REG_BP_INDIRECT;
> + break;
> + case CFI_R10:
> + orc->sp_reg = ORC_REG_R10;
> + break;
> + case CFI_R13:
> + orc->sp_reg = ORC_REG_R13;
> + break;
> + case CFI_DI:
> + orc->sp_reg = ORC_REG_DI;
> + break;
> + case CFI_DX:
> + orc->sp_reg = ORC_REG_DX;
> + break;
> + default:
> + WARN_FUNC("unknown CFA base reg %d",
> + insn->sec, insn->offset, cfa->base);
> + return -1;
> + }
>
> - if (!insn->sec->text)
> - continue;
> + switch(bp->base) {
> + case CFI_UNDEFINED:
> + orc->bp_reg = ORC_REG_UNDEFINED;
> + break;
> + case CFI_CFA:
> + orc->bp_reg = ORC_REG_PREV_SP;
> + break;
> + case CFI_BP:
> + orc->bp_reg = ORC_REG_BP;
> + break;
> + default:
> + WARN_FUNC("unknown BP base reg %d",
> + insn->sec, insn->offset, bp->base);
> + return -1;
> + }
>
> - orc->end = insn->cfi.end;
> + orc->sp_offset = cfa->offset;
> + orc->bp_offset = bp->offset;
> + orc->type = insn->cfi.type;
>
> - if (cfa->base == CFI_UNDEFINED) {
> - orc->sp_reg = ORC_REG_UNDEFINED;
> - continue;
> - }
> + return 0;
> +}
>
> - switch (cfa->base) {
> - case CFI_SP:
> - orc->sp_reg = ORC_REG_SP;
> - break;
> - case CFI_SP_INDIRECT:
> - orc->sp_reg = ORC_REG_SP_INDIRECT;
> - break;
> - case CFI_BP:
> - orc->sp_reg = ORC_REG_BP;
> - break;
> - case CFI_BP_INDIRECT:
> - orc->sp_reg = ORC_REG_BP_INDIRECT;
> - break;
> - case CFI_R10:
> - orc->sp_reg = ORC_REG_R10;
> - break;
> - case CFI_R13:
> - orc->sp_reg = ORC_REG_R13;
> - break;
> - case CFI_DI:
> - orc->sp_reg = ORC_REG_DI;
> - break;
> - case CFI_DX:
> - orc->sp_reg = ORC_REG_DX;
> - break;
> - default:
> - WARN_FUNC("unknown CFA base reg %d",
> - insn->sec, insn->offset, cfa->base);
> - return -1;
> - }
> +int create_orc(struct objtool_file *file)
> +{
> + struct instruction *insn;
>
> - switch(bp->base) {
> - case CFI_UNDEFINED:
> - orc->bp_reg = ORC_REG_UNDEFINED;
> - break;
> - case CFI_CFA:
> - orc->bp_reg = ORC_REG_PREV_SP;
> - break;
> - case CFI_BP:
> - orc->bp_reg = ORC_REG_BP;
> - break;
> - default:
> - WARN_FUNC("unknown BP base reg %d",
> - insn->sec, insn->offset, bp->base);
> - return -1;
> - }
> + for_each_insn(file, insn) {
> + int ret;
> +
> + if (!insn->sec->text)
> + continue;
>
> - orc->sp_offset = cfa->offset;
> - orc->bp_offset = bp->offset;
> - orc->type = insn->cfi.type;
> + ret = create_orc_insn(file, insn);
> + if (ret)
> + return ret;
> }
>
> return 0;
> @@ -166,6 +177,28 @@ int create_orc_sections(struct objtool_f
>
> prev_insn = NULL;
> sec_for_each_insn(file, sec, insn) {
> +
> + if (insn->alt_end) {
> + unsigned int offset, alt_len;
> + struct instruction *unwind;
> +
> + alt_len = insn->alt_end->offset - insn->offset;
> + for (offset = 0; offset < alt_len; offset++) {
> + unwind = find_alt_unwind(file, insn, offset);
> + /* XXX: skipped earlier ! */
> + create_orc_insn(file, unwind);
> + if (!prev_insn ||
> + memcmp(&unwind->orc, &prev_insn->orc,
> + sizeof(struct orc_entry))) {
> + idx++;
> +// WARN_FUNC("ORC @ %d/%d", sec, insn->offset+offset, offset, alt_len);
> + }
> + prev_insn = unwind;
> + }
> +
> + insn = insn->alt_end;
> + }
> +
> if (!prev_insn ||
> memcmp(&insn->orc, &prev_insn->orc,
> sizeof(struct orc_entry))) {
> @@ -203,6 +236,31 @@ int create_orc_sections(struct objtool_f
>
> prev_insn = NULL;
> sec_for_each_insn(file, sec, insn) {
> +
> + if (insn->alt_end) {
> + unsigned int offset, alt_len;
> + struct instruction *unwind;
> +
> + alt_len = insn->alt_end->offset - insn->offset;
> + for (offset = 0; offset < alt_len; offset++) {
> + unwind = find_alt_unwind(file, insn, offset);
> + if (!prev_insn ||
> + memcmp(&unwind->orc, &prev_insn->orc,
> + sizeof(struct orc_entry))) {
> +
> + if (create_orc_entry(file->elf, u_sec, ip_relocsec, idx,
> + insn->sec, insn->offset + offset,
> + &unwind->orc))
> + return -1;
> +
> + idx++;
> + }
> + prev_insn = unwind;
> + }
> +
> + insn = insn->alt_end;
> + }
> +
> if (!prev_insn || memcmp(&insn->orc, &prev_insn->orc,
> sizeof(struct orc_entry))) {
>
> _______________________________________________
> Virtualization mailing list
> Virtualization@...ts.linux-foundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/virtualization
>
Download attachment "OpenPGP_0xB0DE9DD628BF132F.asc" of type "application/pgp-keys" (3092 bytes)
Download attachment "OpenPGP_signature" of type "application/pgp-signature" (496 bytes)
Powered by blists - more mailing lists