linux-kernel - Re: [v6 PATCH 12/21] x86/insn: Support both signed 32-bit and 64-bit effective addresses

Date:   Wed, 26 Apr 2017 20:33:46 -0700
From:   Ricardo Neri <ricardo.neri-calderon@...ux.intel.com>
To:     Borislav Petkov <bp@...e.de>
Cc:     Ingo Molnar <mingo@...hat.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        "H. Peter Anvin" <hpa@...or.com>,
        Andy Lutomirski <luto@...nel.org>,
        Peter Zijlstra <peterz@...radead.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Brian Gerst <brgerst@...il.com>,
        Chris Metcalf <cmetcalf@...lanox.com>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        Paolo Bonzini <pbonzini@...hat.com>,
        Masami Hiramatsu <mhiramat@...nel.org>,
        Huang Rui <ray.huang@....com>, Jiri Slaby <jslaby@...e.cz>,
        Jonathan Corbet <corbet@....net>,
        "Michael S. Tsirkin" <mst@...hat.com>,
        Paul Gortmaker <paul.gortmaker@...driver.com>,
        Vlastimil Babka <vbabka@...e.cz>,
        Chen Yucong <slaoub@...il.com>,
        Alexandre Julliard <julliard@...ehq.org>,
        Stas Sergeev <stsp@...t.ru>, Fenghua Yu <fenghua.yu@...el.com>,
        "Ravi V. Shankar" <ravi.v.shankar@...el.com>,
        Shuah Khan <shuah@...nel.org>, linux-kernel@...r.kernel.org,
        x86@...nel.org, linux-msdos@...r.kernel.org, wine-devel@...ehq.org,
        Adam Buchbinder <adam.buchbinder@...il.com>,
        Colin Ian King <colin.king@...onical.com>,
        Lorenzo Stoakes <lstoakes@...il.com>,
        Qiaowei Ren <qiaowei.ren@...el.com>,
        Arnaldo Carvalho de Melo <acme@...hat.com>,
        Adrian Hunter <adrian.hunter@...el.com>,
        Kees Cook <keescook@...omium.org>,
        Thomas Garnier <thgarnie@...gle.com>,
        Dmitry Vyukov <dvyukov@...gle.com>
Subject: Re: [v6 PATCH 12/21] x86/insn: Support both signed 32-bit and
 64-bit effective addresses

On Tue, 2017-04-25 at 15:51 +0200, Borislav Petkov wrote:
> On Tue, Mar 07, 2017 at 04:32:45PM -0800, Ricardo Neri wrote:
> > The 32-bit and 64-bit address encodings are identical. This means that we
> > can use the same function in both cases. In order to reuse the function for
> > 32-bit address encodings, we must sign-extend our 32-bit signed operands to
> > 64-bit signed variables (only for 64-bit builds). To decide on whether sign
> > extension is needed, we rely on the address size as given by the
> > instruction structure.
> > 
> > Lastly, before computing the linear address, we must truncate our signed
> > 64-bit effective address if the address size is 32-bit.
> > 
> > Cc: Dave Hansen <dave.hansen@...ux.intel.com>
> > Cc: Adam Buchbinder <adam.buchbinder@...il.com>
> > Cc: Colin Ian King <colin.king@...onical.com>
> > Cc: Lorenzo Stoakes <lstoakes@...il.com>
> > Cc: Qiaowei Ren <qiaowei.ren@...el.com>
> > Cc: Arnaldo Carvalho de Melo <acme@...hat.com>
> > Cc: Masami Hiramatsu <mhiramat@...nel.org>
> > Cc: Adrian Hunter <adrian.hunter@...el.com>
> > Cc: Kees Cook <keescook@...omium.org>
> > Cc: Thomas Garnier <thgarnie@...gle.com>
> > Cc: Peter Zijlstra <peterz@...radead.org>
> > Cc: Borislav Petkov <bp@...e.de>
> > Cc: Dmitry Vyukov <dvyukov@...gle.com>
> > Cc: Ravi V. Shankar <ravi.v.shankar@...el.com>
> > Cc: x86@...nel.org
> > Signed-off-by: Ricardo Neri <ricardo.neri-calderon@...ux.intel.com>
> > ---
> >  arch/x86/lib/insn-eval.c | 44 ++++++++++++++++++++++++++++++++------------
> >  1 file changed, 32 insertions(+), 12 deletions(-)
> > 
> > diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
> > index edb360f..a9a1704 100644
> > --- a/arch/x86/lib/insn-eval.c
> > +++ b/arch/x86/lib/insn-eval.c
> > @@ -559,6 +559,15 @@ int insn_get_reg_offset_sib_index(struct insn *insn, struct pt_regs *regs)
> >  	return get_reg_offset(insn, regs, REG_TYPE_INDEX);
> >  }
> >  
> > +static inline long __to_signed_long(unsigned long val, int long_bytes)
> > +{
> > +#ifdef CONFIG_X86_64
> > +	return long_bytes == 4 ? (long)((int)((val) & 0xffffffff)) : (long)val;
> 
> I don't think this always works as expected:
> 
> ---
> #include <stdio.h>
> 
> typedef unsigned int u32;
> typedef unsigned long u64;
> 
> int main()
> {
>         u64 v = 0x1ffffffff;
> 
>         printf("v: %ld, 0x%lx, %ld\n", v, v, (long)((int)((v) & 0xffffffff)));
> 
>         return 0;
> }
> --
> ...
> 
> v: 8589934591, 0x1ffffffff, -1
> 
> Now, this should not happen on 32-bit because unsigned long is 32-bit
> there but can that happen on 64-bit?

This is the reason I check the value of long_bytes. If long_bytes is not
4, the only other possible value is 8 (perhaps I need to return an error
when it is neither of these), and in that case the cast is simply
(long)val. I modified your test program with:

printf("v: %ld, 0x%lx, %ld, %ld\n", v, v, (long)((int)((v) & 0xffffffff)), (long)v);

and I get:

v: 8589934591, 0x1ffffffff, -1, 8589934591.

Am I missing something?
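
To make the two cases concrete, here is a small user-space sketch of the
conversion. It is not the patch code: the name to_signed_64() and the
fixed-width types are assumptions chosen so that the behaviour does not
depend on the width of long on the build machine.

```c
#include <stdint.h>

/*
 * User-space sketch only; to_signed_64() is a hypothetical name.
 * int64_t/uint64_t stand in for long/unsigned long on x86_64.
 */
static int64_t to_signed_64(uint64_t val, int addr_bytes)
{
	if (addr_bytes == 4) {
		/*
		 * Keep the low 32 bits and sign-extend them to 64 bits.
		 * The uint32_t -> int32_t conversion relies on the usual
		 * two's complement behaviour of gcc and clang.
		 */
		return (int64_t)(int32_t)(uint32_t)val;
	}

	/* 8-byte addresses: reinterpret the full value as signed. */
	return (int64_t)val;
}
```

With val = 0x1ffffffff, an address size of 4 gives -1 and an address
size of 8 gives 8589934591, matching the output above.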

> 
> > +#else
> > +	return (long)val;
> > +#endif
> > +}
> > +
> >  /*
> >   * return the address being referenced be instruction
> >   * for rm=3 returning the content of the rm reg
> > @@ -567,19 +576,21 @@ int insn_get_reg_offset_sib_index(struct insn *insn, struct pt_regs *regs)
> >  void __user *insn_get_addr_ref(struct insn *insn, struct pt_regs *regs)
> >  {
> >  	unsigned long linear_addr, seg_base_addr;
> > -	long eff_addr, base, indx;
> > -	int addr_offset, base_offset, indx_offset;
> > +	long eff_addr, base, indx, tmp;
> > +	int addr_offset, base_offset, indx_offset, addr_bytes;
> >  	insn_byte_t sib;
> >  
> >  	insn_get_modrm(insn);
> >  	insn_get_sib(insn);
> >  	sib = insn->sib.value;
> > +	addr_bytes = insn->addr_bytes;
> >  
> >  	if (X86_MODRM_MOD(insn->modrm.value) == 3) {
> >  		addr_offset = get_reg_offset(insn, regs, REG_TYPE_RM);
> >  		if (addr_offset < 0)
> >  			goto out_err;
> > -		eff_addr = regs_get_register(regs, addr_offset);
> > +		tmp = regs_get_register(regs, addr_offset);
> > +		eff_addr = __to_signed_long(tmp, addr_bytes);
> 
> This repeats throughout the function so it begs to be a separate:
> 
> 	get_mem_addr()
> 
> or so.

Yes, the same pattern is used in all places except when using register
operands (ModRM.mod == 11b). I will look into putting it in a function.
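
As a rough idea of what such a helper could look like, here is a
user-space sketch. Everything in it is hypothetical: struct fake_regs
and fake_get_register() only stand in for struct pt_regs and
regs_get_register(), and get_reg_signed() is an illustrative name, not
the one the next version of the patch will necessarily use.

```c
#include <stdint.h>

/* Hypothetical stand-in for struct pt_regs: a flat register array. */
struct fake_regs {
	uint64_t r[16];
};

/* Hypothetical stand-in for regs_get_register(); takes an index. */
static uint64_t fake_get_register(const struct fake_regs *regs, int idx)
{
	return regs->r[idx];
}

/*
 * Read a register and convert it to a signed 64-bit value in one step,
 * so callers do not repeat the "read into tmp, then convert" pattern.
 */
static int64_t get_reg_signed(const struct fake_regs *regs, int idx,
			      int addr_bytes)
{
	uint64_t tmp = fake_get_register(regs, idx);

	if (addr_bytes == 4)
		/* Sign-extend the low 32 bits (two's complement assumed). */
		return (int64_t)(int32_t)(uint32_t)tmp;

	return (int64_t)tmp;
}
```

Each call site would then become a single assignment, e.g.
base = get_reg_signed(regs, base_idx, addr_bytes).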
> 
> >  		seg_base_addr = insn_get_seg_base(regs, insn, addr_offset,
> >  						  false);
> >  	} else {
> > @@ -591,20 +602,24 @@ void __user *insn_get_addr_ref(struct insn *insn, struct pt_regs *regs)
> >  			 * in the address computation.
> >  			 */
> >  			base_offset = get_reg_offset(insn, regs, REG_TYPE_BASE);
> > -			if (unlikely(base_offset == -EDOM))
> > +			if (unlikely(base_offset == -EDOM)) {
> >  				base = 0;
> > -			else if (unlikely(base_offset < 0))
> > +			} else if (unlikely(base_offset < 0)) {
> >  				goto out_err;
> > -			else
> > -				base = regs_get_register(regs, base_offset);
> > +			} else {
> > +				tmp = regs_get_register(regs, base_offset);
> > +				base = __to_signed_long(tmp, addr_bytes);
> > +			}
> >  
> >  			indx_offset = get_reg_offset(insn, regs, REG_TYPE_INDEX);
> > -			if (unlikely(indx_offset == -EDOM))
> > +			if (unlikely(indx_offset == -EDOM)) {
> >  				indx = 0;
> > -			else if (unlikely(indx_offset < 0))
> > +			} else if (unlikely(indx_offset < 0)) {
> >  				goto out_err;
> > -			else
> > -				indx = regs_get_register(regs, indx_offset);
> > +			} else {
> > +				tmp = regs_get_register(regs, indx_offset);
> > +				indx = __to_signed_long(tmp, addr_bytes);
> > +			}
> >  
> >  			eff_addr = base + indx * (1 << X86_SIB_SCALE(sib));
> >  			seg_base_addr = insn_get_seg_base(regs, insn,
> > @@ -625,13 +640,18 @@ void __user *insn_get_addr_ref(struct insn *insn, struct pt_regs *regs)
> >  			} else if (addr_offset < 0) {
> >  				goto out_err;
> >  			} else {
> > -				eff_addr = regs_get_register(regs, addr_offset);
> > +				tmp = regs_get_register(regs, addr_offset);
> > +				eff_addr = __to_signed_long(tmp, addr_bytes);
> >  			}
> >  			seg_base_addr = insn_get_seg_base(regs, insn,
> >  							  addr_offset, false);
> >  		}
> >  		eff_addr += insn->displacement.value;
> >  	}
> > +	/* truncate to 4 bytes for 32-bit effective addresses */
> > +	if (addr_bytes == 4)
> > +		eff_addr &= 0xffffffff;
> 
> Why again?

eff_addr is a long variable, which is 64 bits wide on x86_64. However, in
32-bit segments the effective address is 32 bits wide. Thus, I discard
the 32 most significant bits.
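
A small user-space sketch of the effect, assuming two's complement
arithmetic. The name truncate_eff_addr() is made up for illustration and
is not part of the patch.

```c
#include <stdint.h>

/*
 * Sketch only: reduce a signed 64-bit effective address to the width
 * given by the address size before it is added to the segment base.
 */
static uint64_t truncate_eff_addr(int64_t eff_addr, int addr_bytes)
{
	uint64_t addr = (uint64_t)eff_addr;

	/* For 32-bit addresses, keep only the low 32 bits. */
	if (addr_bytes == 4)
		addr &= 0xffffffffULL;

	return addr;
}
```

For instance, a sign-extended base of -16 plus a displacement of 8 gives
eff_addr = -8, i.e. 0xfffffffffffffff8 as a 64-bit pattern. With a 4-byte
address size the result is 0xfffffff8, which is the address a 32-bit
computation would have produced.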

Thanks and BR,
Ricardo