Date:   Fri, 7 Jul 2017 11:16:37 +0200
From:   Jan Kiszka <jan.kiszka@...mens.com>
To:     Leonard Crestez <leonard.crestez@....com>,
        Kieran Bingham <kieran@...uared.org.uk>,
        Andrew Morton <akpm@...ux-foundation.org>
Cc:     linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 2/2] scripts/gdb: lx-dmesg: Use explicit encoding=utf8
 errors=replace

On 2017-06-26 14:52, Leonard Crestez wrote:
> Use errors=replace because it is never desirable for lx-dmesg to fail on
> string decoding errors, not even if the log buffer is corrupt and we show
> incorrect info.
> 
> The kernel will sometimes print utf8, for example the copyright symbol from
> jffs2. In order to make this work specify 'utf8' everywhere because python2
> otherwise defaults to 'ascii'.
> 
> In theory the second errors='replace' is not required because everything
> that can be decoded as utf8 should also be encodable back to utf8. But
> it's better to be extra safe here. It's worth noting that this is
> definitely not true for encoding='ascii': unknown characters are
> replaced with U+FFFD REPLACEMENT CHARACTER and they fail to encode back
> to ascii.
> 
> Signed-off-by: Leonard Crestez <leonard.crestez@....com>
> 
> ---
> Changes since v1:
> * Add encoding='utf8'
> * Only do an explicit encode for python2. On python3 this returns a
> bytes object which formats to b'BLAH' instead.
> * Elaborate commit message explaining what's wrong. The original patch
> was hacked together while debugging something else.
> 
> Link: https://lkml.org/lkml/2017/6/23/405
> Signed-off-by: Leonard Crestez <leonard.crestez@....com>
> ---
>  scripts/gdb/linux/dmesg.py | 13 ++++++++++---
>  1 file changed, 10 insertions(+), 3 deletions(-)
> 
> diff --git a/scripts/gdb/linux/dmesg.py b/scripts/gdb/linux/dmesg.py
> index f5a0303..6d2e09a 100644
> --- a/scripts/gdb/linux/dmesg.py
> +++ b/scripts/gdb/linux/dmesg.py
> @@ -12,6 +12,7 @@
>  #
>  
>  import gdb
> +import sys
>  
>  from linux import utils
>  
> @@ -52,13 +53,19 @@ class LxDmesg(gdb.Command):
>                  continue
>  
>              text_len = utils.read_u16(log_buf[pos + 10:pos + 12])
> -            text = log_buf[pos + 16:pos + 16 + text_len].decode()
> +            text = log_buf[pos + 16:pos + 16 + text_len].decode(
> +                encoding='utf8', errors='replace')
>              time_stamp = utils.read_u64(log_buf[pos:pos + 8])
>  
>              for line in text.splitlines():
> -                gdb.write("[{time:12.6f}] {line}\n".format(
> +                msg = u"[{time:12.6f}] {line}\n".format(
>                      time=time_stamp / 1000000000.0,
> -                    line=line))
> +                    line=line)
> +                # With python2 gdb.write will attempt to convert unicode to
> +                # ascii and might fail so pass an utf8-encoded str instead.
> +                if sys.hexversion < 0x03000000:
> +                    msg = msg.encode(encoding='utf8', errors='replace')
> +                gdb.write(msg)
>  
>              pos += length
>  
> 
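
For reference, the decoding behaviour described in the commit message can
be reproduced outside gdb. This is only a minimal Python 3 sketch, not part
of the patch: the sample bytes are made up for illustration and all
variable names are hypothetical.

```python
# Bytes as they might sit in a log buffer (made-up sample): valid UTF-8
# containing the copyright sign, followed by a stray 0xff byte, which is
# never valid in UTF-8.
raw = b"jffs2: \xc2\xa9 2001\xff"

# A strict utf8 decode raises on the bad byte.
try:
    raw.decode(encoding="utf8")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# errors='replace' substitutes U+FFFD for the bad byte and keeps going.
text = raw.decode(encoding="utf8", errors="replace")

# Whatever was decoded as utf8 can be encoded back to utf8 without error.
roundtrip = text.encode(encoding="utf8")

# With ascii, every non-ASCII byte becomes U+FFFD on decode, and U+FFFD
# itself cannot be encoded back to ascii with strict error handling.
ascii_text = raw.decode(encoding="ascii", errors="replace")
try:
    ascii_text.encode(encoding="ascii")
    ascii_roundtrip_failed = False
except UnicodeEncodeError:
    ascii_roundtrip_failed = True
```

Under these assumptions the utf8 round trip succeeds, while the ascii one
only works if the encode step also uses errors='replace'.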

Acked-by: Jan Kiszka <jan.kiszka@...mens.com>

Andrew, please pick this up.

Jan

-- 
Siemens AG, Corporate Technology, CT RDA ITP SES-DE
Corporate Competence Center Embedded Linux
