lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Date:   Mon, 30 Sep 2019 09:51:21 -0300
From:   Arnaldo Carvalho de Melo <arnaldo.melo@...il.com>
To:     Russell King - ARM Linux admin <linux@...linux.org.uk>
Cc:     linux-arm-kernel@...ts.infradead.org,
        Will Deacon <will@...nel.org>, linux-kernel@...r.kernel.org,
        Peter Zijlstra <peterz@...radead.org>,
        Ingo Molnar <mingo@...hat.com>,
        Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
        Jiri Olsa <jolsa@...hat.com>,
        Namhyung Kim <namhyung@...nel.org>
Subject: Re: perf annotate fails with "Invalid -1 error code"

Em Mon, Sep 30, 2019 at 01:15:37PM +0100, Russell King - ARM Linux admin escreveu:
> Hi,
> 
> While using perf report on aarch64, I try to annotate
> __arch_copy_to_user, and it fails with:
> 
> Error: Couldn't annotate __arch_copy_to_user: Internal error: Invalid -1 error code
> 
> which is not very helpful.  Looking at the code, the error message
> appended to the "Couldn't annotate ...:" comes from
> symbol__strerror_disassemble(), which expects either an errno or
> one of the special SYMBOL_ANNOTATE_ERRNO_* constants in its 3rd
> argument.
> 
> symbol__tui_annotate() passes the 3rd argument as the return value
> from symbol__annotate2().  symbol__annotate2() returns either zero or
> -1.  This calls symbol__annotate(), which returns -1 (which would
> generally conflict with -EPERM), -errno, the return value of
> arch->init, or the return value of symbol__disassemble().
> 
> This seems to be something of a mess - different places seem to use
> different approaches to handling errors, and some don't bother
> propagating the error code up.
> 
> The upshot is, the error message reported when trying to annotate
> gives the user no clue why perf is unable to annotate, and you have
> to resort to stracing perf in an attempt to find out - which also
> isn't useful:
> 
> 3431  pselect6(1, [0], NULL, NULL, NULL, NULL) = 1 (in [0])
> 3431  pselect6(5, [4], NULL, NULL, {tv_sec=10, tv_nsec=0}, NULL) = 1 (in [4], left {tv_sec=9, tv_nsec=999995480})
> 3431  read(4, "\r", 1)                  = 1
> 3431  uname({sysname="Linux", nodename="cex7", ...}) = 0
> 3431  openat(AT_FDCWD, "/usr/lib/aarch64-linux-gnu/gconv/gconv-modules.cache", O_RDONLY) = 26
> 3431  fstat(26, {st_mode=S_IFREG|0644, st_size=26404, ...}) = 0
> 3431  mmap(NULL, 26404, PROT_READ, MAP_SHARED, 26, 0) = 0x7fa1fd9000
> 3431  close(26)                         = 0
> 3431  futex(0x7fa172b830, FUTEX_WAKE_PRIVATE, 2147483647) = 0
> 3431  write(1, "\33[10;21H\33[37m\33[40m\342\224\214\342\224\200Error:\342\224"..., 522) = 522
> 3431  pselect6(1, [0], NULL, NULL, NULL, NULL <detached ...>
> 
> Which makes it rather difficult to know what is actually failing...
> so the only way is to resort to gdb.
> 
> It seems that dso__disassemble_filename() is returning -10000, which
> seems to be SYMBOL_ANNOTATE_ERRNO__NO_VMLINUX and as described above,
> this is lost due to the lack of error code propagation.
> 
> Specifically, the failing statement is:
> 
>         if (dso->symtab_type == DSO_BINARY_TYPE__KALLSYMS &&
>             !dso__is_kcore(dso))
>                 return SYMBOL_ANNOTATE_ERRNO__NO_VMLINUX;
> 
> Looking at "dso" shows:
> 
> 	kernel = DSO_TYPE_KERNEL,
> 	symtab_type = DSO_BINARY_TYPE__KALLSYMS,
> 	binary_type = DSO_BINARY_TYPE__KALLSYMS,
> 	load_errno = DSO_LOAD_ERRNO__MISMATCHING_BUILDID,
> 	name = 0x555588781c "/boot/vmlinux",
> 
> and we finally get to the reason - it's using the wrong vmlinux.
> So, obvious solution (once the failure reason is known), give it
> the correct vmlinux.
> 
> Should it really be necessary to resort to gdb to discover why perf
> is failing?
> 
> It looks like this was introduced by ecda45bd6cfe ("perf annotate:
> Introduce symbol__annotate2 method") which did this:
> 
> -       err = symbol__annotate(sym, map, evsel, 0, &browser.arch);
> +       err = symbol__annotate2(sym, map, evsel, &annotate_browser__opts, &browser.arch);
> 
> +int symbol__annotate2(struct symbol *sym, struct map *map, struct perf_evsel *evsel,
> +                     struct annotation_options *options, struct arch **parch)
> +{
> ...
> +       err = symbol__annotate(sym, map, evsel, 0, parch);
> +       if (err)
> +               goto out_free_offsets;
> ...
> +out_free_offsets:
> +       zfree(&notes->offsets);
> +       return -1;
> +}
> 
> introducing this problem by the "return -1" disease.
> 
> So, given that this function's return value is used as an error code
> in the way I've described above, should this function also be fixed
> to return ENOMEM when the zalloc fails, as well as propagating the
> return value from symbol__annotate() ?
> 
> I haven't yet checked to see if there's other places that call this
> function but now rely on it returning -1... but I'd like to lodge a
> plea that perf gets some consistency wrt how errors are passed and
> propagated from one function to another.

Note taken, will address the points raised here.

- Arnaldo

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ