Date:   Wed, 6 Jan 2021 17:03:48 -0800
From:   Linus Torvalds
To:     Steven Rostedt,
        Willem de Bruijn,
        Jakub Kicinski,
        David Miller,
        Jonathan Lemon
Cc:     Thomas Gleixner,
        LKML,
        "the arch/x86 maintainers",
        Christoph Hellwig,
        Matthew Wilcox,
        Daniel Vetter,
        Andrew Morton,
        Linux-MM,
        Peter Zijlstra,
        Ingo Molnar,
        Juri Lelli,
        Vincent Guittot,
        Dietmar Eggemann,
        Ben Segall, Mel Gorman,
        Daniel Bristot de Oliveira,
        Netdev
Subject: Re: [BUG] from x86: Support kmap_local() forced debugging

On Wed, Jan 6, 2021 at 3:01 PM Steven Rostedt wrote:
> I triggered the following crash on x86_32 by simply doing a:
> (ssh'ing into the box)
>   # head -100 /tmp/output-file
> Where the /tmp/output-file was the output of a trace-cmd report.
> Even after rebooting and not running the tracing code, simply doing the
> head command still crashed.

The code decodes to

   0:   3b 5d e8                cmp    -0x18(%ebp),%ebx
   3:   0f 47 5d e8             cmova  -0x18(%ebp),%ebx
   7:   c7 45 e0 00 00 00 00    movl   $0x0,-0x20(%ebp)
   e:   8b 7d e0                mov    -0x20(%ebp),%edi
  11:   39 7d e8                cmp    %edi,-0x18(%ebp)
  14:   76 3a                   jbe    0x50
  16:   8b 45 d4                mov    -0x2c(%ebp),%eax
  19:   e8 a4 e4 ff ff          call   0xffffe4c2
  1e:   8b 55 e4                mov    -0x1c(%ebp),%edx
  21:   03 55 e0                add    -0x20(%ebp),%edx
  24:   89 d9                   mov    %ebx,%ecx
  26:   01 c6                   add    %eax,%esi
  28:   89 d7                   mov    %edx,%edi
  2a:*  f3 a4                   rep movsb %ds:(%esi),%es:(%edi)         <-- trapping instruction
  2c:   e8 c9 e4 ff ff          call   0xffffe4fa
  31:   01 5d e0                add    %ebx,-0x20(%ebp)
  34:   8b 5d e8                mov    -0x18(%ebp),%ebx
  37:   b8 00 10 00 00          mov    $0x1000,%eax
  3c:   2b 5d e0                sub    -0x20(%ebp),%ebx

and while it would be good to see the output of
scripts/, I strongly suspect that the above is

                                vaddr = kmap_atomic(p);
                                memcpy(to + copied, vaddr + p_off, p_len);

(although I wonder how/why the heck you've enabled
CC_OPTIMIZE_FOR_SIZE=y, which is what causes "memcpy()" to be done as
that "rep movsb". I thought we disabled it because it's so bad on most
CPUs).

So that first "call" instruction is the kmap_atomic(), the "rep movs"
is the memcpy(), and the "call" instruction immediately after is the
kunmap_atomic().

Anyway, you can see vaddr in register state:

        EAX: fff57000

so we've kmapped that one page at fff57000, but we're accessing past
it into the next page:

> BUG: unable to handle page fault for address: fff58000

with the current source address being (ESI: fff58000) and we still
have 248 bytes to go (ECX: 000000f8) even though we've already
overflowed into the next page.

You can see the original count still (EBX: 000005a8), so it really
looks like that skb_frag_foreach_page() logic

                        skb_frag_foreach_page(f,
                                              skb_frag_off(f) + offset - start,
                                              copy, p, p_off, p_len, copied) {
                                vaddr = kmap_atomic(p);
                                memcpy(to + copied, vaddr + p_off, p_len);

must be wrong, and doesn't handle the "each page" part properly. It
must have started in the middle of the page, and p_len (that 0x5a8)
was wrong.
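
The register values quoted above can be cross-checked with a little arithmetic. The following is a userspace sketch, not kernel code: the helper names are hypothetical and 4 KiB pages are assumed. It treats EAX as the mapped page base, ESI as the current source pointer, ECX as the bytes still to copy, and EBX as the original count.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE 0x1000u /* assumption: 4 KiB pages, as on x86_32 */

/*
 * Hypothetical helper: offset, relative to the mapped page base, at which
 * the copy would end. "rep movsb" has read up to esi and still has ecx
 * bytes left to copy.
 */
static uint32_t copy_end_offset(uint32_t vaddr, uint32_t esi, uint32_t ecx)
{
    assert(esi >= vaddr);
    return (esi + ecx) - vaddr;
}

/*
 * Hypothetical helper: offset inside the page at which the copy started,
 * given where it ends and the original byte count.
 */
static uint32_t copy_start_offset(uint32_t end_off, uint32_t len)
{
    assert(end_off >= len);
    return end_off - len;
}
```

Plugging in the values from the dump (page base fff57000, ESI fff58000, ECX f8, EBX 5a8), the copy would end at offset 0x10f8 from the page base and would have started at offset 0xb50, so it runs 0xf8 bytes past the end of the mapped page.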

IOW, it really looks like p_off + p_len had the value 0x10f8, which is
larger than one page. And looking at the code, in
skb_frag_foreach_page(), I see:

             p_off = (f_off) & (PAGE_SIZE - 1),                         \
             p_len = skb_frag_must_loop(p) ?                            \
             min_t(u32, f_len, PAGE_SIZE - p_off) : f_len,              \

where that "min_t(u32, f_len, PAGE_SIZE - p_off)" looks correct, but
then presumably skb_frag_must_loop() must be wrong.
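
To make the effect of the "must loop" decision concrete, here is a small userspace model of the first-iteration p_off / p_len computation quoted above. It is only a sketch: `first_chunk_len()` and `min_u32()` are hypothetical names, the boolean parameter stands in for the result of skb_frag_must_loop(), and 4 KiB pages are assumed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SIZE 0x1000u /* assumption: 4 KiB pages */

static uint32_t min_u32(uint32_t a, uint32_t b)
{
    return a < b ? a : b;
}

/*
 * Hypothetical model of the quoted macro fragment: compute the length of
 * the first chunk to copy from a fragment that starts at f_off and is
 * f_len bytes long. must_loop stands in for skb_frag_must_loop(p).
 */
static uint32_t first_chunk_len(uint32_t f_off, uint32_t f_len, bool must_loop)
{
    uint32_t p_off = f_off & (PAGE_SIZE - 1);

    assert(p_off < PAGE_SIZE);

    /* Clamp to the end of the page only when per-page looping is needed. */
    return must_loop ? min_u32(f_len, PAGE_SIZE - p_off) : f_len;
}
```

With an offset of 0xb50 and a length of 0x5a8 (values consistent with the register dump, used here as an assumption), the clamped case yields 0x4b0 bytes, which stops exactly at the page boundary. The unclamped case yields the full 0x5a8 bytes, which ends at 0x10f8 and crosses into the next page.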

Oh, and when I look at that, I see

    static inline bool skb_frag_must_loop(struct page *p)
    {
    #if defined(CONFIG_HIGHMEM)
            if (PageHighMem(p))
                    return true;
    #endif
            return false;
    }

and that is no longer true. With the kmap debugging, even non-highmem
pages need that "do one page at a time" code, because even non-highmem
pages get remapped by kmap().
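
The changed condition can be modelled in userspace as follows. This is a sketch of the idea only, not the attached patch and not kernel code: `struct model_page`, `must_loop_model()` and the `remaps_lowmem` flag are hypothetical stand-ins for the page's highmem status and for "kmap debugging forces every page through a temporary mapping".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the one page property that matters here. */
struct model_page {
    bool highmem;
};

/*
 * Hypothetical model: per-page looping is needed when the page is highmem,
 * or when the mapping layer also remaps non-highmem pages one at a time
 * (remaps_lowmem), as described for forced kmap debugging.
 */
static bool must_loop_model(const struct model_page *p, bool remaps_lowmem)
{
    assert(p);
    return remaps_lowmem || p->highmem;
}
```

In this model a non-highmem page only takes the single-copy shortcut when lowmem pages are not being remapped; once they are, every page has to be handled one page at a time.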

IOW, I think the patch to fix this might be something like the attached.

I wonder whether there is other code that "knows" that kmap() only
affects PageHighMem() pages, which is no longer true.

Looking at some other code, skb_gro_reset_offset() looks suspiciously
like it also thinks highmem pages are special.

Adding the networking people involved in this area to the cc too.


Download attachment "patch" of type "application/octet-stream" (544 bytes)
