Message-ID: <2093413.1692377320@warthog.procyon.org.uk>
Date: Fri, 18 Aug 2023 17:48:40 +0100
From: David Howells <dhowells@...hat.com>
To: David Laight <David.Laight@...LAB.COM>
Cc: dhowells@...hat.com,
Linus Torvalds <torvalds@...ux-foundation.org>,
Al Viro <viro@...iv.linux.org.uk>,
Jens Axboe <axboe@...nel.dk>,
"Christoph Hellwig" <hch@...t.de>,
Christian Brauner <christian@...uner.io>,
"Matthew Wilcox" <willy@...radead.org>,
Jeff Layton <jlayton@...nel.org>,
"linux-fsdevel@...r.kernel.org" <linux-fsdevel@...r.kernel.org>,
"linux-block@...r.kernel.org" <linux-block@...r.kernel.org>,
"linux-mm@...ck.org" <linux-mm@...ck.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v3 2/2] iov_iter: Don't deal with iter->copy_mc in memcpy_from_iter_mc()
David Laight <David.Laight@...LAB.COM> wrote:
> > iov_iter_init inc 0x27 -> 0x31 +0xa
>
> Are you hitting the gcc bug that loads the constant from memory?
I'm not sure what that looks like. For your perusal, here's a disassembly of
the use-switch-on-enum variant:
0xffffffff8177726c <+0>: cmp $0x1,%esi
0xffffffff8177726f <+3>: jbe 0xffffffff81777273 <iov_iter_init+7>
0xffffffff81777271 <+5>: ud2
0xffffffff81777273 <+7>: test %esi,%esi
0xffffffff81777275 <+9>: movw $0x1,(%rdi)
0xffffffff8177727a <+14>: setne 0x3(%rdi)
0xffffffff8177727e <+18>: xor %eax,%eax
0xffffffff81777280 <+20>: movb $0x0,0x2(%rdi)
0xffffffff81777284 <+24>: movb $0x1,0x4(%rdi)
0xffffffff81777288 <+28>: mov %rax,0x8(%rdi)
0xffffffff8177728c <+32>: mov %rdx,0x10(%rdi)
0xffffffff81777290 <+36>: mov %r8,0x18(%rdi)
0xffffffff81777294 <+40>: mov %rcx,0x20(%rdi)
0xffffffff81777298 <+44>: jmp 0xffffffff81d728a0 <__x86_return_thunk>
versus the use-bitmap variant:
0xffffffff81777311 <+0>: cmp $0x1,%esi
0xffffffff81777314 <+3>: jbe 0xffffffff81777318 <iov_iter_init+7>
0xffffffff81777316 <+5>: ud2
0xffffffff81777318 <+7>: test %esi,%esi
0xffffffff8177731a <+9>: movb $0x2,(%rdi)
0xffffffff8177731d <+12>: setne 0x1(%rdi)
0xffffffff81777321 <+16>: xor %eax,%eax
0xffffffff81777323 <+18>: mov %rdx,0x10(%rdi)
0xffffffff81777327 <+22>: mov %rax,0x8(%rdi)
0xffffffff8177732b <+26>: mov %r8,0x18(%rdi)
0xffffffff8177732f <+30>: mov %rcx,0x20(%rdi)
0xffffffff81777333 <+34>: jmp 0xffffffff81d72960 <__x86_return_thunk>
It seems that the former stores each byte-sized constant individually, whereas
Linus combined all those fields into a single byte and eliminated one of them.
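
As a rough illustration of why the two listings differ in length, here is a
minimal C sketch of the two layouts. It is an assumption-laden reconstruction
from the store offsets in the disassembly, not the actual kernel code: the
struct names, the flag constant ITER_F_USER_IOVEC and the function names
init_fields() and init_bitmap() are all hypothetical, and the numeric values
are only chosen to echo the immediates seen above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout A: one byte-sized member per property. */
struct iter_fields {
	uint8_t iter_type;
	bool    copy_mc;
	bool    nofault;
	bool    data_source;
	bool    user_backed;
	size_t  iov_offset;
};

/* Hypothetical layout B: type and flags folded into a single byte. */
struct iter_bitmap {
	uint8_t flags;
	bool    data_source;
	size_t  iov_offset;
};

/* Made-up flag value; the real encoding is not given in the mail. */
#define ITER_F_USER_IOVEC 0x02

/* Layout A places its five small members at byte offsets 0 to 4. */
static_assert(offsetof(struct iter_fields, user_backed) == 4,
	      "five one-byte members expected");
/* Layout B needs only offsets 0 and 1 for the same information. */
static_assert(offsetof(struct iter_bitmap, data_source) == 1,
	      "flags byte followed by data_source expected");

/* Layout A: every small member needs its own initialising store. */
void init_fields(struct iter_fields *i, unsigned int direction)
{
	i->iter_type   = 1;	/* hypothetical type value */
	i->copy_mc     = false;
	i->nofault     = false;
	i->data_source = direction != 0;
	i->user_backed = true;
	i->iov_offset  = 0;
}

/* Layout B: a single constant store covers type and flags together. */
void init_bitmap(struct iter_bitmap *i, unsigned int direction)
{
	i->flags       = ITER_F_USER_IOVEC;
	i->data_source = direction != 0;
	i->iov_offset  = 0;
}
```

With separate members the compiler has to emit a store for each of them
(it may merge neighbouring constants, but the run-time data_source byte sits
in the middle of the run), whereas the folded layout needs one constant store
plus the data_source store.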
David