linux-kernel - Re: [BUG][ext2] XIP does not work on ext2

Message-ID: <CAOvWMLbtBa4XSOJytZLR-=YMkF=RUMHZTYT+Gt+ZBKpHTYyw0A@mail.gmail.com>
Date:	Fri, 8 Nov 2013 16:28:15 -0800
From:	Andiry Xu <andiry@...il.com>
To:	Jan Kara <jack@...e.cz>
Cc:	Wang Shilong <wangsl-fnst@...fujitsu.com>,
	linux-kernel@...r.kernel.org, linux-ext4@...r.kernel.org,
	Andiry Xu <andiry.xu@...il.com>
Subject: Re: [BUG][ext2] XIP does not work on ext2

On Thu, Nov 7, 2013 at 2:45 PM, Andiry Xu <andiry@...il.com> wrote:
> On Thu, Nov 7, 2013 at 2:20 PM, Jan Kara <jack@...e.cz> wrote:
>> On Thu 07-11-13 13:50:09, Andiry Xu wrote:
>>> On Thu, Nov 7, 2013 at 1:07 PM, Jan Kara <jack@...e.cz> wrote:
>>> > On Thu 07-11-13 12:14:13, Andiry Xu wrote:
>>> >> On Wed, Nov 6, 2013 at 1:18 PM, Jan Kara <jack@...e.cz> wrote:
>>> >> > On Tue 05-11-13 17:28:35, Andiry Xu wrote:
>>> >> >> >> Do you know the reason why write() outperforms mmap() in some cases? I
>>> >> >> >> know it's not related to the thread but I'd really appreciate it if you can
>>> >> >> >> answer my question.
>>> >> >> >   Well, I'm not completely sure. mmap()ed memory always works on page-by-page
>>> >> >> > basis - you first access the page, it gets faulted in and you can further
>>> >> >> > access it. So for small (sub page size) accesses this is a win because you
>>> >> >> > don't have an overhead of syscall and fs write path. For accesses larger
>>> >> >> > than page size the overhead of syscall and some initial checks is well
>>> >> >> > hidden by other things. I guess write() ends up being more efficient
>>> >> >> > because write path taken for each page is somewhat lighter than full page
>>> >> >> > fault. But you'd need to look into perf data to get some hard numbers on
>>> >> >> > where the time is spent.
>>> >> >> >
>>> >> >>
>>> >> >> Thanks for the reply. However I have filled up the whole RAM disk
>>> >> >> before doing the test, i.e. asked the brd driver to allocate all the
>>> >> >> pages initially.
>>> >> >   Well, pages in ramdisk are always present, that's not an issue. But you
>>> >> > will get a page fault to map a particular physical page in process'
>>> >> > virtual address space when you first access that virtual address in the
>>> >> > mapping from the process. The cost of setting up this virtual->physical
>>> >> > mapping is what I'm talking about.
>>> >> >
>>> >>
>>> >> Yes, you are right, there are page faults observed with perf. I
>>> >> misunderstood page fault as copying pages between backing store and
>>> >> physical memory.
>>> >>
>>> >> > If you had a process which first mmaps the file and writes to all pages in
>>> >> > the mapping and *then* measure the cost of another round of writing to the
>>> >> > mapping, I would expect you should see speeds close to those of memory bus.
>>> >> >
>>> >>
>>> >> I've tried this as well. mmap() performance improves but still not as
>>> >> good as write().
>>> >> I used the perf report to compare write() and mmap() applications. For
>>> >> write() version, top of perf report shows as:
>>> >> 33.33%  __copy_user_nocache
>>> >> 4.72%    ext2_get_blocks
>>> >> 4.42%    mutex_unlock
>>> >> 3.59%    __find_get_block
>>> >>
>>> >> which looks reasonable.
>>> >>
>>> >> However, for mmap() version, the perf report looks strange:
>>> >> 94.98% libc-2.15.so       [.] 0x000000000014698d
>>> >> 2.25%   page_fault
>>> >> 0.18%   handle_mm_fault
>>> >>
>>> >> I don't know what the first item is but it took the majority of cycles.
>>> >   The first item means that it's some userspace code in libc. My guess
>>> > would be that it's libc's memcpy() function (or whatever you use to write
>>> > to mmap). How do you access the mmap?
>>> >
>>>
>>> Like this:
>>>
>>> fd = open(file_name, O_CREAT | O_RDWR | O_DIRECT, 0755);
>>> dest = (char *)mmap(NULL, FILE_SIZE, PROT_WRITE, MAP_SHARED, fd, 0);
>>> for (i = 0; i < count; i++)
>>> {
>>>        memcpy(dest, src, request_size);
>>>        dest += request_size;
>>> }
>>   OK, maybe libc memcpy isn't very well optimized for your CPU? Not sure how
>> to tune that though...
>>
>
> Hmm, I will try some different kinds of memcpy to see if there is a
> difference. Just want to make sure I do not make some stupid mistakes
> before trying that.
> Thanks a lot for your help!
>

Your advice does make a difference. I used an optimized version of memcpy
and it does improve the mmap application's performance: on a ramdisk
with ext2 XIP, the mmap() version now achieves 11GB/s of bandwidth,
compared to 7GB/s for the POSIX write() version.
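
For anyone who wants to reproduce the comparison, here is a minimal sketch
of how the two paths could be timed. It is not the exact test program: the
helper names (now_sec, time_mmap_copy, time_write_copy) are made up for
illustration, O_DIRECT is left out so that it runs on any filesystem, and
the file is sized with ftruncate() before mapping. The mmap variant touches
every page once before the timed pass, as Jan suggested, so that page
faults stay out of the measurement.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <time.h>
#include <unistd.h>

/* Monotonic wall-clock time in seconds. */
static double now_sec(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

/*
 * Copy `count` requests of `request_size` bytes from `src` into a shared
 * mapping of `path`. Returns the seconds spent in the timed pass, or -1.0
 * on error. Hypothetical helper, not taken from the original test.
 */
static double time_mmap_copy(const char *path, const char *src,
                             size_t request_size, size_t count)
{
    assert(request_size > 0 && count > 0);
    size_t file_size = request_size * count;

    int fd = open(path, O_CREAT | O_RDWR, 0644);
    if (fd < 0)
        return -1.0;
    if (ftruncate(fd, (off_t)file_size) != 0) {
        close(fd);
        return -1.0;
    }

    char *map = mmap(NULL, file_size, PROT_READ | PROT_WRITE,
                     MAP_SHARED, fd, 0);
    if (map == MAP_FAILED) {
        close(fd);
        return -1.0;
    }

    /* Warm-up pass: fault in every page before the clock starts. */
    memset(map, 0, file_size);

    double start = now_sec();
    char *dest = map;
    for (size_t i = 0; i < count; i++) {
        memcpy(dest, src, request_size);
        dest += request_size;
    }
    double elapsed = now_sec() - start;

    munmap(map, file_size);
    close(fd);
    return elapsed;
}

/* Same amount of data through write(); returns seconds or -1.0 on error. */
static double time_write_copy(const char *path, const char *src,
                              size_t request_size, size_t count)
{
    assert(request_size > 0 && count > 0);

    int fd = open(path, O_CREAT | O_RDWR, 0644);
    if (fd < 0)
        return -1.0;

    double start = now_sec();
    for (size_t i = 0; i < count; i++) {
        size_t done = 0;
        while (done < request_size) {   /* handle short writes */
            ssize_t n = write(fd, src + done, request_size - done);
            if (n <= 0) {
                close(fd);
                return -1.0;
            }
            done += (size_t)n;
        }
    }
    double elapsed = now_sec() - start;

    close(fd);
    return elapsed;
}
```

Bandwidth for either path is request_size * count divided by the returned
time. To compare different memcpy implementations, only the memcpy() call
in the timed loop needs to be swapped.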

Now I wonder whether the libc maintainers plan to update memcpy().

Thanks,
Andiry