Message-ID: <136370.38015.qm@web114316.mail.gq1.yahoo.com>
Date: Fri, 16 Apr 2010 10:02:30 -0700 (PDT)
From: Rick Sherm <rick.sherm@...oo.com>
To: linux-kernel@...r.kernel.org, axboe@...nel.dk
Subject: Trying to measure performance with splice/vmsplice ....
Hello,
I'm trying to measure the performance gain from using splice. For now I'm copying a 1G file using splice. (In the real scenario, the driver will DMA the data into a buffer which is mmap'd. The app will then write the newly-DMA'd data to the disk while some other thread is crunching the same buffer. The buffer is guaranteed not to be modified. To avoid copying, I was thinking of: splice IN mmap'd-buffer -> pipe, and splice OUT pipe -> file; a rough sketch follows below.)
PS - I've inlined some sloppy code that I cooked up.
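For reference, here is a rough sketch of the path I have in mind for the real scenario (not what I timed below). It uses vmsplice() for the user-buffer -> pipe leg, since splice() needs a pipe on one side of the call. "buf" stands in for the mmap'd DMA buffer, fd_to/to_write are as in the test code below, and error handling and partial transfers are glossed over:

/* Sketch: user buffer -> pipe -> file.
   Needs #define _GNU_SOURCE plus <fcntl.h>, <sys/uio.h>, <unistd.h>. */
int pfd[2];
struct iovec iov;
loff_t out_off = 0;
ssize_t in, out;

pipe(pfd);
iov.iov_base = buf;        /* the DMA-filled, mmap'd buffer */
iov.iov_len = 64 * 1024;   /* one pipe-full per iteration */

while (to_write) {
        /* hand the user pages over to the pipe */
        in = vmsplice(pfd[1], &iov, 1, 0);
        if (in < 0)
                goto error;
        /* move the pipe contents to the destination file */
        out = splice(pfd[0], NULL, fd_to, &out_off, in,
                     SPLICE_F_MORE | SPLICE_F_MOVE);
        if (out < 0)
                goto error;
        iov.iov_base = (char *)iov.iov_base + out;
        to_write -= out;
}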
Case 1) read from input_file and write to dest_file (both opened with O_DIRECT so no buffer cache should be involved, but that doesn't work -- see Q3). We can talk about the buffer cache later.
(csh#)time ./splice_to_splice
0.004u 1.451s 0:02.16 67.1% 0+0k 2097152+2097152io 0pf+0w
#define KILO_BYTE (1024)
#define PIPE_SIZE (64 * KILO_BYTE)

int filedes[2];
loff_t from_offset = 0, to_offset = 0;
ssize_t ret;
size_t to_write;

pipe(filedes);

fd_from = open(filename_from, (O_RDWR | O_LARGEFILE | O_DIRECT), 0777);
fd_to = open(filename_to, (O_WRONLY | O_CREAT | O_LARGEFILE | O_DIRECT), 0777);

/* 1G file == 2048 * 512K */
to_write = 2048 * 512 * KILO_BYTE;

while (to_write) {
        /* source file -> pipe */
        ret = splice(fd_from, &from_offset, filedes[1], NULL, PIPE_SIZE,
                     SPLICE_F_MORE | SPLICE_F_MOVE);
        if (ret < 0) {
                printf("Error: LINE:%d ret:%zd\n", __LINE__, ret);
                goto error;
        } else {
                /* pipe -> destination file; splice out what we actually got */
                ret = splice(filedes[0], NULL, fd_to, &to_offset, ret,
                             SPLICE_F_MORE | SPLICE_F_MOVE);
                if (ret < 0) {
                        printf("Error: LINE:%d ret:%zd\n", __LINE__, ret);
                        goto error;
                }
                to_write -= ret;
        }
}
Case 2) directly reading and writing:
Case 2.1) copy 64K blocks
(csh#)time ./file_to_file 64
0.015u 1.066s 0:04.04 26.4% 0+0k 2097152+2097152io 0pf+0w
#define KILO_BYTE (1024)
#define MEGA_BYTE (1024 * (KILO_BYTE))
#define BUFF_SIZE (64 * MEGA_BYTE)

/* O_DIRECT needs a suitably aligned buffer */
posix_memalign((void **)&buff, 4096, BUFF_SIZE);

fd_from = open(filename_from, (O_RDWR | O_LARGEFILE | O_DIRECT), 0777);
fd_to = open(filename_to, (O_WRONLY | O_CREAT | O_LARGEFILE | O_DIRECT), 0777);

/* 1G file == 2048 * 512K blocks */
to_write = 2048 * 512 * KILO_BYTE;
copy_size = cmd_line_input * KILO_BYTE; /* block size taken from the command line */

while (to_write) {
        ret = read(fd_from, buff, copy_size);
        if (ret != copy_size) {
                printf("Error: LINE:%d ret:%d\n", __LINE__, ret);
                goto error;
        } else {
                ret = write(fd_to, buff, copy_size);
                if (ret != copy_size) {
                        printf("Error: LINE:%d ret:%d\n", __LINE__, ret);
                        goto error;
                }
                to_write -= ret;
        }
}
Case 2.2) copy 512K blocks
(csh#)time ./file_to_file 512
0.004u 0.306s 0:01.86 16.1% 0+0k 2097152+2097152io 0pf+0w
Case 2.3) copy 1M blocks
time ./file_to_file 1024
0.000u 0.240s 0:01.88 12.7% 0+0k 2097152+2097152io 0pf+0w
Questions:
Q1) When using splice, why is the CPU consumption greater than with read/write (case 2.1)? What does this mean?
Q2) How do I confirm that memory bandwidth consumption does not spike when using splice in this case? By this I mean the (node) CPU <-> memory path. The DMA-in/DMA-out will happen; you can't escape that, and the IOH bus will be utilized. But I want to keep the CPU(node)-memory path free (well, minimize unnecessary copies).
Q3) When using splice, the data gets cached even though the destination file is opened in O_DIRECT mode. I verified it using vmstat:
r b swpd free buff cache
1 0 0 9358820 116576 2100904
./splice_to_splice
r b swpd free buff cache
2 0 0 7228908 116576 4198164
I see the same caching issue even if I vmsplice buffers (a simple malloc'd iovec, much like the sketch above) to a pipe and then splice the pipe to a file. Speed is still an issue with vmsplice too.
Q4) Also, using splice, you can only transfer 64K of data (PIPE_BUFFERS * PAGE_SIZE) per call, correct? With plain read/write I can go up to a 1MB buffer; beyond that I don't see any gain, but the reduction in system/CPU time is still significant.
I would appreciate any pointers.
thanks
Rick