Date:	Tue, 14 Jan 2014 15:30:11 +0200
From:	Sergey Meirovich <rathamahata@...il.com>
To:	Christoph Hellwig <hch@...radead.org>, xfs@....sgi.com
Cc:	Jan Kara <jack@...e.cz>, linux-scsi <linux-scsi@...r.kernel.org>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Gluk <git.user@...il.com>
Subject: Re: Terrible performance of sequential O_DIRECT 4k writes in SAN
 environment. ~3 times slower then Solars 10 with the same HBA/Storage.

Hi Christoph,

On 8 January 2014 16:03, Christoph Hellwig <hch@...radead.org> wrote:
> On Tue, Jan 07, 2014 at 08:37:23PM +0200, Sergey Meirovich wrote:
>> Actually my initial report (14.67Mb/sec  3755.41 Requests/sec) was about ext4
>> However I have tried XFS as well. It was a bit slower than ext4 on all
>> occasions.
>
> I wasn't trying to say XFS fixes your problem, but that we could
> implement appending AIO writes in XFS fairly easily.
>
> To verify Jan's theory, can you try to preallocate the file to the full
> size and then run the benchmark by doing a:
>
> # fallocate -l <size> <filename>
>
> and then run it?  If that's indeed the issue I'd be happy to implement
> the "real aio" append support for you as well.
>
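
For reference, the same preallocation can be done from C. This is only a
minimal sketch, assuming posix_fallocate() is an acceptable stand-in for
the fallocate(1) command quoted above. preallocate_file() is a
hypothetical helper name, not something from this thread.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: create `path` if needed and reserve `size` bytes
 * for it, so that later writes inside that range do not extend the file.
 * Returns 0 on success or an errno-style error number on failure.
 */
int preallocate_file(const char *path, off_t size)
{
    int fd = open(path, O_RDWR | O_CREAT, 0644);
    if (fd == -1)
        return errno;

    /* posix_fallocate() returns the error number directly, not via errno */
    int err = posix_fallocate(fd, 0, size);

    if (close(fd) == -1 && err == 0)
        err = errno;
    return err;
}
```

Something like preallocate_file("4k.data", 2147483648LL) would match the
2 GiB file used in the run below, which was precreated with dd instead.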

I resorted to writing a simple wrapper around io_submit() and ran it
against a preallocated file (precisely to avoid the appending AIO
scenario). Random data was used to defeat XtremIO online deduplication,
but the results were still wonderful for 4k sequential AIO writes:

744.77 MB/s   190660.17 Req/sec

Clearly Linux lacks "real aio" append support on any filesystem.
It seems you think it would be relatively easy to implement it
for XFS on Linux? If so, I would really appreciate your effort.

[root@...-poc-gtsxdb3 mnt]# dd if=/dev/zero of=4k.data bs=4096 count=524288
524288+0 records in
524288+0 records out
2147483648 bytes (2.1 GB) copied, 5.75357 s, 373 MB/s
[root@...-poc-gtsxdb3 mnt]# /root/4k
rnd generation (sec.):    195.63
io_submit() accepted 524288 IOs
io_getevents() returned 524288 events
time elapsed (sec.):        2.75
bandwidth (MiB/s):        744.77
IOps:                                     190660.17
[root@...-poc-gtsxdb3 mnt]#
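
The figures above are consistent with each other, and with the ext4
numbers quoted earlier (3755.41 Requests/sec), which puts the
preallocated run at roughly 50 times the original request rate. A small
sketch of the arithmetic follows; the function names are hypothetical
and only the constants come from this message.

```c
/* MiB/s implied by a request rate at a fixed I/O size in bytes */
double mib_per_sec(double reqs_per_sec, double io_size)
{
    return reqs_per_sec * io_size / (1024.0 * 1024.0);
}

/* Seconds needed to complete `requests` I/Os at a given request rate */
double elapsed_sec(double requests, double reqs_per_sec)
{
    return requests / reqs_per_sec;
}

/* Ratio between two request rates */
double speedup(double new_rate, double old_rate)
{
    return new_rate / old_rate;
}
```

With 4096-byte requests, mib_per_sec(190660.17, 4096) reproduces the
reported bandwidth to within rounding, and mib_per_sec(3755.41, 4096)
does the same for the 14.67 figure from the initial report.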

========================== io_submit() wrapper =============================
#define _GNU_SOURCE

#include <errno.h>
#include <libaio.h>

#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#include <unistd.h>
#include <sys/time.h>


#define FNAME           "4k.data"
#define IOSIZE          4096
#define    REQUESTS    524288

/*  gcc 4k.c -std=gnu99 -laio -o 4k */

int main(void) {
    io_context_t ctx;
    int ret;

    int flag = O_RDWR | O_DIRECT;
    int fd = open(FNAME, flag);
    struct timeval start, end;
    if (fd == -1) {
        int err = errno;    /* save it before printf() can clobber it */
        printf("open(%s, %d) - failed!\nExiting.\n"
               "If file doesn't exist please precreate it "
               "with dd if=/dev/zero of=%s bs=%d count=%d\n",
               FNAME, flag, FNAME, IOSIZE, REQUESTS);
        return err;
    }

    memset(&ctx, 0, sizeof(io_context_t));
    ret = io_setup(REQUESTS, &ctx);
    if (ret) {
        /* libaio returns a negative errno value on failure */
        printf("io_setup(%d, &ctx) failed: %s\n", REQUESTS, strerror(-ret));
        return -ret;
    }

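    void *mem = NULL;
    /* posix_memalign() returns an error number instead of setting errno */
    ret = posix_memalign(&mem, 4096, (size_t) IOSIZE * REQUESTS);
    if (ret) {
        printf("posix_memalign() failed: %s\n", strerror(ret));
        return ret;
    }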
    /* memset(mem, 9, IOSIZE); */
    int urnd = open("/dev/urandom", O_RDONLY);
    void *cur = mem;
    gettimeofday(&start, NULL);
    for (int i = 0;  i < REQUESTS; i++, cur += IOSIZE) {
        read(urnd, cur, IOSIZE);
    }
    gettimeofday(&end, NULL);
    close(urnd);
    double elapsed = (end.tv_sec - start.tv_sec) +
              ((end.tv_usec - start.tv_usec)/1000000.0);
    printf("rnd generation (sec.):\t%.2f\n", elapsed);

    struct iocb *aio = malloc(sizeof(struct iocb) * REQUESTS);
    memset(aio, 0, sizeof(struct iocb) * REQUESTS);
    struct iocb **lio = malloc(sizeof(void *) * REQUESTS);
    memset(lio, 0, sizeof(void *) * REQUESTS);
    struct io_event *event = malloc(sizeof(struct io_event) * REQUESTS);
    memset(event, 0, sizeof(struct io_event) * REQUESTS);

    cur = mem;
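    for (int i = 0; i < REQUESTS; i++, cur += IOSIZE) {
        /* widen before multiplying so the offset cannot overflow int
           if REQUESTS is ever raised */
        io_prep_pwrite(&aio[i], fd, cur, IOSIZE, (long long) i * IOSIZE);
        lio[i] = &aio[i];
    }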
    gettimeofday(&start, NULL);
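    ret = io_submit(ctx, REQUESTS, lio);
    printf("io_submit() accepted %d IOs\n", ret);
    if (ret != REQUESTS) {
        /* a partial submit would make io_getevents() below wait forever */
        printf("expected %d IOs to be accepted; exiting\n", REQUESTS);
        return 1;
    }
    fdatasync(fd);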

    ret = io_getevents(ctx, REQUESTS, REQUESTS, event, NULL);
    printf("io_getevents() returned %d events\n", ret);
    gettimeofday(&end, NULL);

    elapsed = (end.tv_sec - start.tv_sec) +
              ((end.tv_usec - start.tv_usec)/1000000.0);
    printf("time elapsed (sec.):\t%.2f\n", elapsed);
    printf("bandwidth (MiB/s):\t%.2f\n",
           (double) (((long long) IOSIZE * REQUESTS) / (1024 * 1024))
               / elapsed);
    printf("IOps:\t\t\t%.2f\n", (double) REQUESTS / elapsed);

    if (io_destroy(ctx)) {
        perror("io_destroy");
        return -1;
    }
    close(fd);
    free(mem);
    free(aio);
    free(lio);
    free(event);

    return 0;
}
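
One thing the wrapper above glosses over: read() may return fewer bytes
than requested, and the fill loop ignores its return value, so a short
read from /dev/urandom would leave part of a buffer unfilled. A minimal
sketch of a retrying read follows. read_full() is a hypothetical helper,
not part of the original program.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: read exactly `len` bytes from `fd` into `buf`,
 * retrying after short reads and EINTR.
 * Returns 0 on success, -1 on error or if EOF arrives too early.
 */
int read_full(int fd, void *buf, size_t len)
{
    char *p = buf;

    while (len > 0) {
        ssize_t n = read(fd, p, len);
        if (n == -1) {
            if (errno == EINTR)
                continue;       /* interrupted, try again */
            return -1;
        }
        if (n == 0)
            return -1;          /* EOF before the buffer was filled */
        p += n;
        len -= (size_t) n;
    }
    return 0;
}
```

In the fill loop, read(urnd, cur, IOSIZE) could then become
read_full(urnd, cur, IOSIZE) with a check of the result.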