lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Wed, 03 Jan 2007 13:17:19 -0600
From:	Steve Wise <swise@...ngridcomputing.com>
To:	"Michael S. Tsirkin" <mst@...lanox.co.il>
Cc:	netdev@...r.kernel.org, Roland Dreier <rdreier@...co.com>,
	linux-kernel@...r.kernel.org, openib-general@...nib.org
Subject: Re: [PATCH  v4 01/13] Linux RDMA Core Changes

> > > > ib_set_cq_udata() would transition into the kernel to pass in the
> > > > consumer's index.  In addition, ib_req_notify_cq would also transition
> > > > into the kernel since its not a bypass function for chelsio.
> > > 
> > > We misunderstand each other.
> > > 
> > > ib_uverbs_req_notify_cq is in drivers/infiniband/core/uverbs_cmd.c -
> > > all this code runs inside the IB_USER_VERBS_CMD_REQ_NOTIFY_CQ command,
> > > so there is a single user to kernel transition.
> > > 
> > 
> > Oh I see. 
> > 
> > This seems like a lot of extra code to avoid passing one extra arg to
> > the driver's req_notify_cq verb.  I'd appreciate other folk's input on
> > how important they think this is.  
> >
> > If you insist, then I'll run some tests specifically in kernel mode and
> > see how this affects mthca's req_notify performance.
> 
> This might be an interesting datapoint.
> 

Here's what I measured:

Without extra param (1000 iterations in cycles): 
	ave 101.283 min 91 max 247
With extra param (1000 iterations in cycles):
	ave 103.311 min 91 max 221

Convert cycles to ns (3466.727 MHz CPU):

Without: 101.283 / 3466.727 = .02922us == 29.22ns
With:    103.311 / 3466.727 = .02980us == 29.80ns

So I measure a .58ns average increase for passing in the additional
parameter.

Here is a snipit of the test:

        spin_lock_irq(&lock);
        do_gettimeofday(&start_tv);
        for (i=0; i<1000; i++) {
                cycles_start[i] = get_cycles();
                ib_req_notify_cq(cb->cq, IB_CQ_NEXT_COMP);
                cycles_stop[i] = get_cycles();
        }
        do_gettimeofday(&stop_tv);
        spin_unlock_irq(&lock);

        if (stop_tv.tv_usec < start_tv.tv_usec) {
                stop_tv.tv_usec += 1000000;
                stop_tv.tv_sec  -= 1;
        }

        for (i=0; i < 1000; i++) {
                cycles_t v = cycles_stop[i] - cycles_start[i];
                sum += v;
                if (v > max)
                        max = v;
                if (min == 0 || v < min)
                        min = v;
        }

        printk(KERN_ERR PFX "FOO delta sec %lu usec %lu sum %llu min %llu max %llu\n",
                stop_tv.tv_sec - start_tv.tv_sec,
                stop_tv.tv_usec - start_tv.tv_usec,
                (unsigned long long)sum, (unsigned long long)min,
                (unsigned long long)max);



-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ