Message-ID: <5240622A.5010305@akamai.com>
Date: Mon, 23 Sep 2013 11:45:46 -0400
From: Jason Baron <jbaron@...mai.com>
To: Eric Wong <normalperson@...t.net>
CC: Jason Baron <jbaron@...mai.com>, Nathan Zimmer <nzimmer@....com>,
linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org,
Al Viro <viro@...iv.linux.org.uk>
Subject: Re: [RFC] eventpoll: Move a kmem_cache_alloc and kmem_cache_free
On 09/22/2013 04:41 PM, Eric Wong wrote:
> Jason Baron <jbaron@...mai.com> wrote:
>> epoll: reduce usage of global 'epmutex' lock
>>
>> Epoll file descriptors that are 1 link from a wakeup source and
>> are not nested within other epoll descriptors, or pointing to
>> other epoll descriptors, don't need to check for loop creation or
>> the creation of wakeup storms. Because of this we can avoid taking
>> the global 'epmutex' in these cases. This state for the epoll file
>> descriptor is marked as 'EVENTPOLL_BASIC'. Once the epoll file
>> descriptor is attached to another epoll file descriptor it is
>> labeled as 'EVENTPOLL_COMPLEX', and full loop checking and wakeup
>> storm creation are checked using the global 'epmutex'. It does
>> not transition back. Hopefully, this is a common usecase...
>
> Cool. I was thinking about doing the same thing down the line (for
> EPOLL_CTL_ADD, too)
>
>> @@ -166,6 +167,14 @@ struct epitem {
>>
>> /* The structure that describe the interested events and the source fd */
>> struct epoll_event event;
>> +
>> + /* TODO: really necessary? */
>> + int on_list;
>
> There's some things we can overload to avoid increasing epitem size
> (.ep, .ffd.fd, ...), so on_list should be unnecessary.
Even with 'on_list', the size of 'epitem' stayed at 128 bytes. I'm not
sure whether there are compile options that could push it over that size,
which may be what you are concerned about... so I think that change is ok.
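To make the size argument concrete, here is a minimal userspace sketch of a compile-time size budget. Everything in it is an assumption for illustration: 'struct mock_epitem' is a hypothetical stand-in and does not reproduce the real 'struct epitem' layout, and 'MOCK_EPITEM_BUDGET' is just the 128-byte figure mentioned above.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed budget: the 128-byte figure from the discussion above. */
#define MOCK_EPITEM_BUDGET 128

/*
 * Hypothetical stand-in for an epitem-like structure.
 * The field list is invented; it is NOT the kernel's struct epitem.
 */
struct mock_epitem {
	void *links[12];         /* placeholder for tree/list linkage and pointers */
	int fd;                  /* placeholder for the watched descriptor */
	int nwait;               /* placeholder counter */
	unsigned long long data; /* placeholder for user data */
	unsigned int events;     /* placeholder for the event mask */
	int on_list;             /* the extra flag under discussion */
};

/* The build fails if a new field pushes the structure past the budget. */
_Static_assert(sizeof(struct mock_epitem) <= MOCK_EPITEM_BUDGET,
	       "mock_epitem grew past its size budget");

/* Runtime helper so the size can also be inspected or logged. */
size_t mock_epitem_size(void)
{
	return sizeof(struct mock_epitem);
}
```

A check like this only covers the configurations that actually get built, so it would not by itself answer the question about unusual compile options.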
The biggest hack here was using 'struct rb_node' instead of a proper
'struct rcu_head', so as not to increase the size of epitem. I think this
is safe, and I've added build-time checks to ensure that 'struct rb_node'
is never smaller than 'struct rcu_head'. But it's rather hacky. I will
probably break this change out separately when I re-post so it can be
reviewed independently...
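As a rough illustration of the idea (not the code from the patch), the storage reuse and the build-time check could be sketched in userspace C as below. The 'mock_' types only approximate the general shape of the kernel structures and are assumptions, as are all function names. The sketch also uses a union instead of reusing the 'rb_node' storage directly, which is a substitution made here to keep the example well-defined C.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed shape of a red-black tree node: three pointer-sized words. */
struct mock_rb_node {
	unsigned long parent_color;
	struct mock_rb_node *right;
	struct mock_rb_node *left;
};

/* Assumed shape of an RCU callback head: a link and a callback. */
struct mock_rcu_head {
	struct mock_rcu_head *next;
	void (*func)(struct mock_rcu_head *head);
};

/* Hypothetical item: tree linkage and deferred-free head share storage. */
struct mock_item {
	union {
		struct mock_rb_node rbn;  /* valid while the item is in the tree */
		struct mock_rcu_head rcu; /* valid only after removal from the tree */
	};
	int fd;
};

/* Build-time check: the reused slot must be large enough for the RCU head. */
_Static_assert(sizeof(struct mock_rb_node) >= sizeof(struct mock_rcu_head),
	       "rb_node storage too small to hold an rcu_head");

/* Records the fd of the last item whose callback ran (demo only). */
int mock_last_freed_fd = -1;

/* Demo callback: recover the enclosing item from the embedded head. */
void mock_item_free_cb(struct mock_rcu_head *head)
{
	struct mock_item *it =
		(struct mock_item *)((char *)head - offsetof(struct mock_item, rcu));
	mock_last_freed_fd = it->fd;
}

/* After tree removal, repurpose the shared storage as the callback head. */
void mock_item_retire(struct mock_item *it,
		      void (*cb)(struct mock_rcu_head *head))
{
	assert(it != NULL);
	it->rcu.next = NULL;
	it->rcu.func = cb;
}
```

The point of the assertion is that the overlay is only safe while the tree node is at least as large as the callback head, and only once the item is no longer reachable through the tree.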
Thanks,
-Jason