I did quite a lot of progress this week. I profiled the cpu usage of the entire plugin using google’s performance tools  and mostly found that the cache statistics were causing some performance regression. I had an idea about this as they lock a mutex to increment the request counters. Since the last week I had been working on a new timer abstraction in the cache plugin for doing specific tasks based after every interval. Its landed now, it uses timerfd functionality in linux, and registers it with the monkey scheduler to get ticks after every certain time (currently one sec). I refactored all the statistics related code in a new file, and this time only thread local counters are updated on every request, and after every timers tic, the global counters are updated. And as the timer only fired in a single thread whichever can read the timerfd first) no mutexes are needed. The performance is now back to where it used to be and there is be no performance regression when serving requests with monkey plugin or without., yet the plugin is a a lot more complex, and does a lot more.
I also added file eviction functionality with the timer infrastructure in place, so the idle files which are evictable (can be opened again, not the case for custom overlays added through the plugin external api) are evicted. This should automatically reduce the footprint of plugin in idle time. I am also gonna soon shift the request and pipe pools to the new technique so the limits could automatically adjust based on statistics during a specific period. And cool down to zero extra footprint.
I fixed quite a lot of bugs, mainly with request errors and the http pipelined requests. I also experimented this week with sending out the raw buffers directly using write (using mmaped buffers or just raw ones) and I could only get performance regressions. I was planning to use a key value store to maintain raw file buffers in memory, but it turns out to be slower to push buffers to socket then to just splicing them or using sendfile over a cached fd (fd which is probably cached in kernel memory)
for the next week, I will stabilize the entire codebase, add more testing and make everything configurable. I will try to reduce configuration as less as possible and try to make the system automatically adjust to the optimal levels (now that the timer infrastructure is added and it can learn as it goes). I will continue the performance profiling to decrease squeeze more performance out of it, and also find a fix bugs. I have had an awesome time embarking on this project which taught me so much, way ahead of anything I had done in the low level programming arena.
Github Project: https://github.com/ziahamza/monkey-cache