15 August 2014

My Top 10 Open Source People and Projects.

This list is ordered by how much I appreciate them.

I will not reveal whether the list is in ascending or descending order :)

My main evaluation criteria:

a. is it working
b. is it useful
c. is it performing
d. is it documented
e. is it supported
f. is it simple



1. Curator (Jordan Zimmerman -- Netflix)

"Guava is to Java what Curator is to ZooKeeper."
I believe this statement captures perfectly what Curator is.
It is simple because it ships with examples, and you can use it with confidence knowing that it is part of Netflix OSS, and Netflix accounts for 34.2% of downstream traffic in the US.
The project simplifies ZooKeeper usage by wrapping it and providing an abstraction over it, to the point that you hardly need to read the documentation.
Although Jordan is the only person answering questions, you get answers very quickly whenever you raise something on the mailing list.

2. Netty (Norman Maurer)

Netty simplifies network programming and is very strong on performance. Many open source projects (such as Storm and HornetQ) already use it. It has extensive, detailed documentation, and Norman keeps working on tuning it.


3. ZooKeeper (Yahoo Engineering -- Apache 2 - Benjamin Reed & Flavio Junqueira)

This tool is a great problem solver and painkiller in a distributed environment. Every problem you run into has very likely already been met and solved. It gives you coordination ability. There is a long committer list behind it, but I like these two guys because of the book they published.


4. Esper (Thomas Bernhardt - GPLv2)

I believe this is the number one CEP tool in existence. It is simple to use and very well documented.
If it did not use reflection, I would definitely put it at the top of the list.
Performance-wise, however, it has a good bit of room to improve. It may not be the perfect match for real-time data analytics if you target high event rates (on the order of a million events per second). In that case I would suggest writing your own engine tailored exactly to your needs: implement your sliding window, pattern matching, filtering and enrichment algorithms the way you like, fitted to your use case.
On the other hand, if performance is not your number one requirement, or you are OK with using more hardware, then use Esper with its support license.
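
To give an idea of what "write your own" can look like, here is a minimal sketch of a time-based sliding window counter in plain Java. It is not how Esper does it; the class name SlidingWindowCounter and its methods are made up for this illustration, and it assumes events arrive with non-decreasing timestamps on a single thread.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical helper: counts events seen in the last windowMillis milliseconds. */
final class SlidingWindowCounter {
    private final long windowMillis;
    // Timestamps of events still inside the window, oldest first.
    private final Deque<Long> timestamps = new ArrayDeque<>();

    SlidingWindowCounter(long windowMillis) {
        if (windowMillis <= 0) {
            throw new IllegalArgumentException("windowMillis must be positive");
        }
        this.windowMillis = windowMillis;
    }

    /** Records one event. Assumes timestamps arrive in non-decreasing order. */
    void record(long timestampMillis) {
        timestamps.addLast(timestampMillis);
        evict(timestampMillis);
    }

    /** Returns how many recorded events fall inside (now - windowMillis, now]. */
    int count(long nowMillis) {
        evict(nowMillis);
        return timestamps.size();
    }

    /** Drops every event that is at or before the window's lower bound. */
    private void evict(long nowMillis) {
        long cutoff = nowMillis - windowMillis;
        while (!timestamps.isEmpty() && timestamps.peekFirst() <= cutoff) {
            timestamps.removeFirst();
        }
    }
}
```

With a 1000 ms window, recording events at 0, 500 and 900 and then asking for count(900) gives 3; after one more event at 1200 the event at 0 has slid out, so count(1200) is still 3. For really high event rates you would probably replace the boxed Deque<Long> with a primitive ring buffer, but the idea stays the same.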


5. Hazelcast (Fuad Malikov)
I like to use Hazelcast's publish-subscribe model. It is a dead simple, pain-free tool developed by very skilled people.

6. Yammer - http://metrics.codahale.com/ - (Ryan Tenney)
A must-have tool for your instrumentation requirements.

7. logback (Ceki Gulcu)
I cannot imagine logging without slf4j, log4j and log4j's successor, logback,
which means I cannot imagine logging without Ceki Gulcu's contributions.

8. Storm (Nathan Marz - Backtype, Twitter)
If you are just starting to develop a distributed real-time application, you have a long way to go.
But getting Storm and using it is dead simple. Even if you only use it for prototyping or testing, it is a very good educational experience. And even if you do not use Storm, just follow Nathan's blog and Twitter account :), he has a very strong mindset that can point you in a better direction.
My favorite posts of his are:
1. http://nathanmarz.com/blog/suffering-oriented-programming.html
2. http://nathanmarz.com/blog/the-mathematics-behind-hadoop-based-systems.html
3. http://nathanmarz.com/blog/interview-with-programmer-magazine.html

9. Exhibitor (Jordan Zimmerman - Netflix)

I prefer to use this for monitoring ZooKeeper nodes and making sure they are running,
but you can do more with it.

The reason I like this project: when you are using ZooKeeper, the main challenge is explaining what it is to non-technical or semi-technical people. But when you open Exhibitor, show them ZooKeeper's tree-structured database, and tell them that every JVM in your cluster sees this same picture and that it is stored only in one central, replicated place, it starts to make sense. Being able to do this with a single command line is just priceless.

10. Guava (Joshua Bloch - Google Engineering)

Especially in software, I do not like components or libraries that are responsible for more than one thing. But Guava is like one big Util class that provides all the helper functionality you need: null handling, caching, the collections API, string operations and so on.
Also, knowing that Joshua Bloch, author of Effective Java, influences its feature set and implementation makes you extra confident about adding it to your dependency list.


btw I should note that all of these are Apache 2 licensed, except Esper, which is GPLv2, and logback, which is dual-licensed under EPL 1.0 and LGPL 2.1.
