Friday, July 20, 2012

Gson vs Jackson - which library is faster?

I've been reading that Jackson is the fastest JSON parser, and I wanted to know exactly what that meant.

I've used Gson in the past and have never had a problem with its speed, but I've never been in a situation where every bit of speed counts.

I devised a simple test using JUnit.

Test 1:
Serialize the same object 100 times with each of Gson and Jackson, and time how long it takes.
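
The test code itself isn't shown here, so the following is only a minimal sketch of what such a timing loop might look like. The class name SerializeTimer and the method timeMillis are made up for illustration, and the workload in main is a hand-built string standing in for the real Gson or Jackson call.

```java
import java.util.concurrent.TimeUnit;

class SerializeTimer {
    /** Runs the task the given number of times and returns the total elapsed wall-clock milliseconds. */
    public static long timeMillis(int iterations, Runnable task) {
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            task.run();
        }
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    public static void main(String[] args) {
        // Hypothetical stand-in workload: in the real test this would be
        // a Gson or Jackson serialization call on the object under test.
        StringBuilder sink = new StringBuilder();
        long elapsed = timeMillis(100, () -> {
            sink.setLength(0);
            sink.append("{\"id\":").append(42).append('}');
        });
        System.out.println("100 iterations took " + elapsed + " ms");
    }
}
```

With only 100 iterations, any one-time setup done inside the task (creating the serializer, class loading, JIT warm-up) is part of the measured time, which is worth keeping in mind when reading the numbers.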

Results:
Gson is between 150 and 200 ms for my object every time.
Jackson is between 670 and 720 ms for my object every time.

Clearly Gson is the winner here. However, I wanted to make sure I didn't sell Jackson short; maybe it excels in another area.

Test 2:
Serialize with both Gson and Jackson in the same unit test, with Gson first, in case object creation matters.

Results:
Gson is between 150 and 200 ms every time.
Jackson is between 620 and 680 ms every time but one, which took 901 ms.


Test 3:
Serialize with both Gson and Jackson in the same unit test, with Jackson first, in case object creation matters.

Results:
Gson is between 120 and 170 ms every time.
Jackson is between 589 and 700 ms every time.



Finally, I decided that if I'm using it in a web service, I'll need to create POJOs from the JSON for processing whenever there is a POST.

Test 4:
Build POJO from JSON

Results:
Gson is between 103 and 141 ms every time.
Jackson is between 469 and 546 ms every time.

An interesting thing I found is that the object I was dealing with had a Calendar in it. This serializes and deserializes just fine in Jackson. Gson produced nice-looking output for it, but did not populate the POJO correctly when reading that output back. When I changed the field to a Date object, both Jackson and Gson worked correctly.

My conclusion is that, for my object in these tests, Gson is clearly faster than Jackson. Even comparing Gson's worst case with Jackson's best case, Gson is still more than twice as fast. I had used Gson just because I found it easy to use, but now that I have run some benchmarks I know I was lucky enough to find the faster library.

Thursday, June 28, 2012

Migrating to Graylog2 0.9.6 issues

We had a problem recently. We had been logging to Graylog2 using the gelf4j appender in a lot of applications. Our MongoDB was configured with a capped collection because we can quickly run our server out of space.

After we upgraded to Graylog2 0.9.6 we noticed two things:
1. Our server quickly ran out of space.
2. The dates were showing up in the year 44461.

Version 0.9.6 is much more responsive because the data is stored directly in ElasticSearch. However, the capped collections no longer solve the problem of data size. It took some searching to find that there is a 'settings' tab with a 'message retention time' tab, where the value can be changed from 60 days' worth of data to a smaller number.

This doesn't fully solve our data size problem. If a flood of errors occurs, it can still push the size beyond what the server can handle. Plus, this deletion setting uses the created_at date, which is now being passed the wrong value. All data is listed as being in the future, so it is not getting cleaned up by the retention setting.

It took a bit of tinkering, but I was able to finally come up with an ElasticSearch query that would allow me to delete all data dated in the future. This is not the best case scenario, but I can live with manually deleting data until the applications are updated.


To figure out what value to use for the date I used the unix date command to determine the value for 1/1/2013.
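
For readers without the date command handy, the same cutoff can be computed in a few lines of Java. This is only a sketch: the class and method names are made up, it assumes midnight UTC rather than local time, and it prints the value in both seconds and milliseconds because I can't say for certain which unit a given created_at field expects. (As an aside, the year 44461 is about what you get when a mid-2012 millisecond timestamp is read as seconds, though that is my inference and not something confirmed here.)

```java
import java.time.LocalDate;
import java.time.ZoneOffset;

class CutoffTimestamp {
    /** Epoch seconds for midnight UTC at the start of the given date. */
    public static long epochSeconds(int year, int month, int day) {
        return LocalDate.of(year, month, day)
                .atStartOfDay(ZoneOffset.UTC)
                .toEpochSecond();
    }

    public static void main(String[] args) {
        long seconds = epochSeconds(2013, 1, 1);
        System.out.println(seconds);          // 1356998400
        System.out.println(seconds * 1000L);  // 1356998400000
    }
}
```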



Fixing the problem with the dates was much easier.  A new version of gelf4j was available with the fix already implemented.   Simply replacing the previous version with the new version solved the date problem.




Dates in Graylog2 are an issue I saw mentioned several times.


Users complained that their servers are all over the world and the dates are showing local times, so they want the ability to sort dates differently.


To me this could be solved by always passing through the UTC time instead of a local time.  Then all data would be in order by the time it was sent according to the sending server.


Some users want the graylog server to show dates in the order it receives them instead of by the date order that is stamped on them.


I think that this could be a useful feature, but I don't want it to be the default. I want the time that the message occurred to be the sorting field. Some servers could batch up messages and hold them for minutes before sending them. I want to be able to find problems based on the time that they occur, not whenever Graylog happened to finally get its copy of the data.


I believe that if Graylog added its own timestamp for when a message was received, then deletions could be done against that internal date. In that case Graylog could still hold messages for 2 days from when it received them, regardless of the date and time that the sending server happened to report.

Finally, I believe that Graylog2 0.9.6 is a good improvement over the previous version, and the web interface is much more responsive. However, the inability to cap the index size in ElasticSearch is a big issue that needs to be addressed as soon as possible.

Monday, April 23, 2012

Rotate content with jQuery.

After having to rotate content a few times, I decided to simplify some JavaScript. There are several plugins out there, but they all seem much bigger than is needed to simply rotate some content.

To start with, I have a simple layout for the content: a wrapper element with child elements. The example uses divs with a class; however, an unordered list works just as well with a small tweak to the selectors.

The JavaScript is simple. It is not in a plugin, so only one rotator per page is supported with this code, but it should be simple to convert to a plugin.

Finally, the result is a simple item that fades in and out. The rotation doesn't take control of any of the layout of your page; it simply shows and hides the HTML.

I have a few ideas to clean up the code. Instead of keeping a counter, it could look for the next sibling and show it, and then show the first one when the last one is hidden. I think the selectors for that would be more complicated, though, and I like the simplicity that this implementation offers.