Friday, July 20, 2012

Gson vs Jackson - which library is faster?

I've been reading about how Jackson is the fastest JSON parser, and I wanted to know what exactly that meant.

I've used gson in the past and have never had a problem with its speed, but I've never been in a situation where every bit of speed counts.

I devised a simple test using JUnit.

Test 1:
Serialize the same object 100 times with gson and 100 times with jackson, and time how long each takes.

Results:
gson is between 150 and 200 ms for my object every time.
jackson is between 670 and 720 ms for my object every time.

Clearly gson is the winner here.  However, I wanted to make sure I didn't sell jackson short.  Maybe it excels in another area.
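A timing loop like the one in Test 1 could be sketched roughly as follows. This is a minimal sketch, not the original test code: the class name SerializationTimer and the method timeMillis are hypothetical, and the serializer is passed in as a plain Supplier so the example runs without either library on the classpath.

```java
import java.util.function.Supplier;

class SerializationTimer {

    // Runs the task the given number of times and returns the elapsed
    // wall-clock time in milliseconds.
    static long timeMillis(int iterations, Supplier<String> task) {
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            task.get();
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        // Stand-in for a real serializer call; in the actual test this
        // would be the gson or jackson call on the object under test.
        Supplier<String> standIn = () -> "{\"id\":" + 42 + "}";
        System.out.println("100 runs took " + timeMillis(100, standIn) + " ms");
    }
}
```

Anything created inside the Supplier is part of the timed window, so whether the serializer object is built once outside the loop or on every call changes what is being measured.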

Test 2:
Serialize with both gson and jackson in the same unit test, with gson first, in case object creation matters.

Results:
gson is between 150 and 200 ms every time.
jackson is between 620 and 680 ms every time except one, which was 901 ms.


Test 3:

Serialize with both gson and jackson in the same unit test, with jackson first, in case object creation matters.

Results:
gson is between 120 and 170 ms every time.
jackson is between 589 and 700 ms every time.



Finally, I decided that if I'm using it in a web service, I'll need to create POJOs from the JSON for processing whenever there is a POST.

Test 4:
Build POJO from JSON

Result:
gson is between 103 and 141 ms every time.
jackson is between 469 and 546 ms every time.

An interesting thing I found is that the object I was dealing with had a Calendar in it.  This serializes and deserializes just fine in jackson, but in gson it would create a nice output, but did not populate the POJO correctly.  When I changed it to a Date object both Jackson and gson worked correctly.

My final review is that gson is clearly faster than Jackson.  Even comparing gson's worst case against Jackson's best case, gson is still at least twice as fast.  I had used gson just because I found it easy to use, but now that I have run some benchmarks I know I happened to get lucky enough to find the faster library.

Thursday, June 28, 2012

Migrating to Graylog2 0.9.6 issues

We had a problem recently. We had been logging to Graylog2 using the gelf4j appender in a lot of applications. Our MongoDB was configured with a capped collection because we can quickly run our server out of space.

After we upgraded to Graylog2 0.9.6 we noticed a few things.
1. Our server quickly ran out of space
2. The dates were showing up in the year 44461.

Version 0.9.6 is much more responsive because the data is stored directly in ElasticSearch. However, the capped collections no longer solve the problem of data size. It took some searching to find that the 'settings' tab has a 'message retention time' option that can be changed from 60 days' worth of data to a smaller number.

This doesn't fully solve our data size problem. If a flood of errors occurs, it can still push the size beyond what the server can handle. Plus, the deletion uses the created_at date, which is now carrying the wrong value. All data is listed as being in the future, so it does not get cleaned up by this setting.

It took a bit of tinkering, but I was able to finally come up with an ElasticSearch query that would allow me to delete all data dated in the future. This is not the best case scenario, but I can live with manually deleting data until the applications are updated.


To figure out what value to use for the date I used the unix date command to determine the value for 1/1/2013.
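The same cutoff value can also be computed in Java. This is a minimal sketch that assumes the cutoff is midnight UTC and that the stored value is in epoch seconds; the class and method names are hypothetical, and a local-time cutoff would differ by the zone offset.

```java
import java.time.LocalDate;
import java.time.ZoneOffset;

class CutoffDate {

    // Epoch seconds for midnight UTC at the start of the given date.
    static long epochSecondsUtc(int year, int month, int day) {
        return LocalDate.of(year, month, day)
                .atStartOfDay(ZoneOffset.UTC)
                .toEpochSecond();
    }

    public static void main(String[] args) {
        // 2013-01-01T00:00:00Z as epoch seconds
        System.out.println(epochSecondsUtc(2013, 1, 1)); // prints 1356998400
    }
}
```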



Fixing the problem with the dates was much easier.  A new version of gelf4j was available with the fix already implemented.   Simply replacing the previous version with the new version solved the date problem.
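A year like 44461 is what you get when a timestamp in milliseconds is read as seconds. I am assuming that was the cause here; I did not confirm it against the gelf4j source. A small sketch with an illustrative timestamp from late June 2012 (the class and method names are hypothetical):

```java
import java.time.Instant;
import java.time.ZoneOffset;

class TimestampUnits {

    // Interprets the value as epoch seconds and returns the UTC year.
    static int yearIfReadAsSeconds(long value) {
        return Instant.ofEpochSecond(value).atZone(ZoneOffset.UTC).getYear();
    }

    public static void main(String[] args) {
        long seconds = 1_340_900_000L;   // illustrative: 28 June 2012
        long millis = seconds * 1000L;   // same moment, in milliseconds

        System.out.println(yearIfReadAsSeconds(seconds)); // prints 2012
        System.out.println(yearIfReadAsSeconds(millis));  // prints 44461
    }
}
```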




Dates in Graylog2 are an issue I saw mentioned several times.


Users complained that their servers are all over the world and the dates are showing local times.   So they want the ability to sort dates differently.


To me this could be solved by always passing through the UTC time instead of a local time.  Then all data would be in order by the time it was sent according to the sending server.
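As a rough illustration of the idea, here is the same instant rendered in UTC and in a local zone. This is a sketch only: the class and method names are hypothetical, and it says nothing about how gelf4j or Graylog2 actually format their timestamps.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.format.DateTimeFormatter;

class UtcTimestamps {

    // Renders the instant in UTC, independent of the sending server's zone.
    static String asUtc(long epochSeconds) {
        return DateTimeFormatter.ISO_INSTANT.format(Instant.ofEpochSecond(epochSeconds));
    }

    // Renders the same instant as local time in the given zone.
    static String asLocal(long epochSeconds, String zoneId) {
        return DateTimeFormatter.ISO_OFFSET_DATE_TIME.format(
                Instant.ofEpochSecond(epochSeconds).atZone(ZoneId.of(zoneId)));
    }

    public static void main(String[] args) {
        long sent = 1_340_900_000L; // illustrative timestamp
        System.out.println(asUtc(sent));                 // 2012-06-28T16:13:20Z
        System.out.println(asLocal(sent, "Asia/Tokyo")); // 2012-06-29T01:13:20+09:00
    }
}
```

Two servers in different zones that log the same moment produce the same UTC string, so sorting on it keeps messages in the order they were sent.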


Some users want the graylog server to show dates in the order it receives them instead of by the date order that is stamped on them.


I think that this could be a useful feature, but I don't want it to be the default.  I want the time that the message occurred to be the sorting field.  Some servers could batch up messages and hold them for minutes before sending them.  I want to be able to find problems based on the time that they occur, not whenever Graylog happened to finally get its copy of the data.


I believe that if Graylog added its own timestamp for when a message was received, then deletions could be done against that internal date.  In that case Graylog could still hold messages for 2 days from when it received them, regardless of the date and time that the sending server happened to claim.
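The idea could look something like the sketch below. This is not how Graylog2 works; LogMessage, receivedAt and isExpired are all hypothetical names used only to show retention keyed on the server's own receive time.

```java
import java.time.Duration;
import java.time.Instant;

class RetentionCheck {

    // Hypothetical message shape: sentAt is whatever the sender claims,
    // receivedAt is stamped by the log server on arrival.
    static final class LogMessage {
        final Instant sentAt;
        final Instant receivedAt;

        LogMessage(Instant sentAt, Instant receivedAt) {
            this.sentAt = sentAt;
            this.receivedAt = receivedAt;
        }
    }

    // Expiry looks only at receivedAt, so a bogus sentAt cannot
    // keep a message alive forever.
    static boolean isExpired(LogMessage message, Instant now, Duration retention) {
        return message.receivedAt.plus(retention).isBefore(now);
    }
}
```

With a 2 day retention, a message received 3 days ago is expired even if its sender stamped it with a date far in the future.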

Finally, I believe that Graylog2 0.9.6 is a good improvement over the previous version.  The web interface is much more responsive, but the inability to cap the index size in ElasticSearch is a big issue that needs to be addressed as soon as possible.

Monday, April 23, 2012

Rotate content with jQuery.

After having to rotate content a few times, I decided to simplify some javascript. There are several plugins out there, but they all seem to be much bigger than I think is needed to simply rotate some content.

To start with I have a simple layout for the content: a wrapper element with child elements. The example uses divs with a class, however, an unordered list works just as well with a small tweak to the selectors.

The javascript is simple. This is not in a plugin, so only 1 rotator per page is supported with this code, but it should be simple to convert it to a plugin.

Finally, the result is a simple item that fades in and out. The rotation doesn't take control of any of the layout of your page, it simply shows and hides the html.

I have a few ideas to clean up the code. It could look for the next sibling and show it instead of keeping a counter, and then show the first one when the last one is hidden. I think the selectors for that would be more complicated though, and I like the simplicity that this implementation offers.

Friday, September 30, 2011

Root Cause

I've picked up a fun nickname at work.  Root Cause.   I take it all in fun because I have been the root cause of a few problems, but I cause far fewer problems than I fix.

Today at work someone sent me a quote.


For every effect there is a root cause. Find and address the root cause rather than try to fix the effect, as there is no end to the latter.
-- Author Unknown

It's almost like there is a quote about me now.  That's lots of fun.

Tuesday, March 22, 2011

Accessing Websphere Variables at runtime.

I'm using Redis and I had a security need to keep the production configuration values hidden.  However, I wanted to allow other developers to run against test with minimal configuration.

I decided that I wanted to be able to read, through code, the WebSphere variables that are configured on the WAS server.

I started with some code I found on IBM's site to get the data, but added a simple null check to make it return null if the code is not running on a server.


I had to add a reference to com.ibm.jaxws.thinclient_7.0.0.jar (this was the reference recommended by RAD).  This .jar is needed for the AdminService class.

A few quirks
1. If the server is not running (for example, when unit tests run) this method returns null, so you have to handle a default value.
2. If the property does not exist on the server it returns ${propertyname}.
3. Any changes to the values on the server require a restart before they can be used by the application.

I handled this with a properties file that will be used when running unit tests or on my local server, and the values from Websphere variables will only be used in production.  I structured it in a way that the production values take precedence over the test values.

ResourceBundle settings = ResourceBundle.getBundle("twitter");

// if the values are not set then this will return ${variablename}
String redisServer = expandVariable("REDIS_SERVER");
if (redisServer == null || redisServer.startsWith("${")) {
    redisServer = settings.getString("redis.server");
}

String redisPassword = expandVariable("REDIS_PASSWORD");
if (redisPassword == null || redisPassword.startsWith("${")) {
    redisPassword = settings.getString("redis.password");
}

String redisPort = expandVariable("REDIS_PORT");
if (redisPort == null || redisPort.startsWith("${")) {
    redisPort = settings.getString("redis.port");
}
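The three blocks above repeat the same check, so it could be pulled into one helper. This is only a sketch: ConfigLookup and resolve are hypothetical names, and the values passed in stand in for the results of expandVariable and the properties file.

```java
class ConfigLookup {

    // Returns the server value unless it is missing (null) or was left
    // unexpanded (still starts with "${"), in which case the fallback wins.
    static String resolve(String serverValue, String fallback) {
        if (serverValue == null || serverValue.startsWith("${")) {
            return fallback;
        }
        return serverValue;
    }

    public static void main(String[] args) {
        // Illustrative values only
        System.out.println(resolve(null, "localhost"));              // localhost
        System.out.println(resolve("${REDIS_SERVER}", "localhost")); // localhost
        System.out.println(resolve("redis.example", "localhost"));   // redis.example
    }
}
```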


I still recommend clearing out the test values when deploying to production.  This ensures that if the production property has not been set, or the server has not been restarted, the application is not accidentally pointing at test.  I'd rather have an error than a production application pointing at test.

Monday, November 29, 2010

if statement failure.

Looking at an if statement I noticed a problem.   This works, but is technically incorrect.

if( a == null || b==null) {
// do something
} else if( a!=null || b != null) {
// do something else
}

The intention here is really:
if( isSomeState(a,b) ) {
// do something
} else if( !isSomeState(a,b)) {
// do something else
}

which means ultimately it's looking for
(a == null || b==null)
and
!(a == null || b==null)
However !(a == null || b==null) translates into
a != null && b!=null

It happens to work by accident: if either one is null the code goes into the if block and never evaluates the else if portion.  However, if the statement were later made more complex, the mismatch could cause all kinds of hard-to-spot errors.

What if the code changed to be

if( a == null || b==null || c==null) {
// do something
} else if( a!=null || b != null) {
// do something else
}

only now the intention is 
if( isSomeState(a,b)  || c==null ) {
// do something
} else if( !isSomeState(a,b)) {
// do something else
}

Once the first condition changes like this, what was previously handled by the if statement may no longer be handled there, and the else if can end up handling situations that it never evaluated before, so the something else can happen more often than it's supposed to.

When writing a complex if statement where the else if is supposed to be the opposite state of the if, refactor that logic into a method and use method and !method to ensure that the correct logic is used.
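Here is a minimal sketch of the difference. The first condition is narrowed with "&& c != null" purely as an assumed example of a later change; the class name and the branch methods are hypothetical.

```java
class IfElseDemo {

    static boolean isSomeState(Object a, Object b) {
        return a == null || b == null;
    }

    // The hand-written "opposite" from above: a != null || b != null.
    // This is not the negation of isSomeState.
    static String branchHandNegated(Object a, Object b, Object c) {
        if (isSomeState(a, b) && c != null) {
            return "something";
        } else if (a != null || b != null) {
            return "something else";
        }
        return "neither";
    }

    // Same structure, but the else if uses !isSomeState, which is
    // a != null && b != null by De Morgan's law.
    static String branchWithMethod(Object a, Object b, Object c) {
        if (isSomeState(a, b) && c != null) {
            return "something";
        } else if (!isSomeState(a, b)) {
            return "something else";
        }
        return "neither";
    }

    public static void main(String[] args) {
        // a is null, b is set, c is null
        System.out.println(branchHandNegated(null, "x", null)); // something else
        System.out.println(branchWithMethod(null, "x", null));  // neither
    }
}
```

With a null and b set, the hand-negated version runs the something else branch even though a is null, while the method-based version does not.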

Monday, October 04, 2010

How to reliably set the width of an input box.

Input boxes are a pain to get the right length.  There are a few ways you can do it, but only one that is cross browser compatible.

The best way to limit the width is to use an inline style or css.

The css method would be to have a style
.width80 {
 width: 80px;
}


There is also an attribute that is supported across browsers, the size attribute, but it is less reliable.

A size of 30 lets the browser know that the text box should be big enough to hold 30 characters.   The different browsers interpret this differently.  They each do their own internal math, and the fonts may be different across browsers, so using the size attribute to define the width of a text box may mean that on one screen the text box is too long and messes with the appearance of the screen that you want to design.

Many times I'm trying to align columns with more of a fixed width, and using size to set the width of a text box causes me problems across browsers.  Keep this in mind as you style your own input boxes.