Fixing Binary File Corruption from Ant Copies

So there I was, poking around in some Java / J2EE code, trying to learn how it all works. I did some testing on a Linux server and realized something was broken. Something seemed to be corrupting ALL the JAR files in WEB-INF/lib/.

A co-worker guessed that the token filtering Ant was doing during the copy might be the culprit. He was right. Ant does not detect whether a file is binary: when filtering is on, it reads every file through a Reader class, which runs the bytes through a character decoder. This is specifically a problem on Unix systems, since they commonly default to the UTF-8 character set. Byte sequences that are not valid UTF-8 do not survive being decoded and written back out, and Ant hasn't a clue whether it's looking at UTF-8 text or binary data.
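To see why the decoder matters, here is a minimal sketch of the mechanism. It is not Ant's actual code: the class name RoundTripDemo and the method roundTrip are made up for illustration, and the sample bytes are just the start of a ZIP/JAR header followed by a few bytes that are not valid UTF-8. It decodes the bytes to characters and encodes them back, roughly what a Reader/Writer copy does.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class RoundTripDemo {

    // Hypothetical helper: decode bytes to text and encode them back,
    // the way a Reader/Writer pair treats a file it believes is text.
    public static byte[] roundTrip(byte[] data, Charset charset) {
        String decoded = new String(data, charset); // malformed input becomes U+FFFD
        return decoded.getBytes(charset);
    }

    public static void main(String[] args) {
        // "PK\003\004" (ZIP/JAR magic) followed by bytes that are invalid in UTF-8.
        byte[] binary = {0x50, 0x4B, 0x03, 0x04, (byte) 0xFF, (byte) 0xFE, (byte) 0x80};

        byte[] viaUtf8 = roundTrip(binary, StandardCharsets.UTF_8);
        byte[] viaLatin1 = roundTrip(binary, StandardCharsets.ISO_8859_1);

        // UTF-8 replaces each invalid byte, so the data changes.
        System.out.println("UTF-8 round trip intact: " + Arrays.equals(binary, viaUtf8));
        // ISO-8859-1 maps every byte to exactly one character, so nothing is lost.
        System.out.println("ISO-8859-1 round trip intact: " + Arrays.equals(binary, viaLatin1));
    }
}
```

The first line prints false and the second prints true. A single-byte character set such as ISO-8859-1 assigns a character to every possible byte value, so arbitrary binary data passes through unchanged, while UTF-8 rejects many byte sequences and substitutes a replacement character for them.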

So, I used a trick suggested in the Ant docs:

Another trick is to change the LANG environment variable from something like “us.utf8” to “us”.

On the Linux box, the default locale was en_US.UTF-8, and it needed to be en_US. We already had a bash shell script that runs Ant, so adding the line export LANG=en_US to that script, before the Ant invocation, solved the problem.

Reference:

http://ant.apache.org/manual/CoreTasks/copy.html#encoding