Remove byte order mark (BOM)

I have been doing a lot of localization work. Part of this includes a bunch of SQL scripts encoded using UTF-8. Every now and then, I would get a syntax error on the very first line of the script. The culprit – the byte order mark. Some programs insert these three bytes (EF BB BF in UTF-8) at the very start of the file; others don’t.

Under Linux, it is easy enough to find out if there is a byte order mark in a file. I used the following command on the SQL files in question:

> file filename.sql
filename.sql: UTF-8 Unicode (with BOM) text, with CRLF line terminators

A file without the byte order mark gave me the following:

> file filename.sql
filename.sql: UTF-8 Unicode text, with CRLF line terminators
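
On a machine without the file command, the same check can be sketched in a few lines of Python. This is only an illustration, not what I ran at the time; the function name has_utf8_bom is made up for this example, and it assumes the file is UTF-8 and only looks at the first three bytes.

```python
import codecs


def has_utf8_bom(path):
    """Return True if the file at `path` starts with the UTF-8 BOM (EF BB BF)."""
    # Open in binary mode so the raw bytes are compared and nothing is decoded.
    with open(path, "rb") as f:
        return f.read(3) == codecs.BOM_UTF8
```

Calling has_utf8_bom("filename.sql") should return True for the first file above and False for the second.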

I wanted a simple way to remove the byte order mark. I found a newsgroup post where Benjamin A’Lee provided a little Perl script to do the job. I named the script


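# Strip the UTF-8 BOM bytes (EF BB BF) from the start of the first element of @file, presumably the file's first line
$file[0] =~ s/^\xEF\xBB\xBF//;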

This little script works like a champ.
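
For anyone who would rather not use Perl, here is a rough Python equivalent of the same idea. It is a sketch of my own, not Benjamin A’Lee's script: the function name strip_utf8_bom is hypothetical, and it assumes the file is small enough to read into memory in one go. It works on raw bytes, so the CRLF line terminators are left untouched.

```python
import codecs


def strip_utf8_bom(path):
    """Remove a leading UTF-8 BOM from `path` in place.

    Returns True if a BOM was found and removed, False otherwise.
    """
    # Read the whole file as bytes (assumption: it fits in memory).
    with open(path, "rb") as f:
        data = f.read()
    if not data.startswith(codecs.BOM_UTF8):
        return False  # nothing to do, leave the file untouched
    # Rewrite the file without the first three bytes (EF BB BF).
    with open(path, "wb") as f:
        f.write(data[len(codecs.BOM_UTF8):])
    return True
```

Running strip_utf8_bom("filename.sql") on a file that has no byte order mark leaves it unchanged, so it should be safe to run over a whole directory of scripts.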

