I'm with MJ Welland and R. Alan Leo.
Why not use base 4 to encode?
You can (much like an amino acid table) have a sequence for 'start' at each end. It doesn't matter how long it is really... it's just a 'tag'.
CTAGCTAG would be enough, wouldn't it?
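
To make the idea concrete, here's a rough Python sketch of what I mean. The digit-to-letter mapping (0=A, 1=C, 2=G, 3=T) and the function names are just my own assumptions for illustration, not any standard; the only thing taken from above is the CTAGCTAG tag at each end. Each byte becomes four base-4 digits, i.e. four letters.

```python
# Assumed mapping of base-4 digits to nucleotides (0=A, 1=C, 2=G, 3=T).
# This ordering is arbitrary and chosen only for illustration.
BASES = "ACGT"
TAG = "CTAGCTAG"  # the marker placed at both ends of the strand


def encode(data: bytes) -> str:
    """Turn bytes into a tagged strand, 4 letters per byte."""
    body = "".join(
        BASES[(byte >> shift) & 0b11]   # take 2 bits at a time, high bits first
        for byte in data
        for shift in (6, 4, 2, 0)
    )
    return TAG + body + TAG


def decode(strand: str) -> bytes:
    """Strip the tag from each end and turn the letters back into bytes."""
    if (len(strand) < 2 * len(TAG)
            or not strand.startswith(TAG)
            or not strand.endswith(TAG)):
        raise ValueError("strand is missing the tag at one or both ends")
    body = strand[len(TAG):-len(TAG)]
    if len(body) % 4 != 0:
        raise ValueError("payload length is not a multiple of 4 letters")
    out = bytearray()
    for i in range(0, len(body), 4):
        value = 0
        for letter in body[i:i + 4]:
            value = (value << 2) | BASES.index(letter)  # ValueError on bad letter
        out.append(value)
    return bytes(out)


message = b"hi"
strand = encode(message)
print(strand)
print(decode(strand) == message)
```

One catch with this sketch: the tag letters can also turn up inside the payload by chance, so the decoder here only trusts the first and last 8 letters rather than searching for the tag.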