String split into n-character long tokens
In this post, I show how to split a string into tokens that are N characters long. This is handy for number formatting, for example:
- Separate a number in decimal format with a thousands delimiter, from 1233232 to 1,233,232
- Separate a number in binary format every fourth digit (a nibble, i.e. half a byte), from 1011010010101001 to 1011 0100 1010 1001
- Separate a number in hexadecimal format every second character (a single byte), from 0xfab22883b0ada0 to 0xfa b2 28 83 b0 ad a0
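For that purpose we can use the regular expression .{1,X}(?=(.{X})+(?!.))|.{1,X}$ together with the string.match() function, where X is the number of characters in a single token. Tokens are aligned to the end of the string, so only the first token can be shorter than X characters.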
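To see why the pattern works, it helps to read it piece by piece. The sketch below applies it with X = 3 to the decimal number from the list above; the variable names are only illustrative, and a single-line input string is assumed.

```javascript
// Pattern for X = 3, read left to right:
//   .{1,3}            take 1 to 3 characters (greedy, so 3 is tried first)
//   (?=(.{3})+(?!.))  but only if what follows is one or more complete
//                     3-character groups with nothing left over
//   |.{1,3}$          otherwise take the final 1 to 3 characters
const thousands = /.{1,3}(?=(.{3})+(?!.))|.{1,3}$/g

const tokens = '1233232'.match(thousands)
console.log(tokens)           // ['1', '233', '232']
console.log(tokens.join(',')) // '1,233,232'
```

The lookahead is what pushes the short token to the front: the first match shrinks until the remaining characters divide evenly into groups of three.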
Let’s split a number in hexadecimal format into 2-character tokens. The first token can have 1 or 2 characters.
console.log('fab22883b0ada0'.match(/.{1,2}(?=(.{2})+(?!.))|.{1,2}$/g))
console.log('ab22883b0ada0'.match(/.{1,2}(?=(.{2})+(?!.))|.{1,2}$/g))
This prints:
['fa', 'b2', '28', '83', 'b0', 'ad', 'a0']
['a', 'b2', '28', '83', 'b0', 'ad', 'a0']
Let’s split a number in binary format into 4-character tokens. The first token can have 1 to 4 characters.
console.log('1110101010101101'.match(/.{1,4}(?=(.{4})+(?!.))|.{1,4}$/g))
console.log('10101010101101'.match(/.{1,4}(?=(.{4})+(?!.))|.{1,4}$/g))
This prints:
['1110', '1010', '1010', '1101']
['10', '1010', '1010', '1101']
Alternatively, you can build the regular expression for any token length using the JavaScript RegExp class.
// Splits input into tokens of len characters; only the first token may be shorter
function split(input, len) {
  return input.match(new RegExp('.{1,' + len + '}(?=(.{' + len + '})+(?!.))|.{1,' + len + '}$', 'g'))
}
console.log(split('11010101101', 4))
console.log(split('ab22883b0ada0', 2))
This prints:
['110', '1010', '1101']
['a', 'b2', '28', '83', 'b0', 'ad', 'a0']
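The tokens can then be joined to produce the formatted numbers from the beginning of the post. The following is a minimal sketch: groupDigits is a hypothetical helper name, and the fallback to an empty array is an added assumption, since match() returns null when nothing matches (for example, on an empty string).

```javascript
// Hypothetical helper (not part of the original post): splits `input` into
// `len`-character tokens aligned to the end, then joins them with `separator`.
function groupDigits(input, len, separator) {
  const pattern = new RegExp('.{1,' + len + '}(?=(.{' + len + '})+(?!.))|.{1,' + len + '}$', 'g')
  // match() returns null when there is no match (e.g. empty input)
  const tokens = input.match(pattern) || []
  return tokens.join(separator)
}

console.log(groupDigits('1233232', 3, ','))          // '1,233,232'
console.log(groupDigits('1011010010101001', 4, ' ')) // '1011 0100 1010 1001'
// The 0x prefix is kept out of the grouping and added back afterwards
console.log('0x' + groupDigits('fab22883b0ada0', 2, ' ')) // '0xfa b2 28 83 b0 ad a0'
```

Keeping the prefix outside the helper avoids grouping the characters 0x together with the digits.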