String Split With Unicode

First off, I have been searching the web for this solution. How to: <''.split(''); > ['','',''] This simply expresses what I'd like to do. But also with other Unic

Solution 1:

As explained in "JavaScript has a Unicode problem", in ES6 you can do this quite easily by using the new ... spread operator. This causes the string iterator (another new ES6 feature) to be used internally, and because that iterator is designed to deal with code points rather than UCS-2/UTF-16 code units, it works the way you want:

console.log([...'💩💩']);
// → ['💩', '💩']

Try it out here: https://babeljs.io/repl/#?experimental=true&evaluate=true&loose=false&spec=false&code=console.log%28%0A%20%20%5B%2e%2e%2e%27%F0%9F%92%A9%F0%9F%92%A9%27%5D%0A%29%3B

A more generic solution:

function splitStringByCodePoint(string) {
  return [...string];
}

console.log(splitStringByCodePoint('💩💩'));
// → ['💩', '💩']
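
To see why this differs from split(''), it can help to compare code units and code points side by side. The following is only a small sketch; the helper names splitByCodeUnit and splitByCodePoint are made up for illustration, and Array.from is used here because it relies on the same string iterator as the spread version above.

```javascript
// Hypothetical helper for comparison: splits by UTF-16 code unit,
// which is what String.prototype.split('') does.
function splitByCodeUnit(string) {
  return string.split('');
}

// Hypothetical helper: Array.from uses the string iterator,
// so it yields whole code points.
function splitByCodePoint(string) {
  return Array.from(string);
}

const poo = '\u{1F4A9}'; // 💩, stored as the surrogate pair \uD83D \uDCA9

console.log(poo.length);                       // 2 (code units)
console.log(splitByCodeUnit(poo).length);      // 2 (two lone surrogates)
console.log(splitByCodePoint(poo).length);     // 1 (one code point)
console.log(poo.codePointAt(0).toString(16));  // 1f4a9
```

The split('') result contains two lone surrogate halves, which is why they do not display as the original character.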

Solution 2:

A for...of loop can also iterate over a string that contains Unicode characters, one code point at a time:

let string = "😀😃😄😁😆😅🤣😂🙂🙃😉😊😇";
for (const c of string)
    console.log(c);

Solution 3:

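The above solutions work well for emojis that are a single code point (including those stored as surrogate pairs), but not for ones built from several code points, such as a base character followed by a variation selector, combining marks, or zero-width-joiner sequences.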

For example:

splitStringByCodePoint("❤️")
//Returns: [ "❤", "️" ]

To handle these cases properly you'll need a purpose-built library, for example:

https://github.com/dotcypress/runes

https://github.com/essdot/spliddit
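
As an alternative to the libraries above (and not something the original answers mention), newer runtimes ship Intl.Segmenter, which can split by grapheme cluster. This is a minimal sketch that assumes the runtime supports Intl.Segmenter; the helper name splitByGrapheme is hypothetical.

```javascript
// Assumption: Intl.Segmenter is available in this runtime.
// Hypothetical helper: splits a string into user-perceived characters.
function splitByGrapheme(string) {
  const segmenter = new Intl.Segmenter(undefined, { granularity: 'grapheme' });
  // Each item produced by segment() has a `segment` property holding the text.
  return Array.from(segmenter.segment(string), (item) => item.segment);
}

const heart = '\u2764\uFE0F'; // ❤ followed by variation selector-16

console.log([...heart].length);             // 2 (two code points)
console.log(splitByGrapheme(heart).length); // 1 (one grapheme cluster)
```

Where Intl.Segmenter is not available, the libraries listed above remain the safer choice.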
