
When dealing with user-generated content, there's a high chance that you have to deal with strings full of Emojis. Emoji rendering can come with challenges, so you may want to detect when strings include Emojis and replace them with images.

Let's find out how to spot all these cute symbols!

There are Emoji edge cases when using the described Unicode property escapes. Make sure to read to the end of the article!

How to detect Emojis with JavaScript regular expressions?

Luckily, JavaScript regular expressions come with a Unicode mode these days, enabled with the u flag.

There's more to it, though. When you enable Unicode mode in a regular expression, you can use Unicode property escapes. Unicode property escapes (\p{} or \P{}) allow you to match Unicode characters based on their properties and characteristics.

That's right; you can match currency symbols, non-Latin characters, and, you guessed it, Emojis!
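Before getting to Emojis, here's a quick sketch of the other two cases. It assumes the general category Sc (Currency_Symbol) for currency symbols and picks Greek as one arbitrary non-Latin script; both choices are mine for illustration.

```javascript
// Currency symbols: the general category "Sc" (Currency_Symbol)
const currencyRegex = /\p{Sc}/u;
currencyRegex.test('€'); // true
currencyRegex.test('E'); // false

// Characters from a specific script, here Greek
const greekRegex = /\p{Script=Greek}/u;
greekRegex.test('λ'); // true
greekRegex.test('l'); // false
```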

Here's an example snippet:

const emojiRegex = /\p{Emoji}/u;
emojiRegex.test('⭐'); // true

// The capital 'P' negates the match
const noEmojiRegex = /\P{Emoji}/u;
noEmojiRegex.test('⭐'); // false

If you want to replace and alter Emojis in JavaScript strings, you can do that with String.replaceAll, too.

// Note the 'g' flag to replace all Emojis
'🙈–👍–⭐'.replaceAll(/\p{Emoji}/ug, '_'); // '_–_–_'
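Since the goal from the intro was to swap Emojis for images, here's a minimal sketch of how that could look with a replacer function. The helper name emojiToImages and the "/emoji/<hex>.png" path are made up for illustration, and it reuses the plain \p{Emoji} pattern, so it inherits the edge cases covered later in this article.

```javascript
// Hypothetical helper: swaps every Emoji code point for an <img> tag.
// The "/emoji/<hex>.png" path is an invented naming scheme, not a real API.
function emojiToImages(text) {
  return text.replaceAll(/\p{Emoji}/gu, (match) => {
    // Name the image after the hexadecimal code point of the match
    const hex = match.codePointAt(0).toString(16);
    return `<img src="/emoji/${hex}.png" alt="${match}">`;
  });
}

emojiToImages('Nice ⭐');
// 'Nice <img src="/emoji/2b50.png" alt="⭐">'
```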

The browser support for Unicode property escapes looks pretty good, too! 🎉

MDN Compat Data (source)
Browser support info for Unicode character class escape: `\p{...}`, `\P{...}`

chrome: 64
chrome_android: 64
edge: 79
firefox: 78
firefox_android: 78
safari: 11.1
safari_ios: 11.1
samsunginternet_android: 9.0
webview_android: 64
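If you still need to care about older engines, a feature check could look like the sketch below. The function name is hypothetical; the trick is only that an unsupported pattern throws a SyntaxError when it's compiled.

```javascript
// Hypothetical helper: returns true if the engine can compile
// a Unicode property escape.
function supportsUnicodePropertyEscapes() {
  try {
    // Build the pattern from a string so that engines without support
    // throw here at runtime instead of failing to parse the whole script.
    new RegExp('\\p{Emoji}', 'u');
    return true;
  } catch (error) {
    return false;
  }
}

supportsUnicodePropertyEscapes(); // true in any of the browsers listed above
```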

Unfortunately, as always, it's more complicated than that. Before going all in with \p{Emoji}, let's dig deeper!

But what counts as an Emoji?

After publishing this blog post, someone reached out to point out that \p{Emoji} matches digits and other characters, too. 😲

const emojiRegex = /\p{Emoji}/u;
emojiRegex.test('1'); // true
emojiRegex.test('*'); // true
emojiRegex.test('#'); // true

You probably don't want to include these code points in your Emoji detection because they're usually displayed as normal text characters.
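One way to narrow the match, offered as a sketch rather than a complete fix, is the Extended_Pictographic property. That's my suggestion, not something the reader who reached out proposed. It skips digits, * and #, but it also skips flags and keycap sequences, and it still matches text-like symbols such as ©.

```javascript
const pictographicRegex = /\p{Extended_Pictographic}/u;
pictographicRegex.test('1'); // false
pictographicRegex.test('*'); // false
pictographicRegex.test('#'); // false
pictographicRegex.test('\u2B50'); // true (U+2B50 is the star Emoji)
```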

What counts as Emoji and what doesn't, then?

I'd say every tiny rendered comic icon counts, but unfortunately Emoji rendering depends on the operating system and the installed fonts. Just because you see a cute Emoji in front of you, it doesn't mean that someone else sees it, too.

And to make it more complicated: just because you see one rendered Emoji image, it doesn't mean that it's a single code point. It can be a combination of multiple Emojis and special characters.
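To make that tangible, here's a small sketch that counts code points. The strings are written as escape sequences so that the invisible parts stay visible in the source; the variable names are only for illustration.

```javascript
// Thumbs up (U+1F44D) followed by a skin tone modifier (U+1F3FD)
const thumbsUp = '\u{1F44D}\u{1F3FD}';
const thumbsUpCodePoints = [...thumbsUp]; // spreading a string iterates by code point
thumbsUpCodePoints.length; // 2
thumbsUp.length; // 4, because length counts UTF-16 code units

// Keycap "1": the digit 1 + variation selector-16 + combining enclosing keycap
const keycapOne = '1\uFE0F\u20E3';
const keycapCodePoints = [...keycapOne];
keycapCodePoints.length; // 3
```

The keycap sequence also hints at why a plain digit carries the Emoji property: it's the base of an Emoji sequence.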

If you have comments on Emoji detection in JavaScript, please give me a shoutout on Twitter or write me a good old email. I'm keen on learning more about it!

Mathias Bynens pointed out that there are shortcomings with this approach of Emoji detection. A property escape such as \p{Emoji} matches every single Emoji code point, and this can be a problem.

Let's have a look at an example:

"👨‍👩‍👧‍👦".replaceAll(/\p{Emoji}/gu, '-'); // '----' (the invisible zero-width joiners between the dashes remain)

Various Emojis, such as the "Family" one, are rendered as a single symbol but consist of more than one code point. Unicode property escapes match each of these code points individually, so you might run into unexpected behavior.
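If you'd rather treat such a sequence as one unit, one possible approach is to split the string into grapheme clusters first. This is a rough sketch under a few assumptions: Intl.Segmenter is available in your runtime, the helper name is made up, and Extended_Pictographic is "good enough" as the Emoji check (it misses flags and keycaps).

```javascript
// Hypothetical helper: replaces each user-perceived character (grapheme
// cluster) that contains a pictographic code point with `replacement`.
function replaceEmojiClusters(text, replacement) {
  const segmenter = new Intl.Segmenter('en', { granularity: 'grapheme' });
  let result = '';
  for (const { segment } of segmenter.segment(text)) {
    // Keep non-Emoji clusters untouched, swap Emoji clusters as a whole
    result += /\p{Extended_Pictographic}/u.test(segment) ? replacement : segment;
  }
  return result;
}

// The "Family" Emoji: four people joined by zero-width joiners (U+200D)
const family = '\u{1F468}\u200D\u{1F469}\u200D\u{1F467}\u200D\u{1F466}';
replaceEmojiClusters(`a${family}b`, '-'); // 'a-b'
```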

If you wonder what could count as an Emoji, have a look at this extensive list.

There's a reason why Mathias' emoji-regex package has 49 million weekly downloads, so make sure to check it out!


About Stefan Judis

Frontend nerd with over ten years of experience, freelance dev, "Today I Learned" blogger, conference speaker, and Open Source maintainer.
