Exploring the Cultural Phenomenon of Emojis through Emojiset Mining

Emojis are everywhere. You see them on social media platforms, in text messages and blogs, and they even exist in court cases.

Salem Othman, Ph.D., Assistant Professor of Computer Science and Networking in the School of Computing and Data Science, recognized not only the appeal of emojis to his students but also their changing impact on how we communicate. In September 2019, he decided to dive deeper into the topic and investigate this new form of human communication through scientific research in the form of Emojiset Mining.

Othman describes an Emojiset as the set of emoji sequences found within a text. For example, an Emojiset might contain four sequences of lengths three, one, one, and two, respectively. Another Emojiset could show two occurrences of emoji sequences separated by text. The emoji sequences of an Emojiset express language that can be used to understand and translate the message behind the written text.
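To make the definition concrete, here is a minimal Python sketch that pulls runs of consecutive emoji out of a string. It is an illustration only, not the project's actual extractor: the function name `extract_emojiset`, the sample sentence, and the code point ranges are all assumptions, and the ranges are a rough approximation that ignores skin-tone modifiers, flags, and joined (ZWJ) emoji.

```python
import re
from typing import List

# Rough approximation of emoji code points (an assumption for illustration;
# this is NOT the complete Unicode emoji list).
EMOJI_RUN = re.compile("[\U0001F300-\U0001FAFF\u2600-\u27BF]+")


def extract_emojiset(text: str) -> List[str]:
    """Return the runs of consecutive emoji found in text, in order."""
    return EMOJI_RUN.findall(text)


# Hypothetical sample text: four emoji sequences separated by words.
sample = (
    "Great game \U0001F600\U0001F389\U0001F525 "  # grinning face, party popper, fire
    "see you \U0001F44B "                          # waving hand
    "then \U0001F600 "                             # grinning face
    "bye \u2764\U0001F44D"                         # heart, thumbs up
)

emojiset = extract_emojiset(sample)
lengths = [len(seq) for seq in emojiset]
print(lengths)  # [3, 1, 1, 2]
```

With this sample, the Emojiset has four sequences of lengths three, one, one, and two, mirroring the example described above.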

Othman developed Emojiset Mining as a technique to extract emoji sequences from text and to find hidden patterns among the emoji sequences in a given Emojiset. His research objectives include automatically extracting Emojisets from a given text, mining their underlying patterns and trends, and understanding how cultural background, demographic characteristics, and individual psychological characteristics influence the use of Emojisets. He also aims to understand how Emojisets convey semantic meaning in communication, how incorporating them can improve sentiment analysis, text classification, and text clustering, and what underlying structure they have, using co-occurrence networks.
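As a rough illustration of the co-occurrence idea, the Python sketch below counts how often two different emoji appear in the same message. The article does not say how the project defines its co-occurrence networks, so treating "appears in the same message" as an edge is an assumption, and the function name `cooccurrence_counts` and the sample data are hypothetical.

```python
from collections import Counter
from itertools import combinations
from typing import Dict, List, Tuple


def cooccurrence_counts(emojisets: List[List[str]]) -> Dict[Tuple[str, str], int]:
    """Count, for each pair of distinct emoji, the messages containing both.

    Each element of `emojisets` is one message's Emojiset, given as a list
    of emoji sequences (strings). Assumes one code point per emoji.
    """
    counts: Counter = Counter()
    for emojiset in emojisets:
        # Distinct emoji in this message, sorted so each pair has one canonical order.
        emojis = sorted(set("".join(emojiset)))
        for a, b in combinations(emojis, 2):
            counts[(a, b)] += 1
    return counts


# Hypothetical Emojisets from three messages.
messages = [
    ["\U0001F600\U0001F389", "\U0001F525"],  # grinning face + party popper, then fire
    ["\U0001F600", "\U0001F525"],            # grinning face, then fire
    ["\U0001F389"],                          # party popper alone
]

edges = cooccurrence_counts(messages)
for (a, b), weight in edges.most_common():
    print(a, b, weight)
```

Each counted pair can be read as a weighted edge between two emoji nodes; a graph library could then be used to examine the resulting structure.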

“Extracting Emojisets is not an easy task to be done automatically,” said Othman. “It is very difficult to extract meaningful emoji sequences from text with a high level of accuracy.” 

His research has involved several Wentworth co-op students, giving them the opportunity to learn skills needed in industry as well as in academia, should they choose to pursue graduate studies or a Ph.D.

“My co-op students have helped in numerous ways from creating the website to turning my ideas and concept for the user interface into a reality,” said Othman. 

Othman has published four papers at well-known conferences, with his students as first authors.

The Emojiset Mining team created a website that can pull tweets from Twitter and extract their Emojisets. The website provides small or large datasets of tweets, which can be emailed to users or downloaded directly from the site. The ongoing research project will help researchers who lack technical skills pull data from social media platforms in a simple and convenient way.