Document Type : Research Paper
Authors
1 Master of Computational Qur'an Mining, Interdisciplinary Qur’anic Studies Research Institute, Shahid Beheshti University, Tehran, Iran
2 Associate Professor, Computer Science and Engineering Department, Shahid Beheshti University, Tehran, Iran
3 Assistant Professor, Interdisciplinary Quranic Studies Research Institute, Shahid Beheshti University, Tehran, Iran
Emotions, defined as relatively brief responses to external or internal events perceived as significant (Martin 2009), are difficult to detect automatically, and emotion detection remains a complex challenge in natural language processing (NLP). The task is to identify the emotions expressed in textual data, an increasingly important area of research given the massive volume of text now being generated.
Automated emotion detection has diverse practical applications, spanning customer care, healthcare, religious education, and more. In customer service, employing decision support systems (Kratzwald et al. 2018) allows real-time analysis of customer messages, adjusting responses to address their needs promptly. This enhances understanding, satisfaction, and loyalty. For instance, if a customer expresses frustration, prompt response can prevent dissatisfaction and potential customer loss. Emotion recognition in text facilitates personalized and empathetic customer interactions, improving overall satisfaction and loyalty (Kusal et al. 2022).
Beyond customer service, emotion detection from text has applications in financial decision support systems, education, mental health monitoring, and social media analysis. It aids in sentiment analysis related to investments, provides personalized support in education, detects early signs of mental health issues, and helps identify harmful behaviors in social media. In healthcare, emotion detection contributes to mental health diagnosis, patient well-being monitoring, and enhanced communication between patients and healthcare providers (Dhuheir et al. 2021).
Despite the complexity of human emotions and the nuances of language, recent advances in machine learning, especially transformer-based models like BERT, have achieved state-of-the-art results. These models, capable of capturing intricate word-context relationships, offer improved performance across various NLP applications.
This study focuses on emotion detection in the text of the Qur’an. The Qur’an, revered as the holy book of Islam, holds a unique and central position in the lives of millions worldwide. Beyond its religious significance, the Qur’an encompasses a wealth of human experiences, emotions, and profound wisdom. Understanding the emotions expressed in its verses can provide deeper insight into the human aspects of the divine message. While delivering spiritual guidance, the Qur’an also delves into the various dimensions of human experience. It narrates stories of joy, sorrow, fear, anger, and repentance. Recognizing and understanding these emotions is crucial for believers seeking a comprehensive understanding of the human condition as depicted in the divine revelations.
Emotions serve as a powerful means of connecting with the spiritual teachings of the Qur’an on a personal level. Detecting and interpreting the emotional nuances in the verses allows individuals to relate to the scripture more deeply, fostering a personal and heartfelt connection with the divine message. Emotion detection in the Qur’an aids scholars, students, and believers in interpreting the verses more comprehensively. It provides a nuanced understanding of the emotional states conveyed in different contexts, enriching the interpretation process. Reflecting on these emotions encourages a more profound engagement with the sacred text.
Emotion detection can also have practical applications in pastoral care and counseling within Islamic communities. Identifying emotional themes in the Qur’anic verses enables religious leaders and counselors to offer guidance and support that resonates with the emotional struggles and triumphs of individuals. The Qur’an addresses not only the spiritual realm but also the emotional well-being of individuals.
By detecting and understanding emotions within its verses, one can draw insights into promoting emotional resilience, finding solace, and seeking strength during challenging times. The Qur’an's universality extends across diverse cultures and languages. Emotion detection in its verses can aid in bridging cultural gaps by revealing common emotional threads that connect humanity. This inclusivity promotes a deeper appreciation of the Qur’an's relevance to people from various cultural backgrounds.
This paper combines part-of-speech (POS) and dependency parsing tags, used as syntactic features, with transformer-based models for emotion detection from text. It presents a literature review, the proposed method, and a performance comparison with top models on the ISEAR dataset. Our study underscores the significance of syntactic features in enhancing emotion detection accuracy. By showcasing their effectiveness in transformer-based language models, we aim to contribute to the development of more accurate and reliable systems for emotion detection from text. Finally, employing emotion detection techniques in the Qur’an serves as a gateway to unlocking profound insights into the human experience as portrayed in divine revelations. It enhances personal connection, aids interpretation, supports pastoral care, and contributes to the emotional well-being of believers. The Qur’an's timeless wisdom is not only a source of spiritual guidance but also a testament to the intricate tapestry of human emotions as recognized by the divine.
In a study by Polignano et al. (2019), a novel model was proposed, integrating Bi-LSTM, CNN deep neural networks, and self-attention mechanisms. Evaluation on three datasets involved varying word embeddings—Google word embeddings (GoogleEmb), GloVe (GloVeEmb), and FastText (FastTextEmb). The results demonstrated the superiority of the proposed model, particularly when utilizing FastText word embeddings, showcasing its effectiveness in emotion identification.
Adoma, Henry and Chen (2020) explored the use of pretrained language models, including BERT, RoBERTa, XLNet, and DistilBERT, for emotion detection on the ISEAR dataset. Their findings revealed that RoBERTa outperformed the others, achieving a recognition accuracy of 0.7431. Precision, recall, and F1-score analyses further supported RoBERTa as an optimal choice for emotion detection on the ISEAR dataset. In another investigation (Adoma et al. 2020), a two-stage architecture involving transformers and BiLSTM was employed for emotion detection on the ISEAR dataset. The first stage fine-tuned the pretrained BERT model to extract vector representations; in the second stage, the extracted vectors were fed into a BiLSTM-based classifier. The BiLSTM outperformed BERT, yielding an accuracy of 72.64%. Acheampong et al. (2021) focused on BERT, RoBERTa, and XLNet, conducting 5-fold cross-validation and selecting RoBERTa and XLNet with accuracy values of 0.736 and 0.711, respectively. They trained each model individually on the ISEAR dataset and created an ensemble model by averaging predictions, demonstrating superior performance with an F1-score of 0.75.
In a unique approach, Zanwar et al. (2022) proposed models leveraging transformer architectures in conjunction with Bidirectional Long Short-Term Memory networks, incorporating 435 psycholinguistic features. Their hybrid models, BERT+PsyLing and RoBERTa+PsyLing, surpassed standard transformer-based baseline models in text-based emotion detection. These studies collectively highlight the evolving landscape of emotion detection models, showcasing the impact of novel architectures, pretrained language models, and hybrid approaches in advancing the accuracy and efficacy of emotion identification in textual data.
In the following, we highlight several studies related to emotions in the Qur’an. To the best of our knowledge, no prior work has applied artificial intelligence or machine learning to this problem, which underscores the novelty of this paper. Saeedi (2010) considers emotion a negative factor in behaviour and focuses primarily on proper ways to control and manage emotions using the Qur’an. Melli (2010) first examines the meaning, concept, place, and dimensions of emotions from the perspectives of the Qur’an and psychology. The thesis considers the four primary emotions of anger, joy, sadness, and fear, providing definitions and examining their types, effects, and influencing factors, and then introduces Qur’anic techniques for managing these four emotions. Karami et al. (2020) categorize emotions into positive and negative according to Piaget's theory. Their paper analyses emotional management and techniques in both the Qur’an and Western psychology based on Piaget's theory, and further reviews Piaget's perspective in light of Qur’anic foundations for emotion control.
Zomorodi (2012) explores Qur’anic and Hadith-based solutions for emotional management. KavianiArani (2016) delves into "Feeling and Perception from the Qur’anic Perspective," categorizing feelings into external and internal types. Specifically, Kaviani examines the external type, encompassing the five senses, in Qur’anic verses. Hoseini Mohammadabad et al. (2019) compare the role of emotions in ethical education, considering the approaches of "Dell Mashghooli" (engagement of the heart) and the ethical teachings of the Qur’an. They incorporate emotions alongside reason in ethical education, suggesting that emotions are not solely positive factors. Their work aims to establish a solution for training ethical emotions based on religious teachings.
This paper introduces a two-part model for emotion detection, combining the power of RoBERTa, a pretrained language model, with the insights derived from syntactic features. The model consists of two primary sub-models, each handling distinct aspects of input data.
The first sub-model utilizes RoBERTa as the backbone, processing the main text as its input. RoBERTa, known for its robust language understanding capabilities, operates on the original text to extract rich contextual features.
The second sub-model incorporates an embedding layer paired with either Parts of Speech (POS) or Dependency tags of the input text. This dual-channel approach involves sending the original text through RoBERTa and the syntactic information through an embedding layer. The syntactic input then passes through a Bidirectional Long Short-Term Memory (BiLSTM) network, enabling the extraction of intricate patterns from the syntactic features. The extracted features from both channels are concatenated to form a comprehensive feature representation.
The combined feature representation undergoes classification through a Dense layer, resulting in the categorization of the text into one of seven emotion classes. This two-part architecture is visually represented in Figure 1, illustrating the integration of RoBERTa with syntactic features processing through embedding and BiLSTM layers. The synergy between these components aims to enhance the model's ability to capture both contextual and syntactic nuances, leading to improved emotion detection performance.
Figure 1. Proposed Model Architecture for Emotion Detection: The proposed model comprises two main components: (1) a pretrained RoBERTa language model processing the main text input and (2) an embedding layer and Bidirectional Long Short-Term Memory (BiLSTM) network handling Parts of Speech (POS) or Dependency tags of the input. The architecture integrates these two channels, leveraging RoBERTa's contextual understanding and the syntactic insights derived from POS or Dependency tags. The resulting feature representations are concatenated and fed into a Dense layer for accurate classification into seven emotion classes.
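The data flow of Figure 1 can also be written out with explicit dimensions. The block below is only a sketch under stated assumptions: the sizes (hidden size 768, tag embedding dimension 6, 64 BiLSTM units) are those reported in the experimental setup, while the symbol names, the use of a single pooled RoBERTa vector, the concatenation of the two 64-unit LSTM directions into 128 dimensions, and the softmax output are assumptions that the text does not state explicitly.

```latex
\[
\begin{aligned}
\mathbf{h}_{\text{text}} &= \mathrm{RoBERTa}(x_{1:n}) \in \mathbb{R}^{768}
  && \text{(contextual features of the main text)} \\
\mathbf{e}_t &= \mathbf{E}[s_t] \in \mathbb{R}^{6}, \quad t = 1, \dots, m
  && \text{(embedding of the $t$-th POS or dependency tag)} \\
\mathbf{h}_{\text{syn}} &= \mathrm{BiLSTM}(\mathbf{e}_{1:m}) \in \mathbb{R}^{2 \cdot 64}
  && \text{(syntactic features)} \\
\mathbf{z} &= [\mathbf{h}_{\text{text}} ; \mathbf{h}_{\text{syn}}] \in \mathbb{R}^{896}
  && \text{(concatenated representation)} \\
\hat{\mathbf{y}} &= \mathrm{softmax}(\mathbf{W}\mathbf{z} + \mathbf{b}) \in \mathbb{R}^{7}
  && \text{(Dense layer over the seven emotion classes)}
\end{aligned}
\]
```

Under these assumptions the syntactic channel adds only 128 dimensions to the 768-dimensional text representation, so the classifier remains dominated by the RoBERTa features.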
In this section, we describe the experiments conducted and analyze the results. The initial stage in any Natural Language Processing (NLP) task is the preprocessing of input data. Here, preprocessing consists of removing duplicates, punctuation, and extra spaces, and converting all characters to lowercase. Additionally, contracted forms, such as "isn't," are expanded into their open forms (e.g., "is not"). The ISEAR dataset comprises 7,666 sentences across seven emotion classes: joy, sadness, anger, fear, disgust, shame, and guilt, labeled from 1 to 7. After preprocessing and duplicate removal, the dataset is reduced to 7,468 samples. The distribution of samples before and after preprocessing is detailed in Table 1.
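The preprocessing steps listed above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names and the small contraction table are assumptions, as is the choice to treat two samples as duplicates when both the cleaned text and the label match.

```python
import re
import string

# Illustrative contraction table (an assumption; the paper does not list
# the contracted forms it expands).
CONTRACTIONS = {
    "isn't": "is not",
    "don't": "do not",
    "can't": "cannot",
    "i'm": "i am",
    "it's": "it is",
}


def preprocess(sentence: str) -> str:
    """Lowercase, expand contractions, strip punctuation, collapse spaces."""
    text = sentence.lower()
    # Normalise typographic apostrophes so the lookup below can match.
    text = text.replace("\u2019", "'")
    # Expand contractions first: the apostrophe is needed to recognise them.
    for short, full in CONTRACTIONS.items():
        text = re.sub(r"\b" + re.escape(short) + r"\b", full, text)
    # Remove ASCII punctuation characters.
    text = text.translate(str.maketrans("", "", string.punctuation))
    # Collapse runs of whitespace into single spaces.
    return " ".join(text.split())


def preprocess_corpus(samples):
    """Clean (text, label) pairs and drop duplicates, keeping the first one."""
    seen = set()
    cleaned = []
    for text, label in samples:
        key = (preprocess(text), label)
        if key not in seen:
            seen.add(key)
            cleaned.append(key)
    return cleaned
```

Duplicates are removed after cleaning rather than before, so that sentences differing only in case, punctuation, or contraction form collapse into one sample.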
While our discussion has primarily centered around broader emotions, it is essential to elaborate on specific chosen emotions. In this dedicated subsection, we provide an inventory of selected emotions, based on relevant sources, along with detailed explanations for each.
Table 1. ISEAR dataset characteristics
The ISEAR dataset encompasses emotional experiences in seven distinct categories: joy, sadness, anger, fear, disgust, shame, and guilt.
While these emotions are distinct in the ISEAR dataset, they may overlap and intersect. Emotions are complex and multifaceted, often intertwining and influencing one another. For instance, an individual might experience a combination of anger and sadness in response to a specific event, or joy and guilt might coexist in situations involving conflicting ethical values. Additionally, emotions can be triggered by various stimuli, including events, situations, or thoughts. The ISEAR dataset provides additional context to understand the reported emotional experiences of participants.
In contrast to many emotion detection datasets sourced from social media, which are often limited to specific topics and applications, the ISEAR dataset captures emotional experiences across diverse individuals in various conditions and contexts. Close to 3000 participants from different cultural backgrounds contributed to this dataset, making it a rich source for training models and evaluating emotion detection techniques. Given its substantial volume of data, it is used here to test and validate the proposed model. Sample labels of Qur’anic verses, based on the emotions defined in the ISEAR dataset, are presented in Table 2.
Table 2. Emotion Labeling of Qur’anic Verses Based on ISEAR Database Categories. The labeling type: 'T' indicates labeling used in training, and 'P' indicates predictions based on the proposed model.
The subsequent stage converts the input to part-of-speech (POS) or dependency parsing tags. For each sequence, all words are replaced with their respective POS or dependency parsing tags. The NLTK library (Bird et al. 2009) is used for POS tagging, while dependency parsing tags are extracted using spaCy (Boyd 2023). To make the text usable by a deep learning model, the RoBERTa tokenizer is applied to the main text for the first channel. The second channel employs a Keras embedding layer to convert the tag sequences into dense vectors.
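The word-to-tag conversion for the second channel can be sketched as below. The sketch is an assumption-laden illustration: `toy_tagger` is a hypothetical stand-in with a hard-coded lookup so that the example runs without model downloads, and the vocabulary-building helpers are not taken from the paper. In practice a real tagger such as NLTK's `nltk.pos_tag`, which likewise takes a list of tokens and returns (word, tag) pairs, would be passed in instead.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Tagger = Callable[[Sequence[str]], List[Tuple[str, str]]]


def toy_tagger(tokens: Sequence[str]) -> List[Tuple[str, str]]:
    """Hypothetical stand-in for a real POS tagger (illustration only)."""
    lookup = {"i": "PRP", "felt": "VBD", "very": "RB",
              "happy": "JJ", "afraid": "JJ"}
    return [(tok, lookup.get(tok, "NN")) for tok in tokens]


def to_tag_sequence(sentence: str, tagger: Tagger) -> List[str]:
    """Replace every word of the sentence with its syntactic tag."""
    tokens = sentence.split()
    return [tag for _, tag in tagger(tokens)]


def build_tag_vocab(tag_sequences: Sequence[Sequence[str]]) -> Dict[str, int]:
    """Assign each tag a positive integer id; 0 is reserved for padding."""
    vocab: Dict[str, int] = {}
    for seq in tag_sequences:
        for tag in seq:
            if tag not in vocab:
                vocab[tag] = len(vocab) + 1
    return vocab


def encode(seq: Sequence[str], vocab: Dict[str, int]) -> List[int]:
    """Turn a tag sequence into the integer ids an embedding layer expects."""
    return [vocab[tag] for tag in seq]
```

Because tag inventories are small (a few dozen symbols), the resulting vocabulary is tiny compared with a word vocabulary, which is consistent with the low embedding dimension used for this channel.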
The maximum sequence length for all inputs is set to 201, and to ensure uniformity, sequences are padded to a length of 210. Labels are mapped from 1 to 7. The RoBERTa-base model from Hugging Face is selected as the pretrained model, featuring 124 million trainable parameters in 12 encoder layers with a hidden size of 768. The dataset is randomly shuffled, with 20% reserved for testing and 80% for training.
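The padding, label mapping, and 80/20 split can be sketched in plain Python as follows. The helper names, the padding id 0, right-side padding, the shift of labels from 1..7 to 0..6, and the random seed are all assumptions made for illustration; the text only states the padded length and the split ratio.

```python
import random

PAD_ID = 0      # assumed padding id
MAX_LEN = 210   # padded length stated in the text


def pad(ids, max_len=MAX_LEN, pad_id=PAD_ID):
    """Right-pad (or truncate) a list of ids to exactly max_len entries."""
    ids = list(ids)[:max_len]
    return ids + [pad_id] * (max_len - len(ids))


def shift_labels(labels):
    """Map labels 1..7 to 0..6 (assumed target range for a 7-way classifier)."""
    return [label - 1 for label in labels]


def train_test_split(samples, test_fraction=0.2, seed=42):
    """Shuffle a copy of the samples, then split off the test fraction."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]
```

Applied to the 7,468 preprocessed samples, this split yields 5,974 training and 1,494 test samples.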
Training is completed in four epochs. In the first two epochs, the learning rate is set to 2e-5 with a decay of 1e-6, while the subsequent two epochs use a learning rate of 2e-7. Adam is employed as the optimizer, with an embedding layer dimension of 6, a batch size of 16, and 64 units in the BiLSTM layer. Model evaluation utilizes precision, recall, F1, and accuracy metrics. The classification report for the proposed model is presented in Tables 3 and 4.
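The two-phase learning rate and the evaluation metrics can be illustrated with a small, dependency-free sketch. The function names are hypothetical, epochs are indexed from zero, and the decay term is deliberately left out because the text does not specify its form.

```python
def learning_rate(epoch: int) -> float:
    """Epochs 0-1 use 2e-5; epochs 2-3 use 2e-7 (zero-based epoch index)."""
    return 2e-5 if epoch < 2 else 2e-7


def per_class_report(y_true, y_pred, labels):
    """Return overall accuracy and per-class precision, recall and F1."""
    pairs = list(zip(y_true, y_pred))
    report = {}
    for c in labels:
        tp = sum(1 for t, p in pairs if t == c and p == c)  # true positives
        fp = sum(1 for t, p in pairs if t != c and p == c)  # false positives
        fn = sum(1 for t, p in pairs if t == c and p != c)  # false negatives
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        report[c] = {"precision": precision, "recall": recall, "f1": f1}
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    return accuracy, report
```

Per-class scores matter here because overall accuracy can hide confusions between closely related classes such as shame and guilt.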
To extend the application of the proposed model to detect emotions and feelings in the Qur'an, the Itani English translation is chosen. Verses from the Maryam chapter are manually labeled based on the emotion labels from the ISEAR dataset. This labeled dataset is then used to train and evaluate the model's effectiveness. Three models with higher accuracy on the ISEAR dataset, namely RoBERTa, BERT, and XLNet, are selected for emotion detection in the Holy Qur'an. Results are detailed in Tables 5-7, with comprehensive outcomes for all Qur’anic verses presented in Table 8.
Table 3. Classification report of RoBERTa with POS embedding.
Table 4. Classification report of RoBERTa with dependency embedding.
Table 5. Results of emotion detection on the Maryam chapter with the RoBERTa model.
Table 6. Results of emotion detection on the Maryam chapter with the BERT model.
Table 7. Results of emotion detection on the Maryam chapter with the XLNet model.
Table 8. The number of verses in each of the seven labels of emotion (the total number of verses is 6237)
Figures 2-4 show the percentage of each emotion in the Qur’an according to the above models. In the following, we present the results categorized by Meccan and Medinan chapters. The classification of chapters into Meccan and Medinan is based on information obtained from the Shia Wiki website. Accordingly, the total number of verses in Medinan chapters is 1623, while the total number of verses in Meccan chapters is 4613. The results are illustrated graphically in Figures 5 and 6, and the detailed statistics are provided in Table 9.
Figure 2. Emotion Detection from Qur'an text with RoBERTa model.
Figure 3. Emotion Detection from Qur'an text with BERT model.
Figure 4. Emotion Detection from Qur'an text with XLNet model.
Figure 5. Distribution of Emotions in Medinan Chapters of the Qur’an.
Figure 6. Distribution of Emotions in Meccan Chapters of the Qur’an.
Table 9. Emotion Distribution in Meccan and Medinan Chapters. This table reports the distribution of identified emotions in the verses of Meccan and Medinan chapters of the Qur’an.
As evident from the graphical and tabular representations, despite variations in the number of verses for each emotion label, joy consistently dominates the emotional content of Qur’anic verses.
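The percentages behind such distributions can be computed from per-verse predictions as sketched below. This is an illustrative helper only: the function names and the (group, label) record format are assumptions, and no data from the paper is reproduced.

```python
from collections import Counter

EMOTIONS = ["joy", "sadness", "anger", "fear", "disgust", "shame", "guilt"]


def emotion_percentages(labels):
    """Return the percentage of verses assigned to each emotion label."""
    total = len(labels)
    if total == 0:
        return {emotion: 0.0 for emotion in EMOTIONS}
    counts = Counter(labels)
    return {emotion: 100.0 * counts[emotion] / total for emotion in EMOTIONS}


def percentages_by_group(records):
    """Group (group, label) pairs, e.g. by 'Meccan' or 'Medinan' chapter
    type, and compute the emotion percentages within each group."""
    grouped = {}
    for group, label in records:
        grouped.setdefault(group, []).append(label)
    return {group: emotion_percentages(ls) for group, ls in grouped.items()}
```

Normalizing within each group keeps the comparison fair, since the Meccan chapters contain far more verses than the Medinan ones.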
In this section, we compare the results of the proposed model with those of other transformer-based models on the ISEAR dataset. The comparison is detailed in Tables 10 and 11. Notably, since joy is the sole positive emotion in the ISEAR dataset, it is recognized best by all models. Recognition of joy is facilitated by its straightforward and sincere expression, often devoid of sarcasm or irony. For the fear label, the RoBERTa-dep model demonstrates superior performance. The dataset's most challenging predictions concern anger vs. disgust and shame vs. guilt. The samples within each of these pairs are closely aligned, posing difficulty even for human interpretation in certain cases. However, the inclusion of POS and dependency parsing tags notably reduces the misclassification rate within these pairs. For the shame and guilt classes, the proposed models obtain the best results among all available models, with an accuracy of 0.74 for guilt in RoBERTa-POS and 0.67 for shame in RoBERTa-dep.
Incorporating either POS or dependency tags improves on the RoBERTa baseline on this dataset. RoBERTa-dep achieves the highest performance, with 77% accuracy and the highest precision, recall, and F1 scores among all models. RoBERTa-POS, with 76% accuracy and F1, ranks second.
Furthermore, after evaluating the proposed model, we applied it to detect emotions in the text of the Qur’an. The results indicate that the models yield the most accurate outcomes for the joy label. Figure 7 shows a line graph of the fluctuation of joy across the largest chapter of the Qur’an. Notably, verses 12 to 30 exhibit minimal joy, as they predominantly describe the condition of hypocrites and disbelievers, whereas the surrounding verses, where joy peaks, elaborate on the qualities of the believers. These results are consistent with the context of the Holy Qur’an.
Table 10. Comparison of proposed model with other models in joy, fear, anger and sadness labels.
Table 11. Comparison of proposed model with other models in disgust, shame and guilt labels.
Figure 7. Line graph of the change in joy across the largest chapter of the Qur'an.
The comparative analysis offers several insights into the emotion detection models. Performance varies considerably across emotions, with joy consistently recognized more accurately than the others. The straightforward and sincere nature of joyous expressions contributes to its higher recognition accuracy compared to more nuanced emotions. The success of the RoBERTa-dep model in fear detection highlights the significance of syntactic features, specifically dependency parsing tags, in capturing subtle nuances associated with fear. This suggests that understanding the structural relationships between words and phrases can enhance the model's ability to distinguish between emotions.
The improvement achieved by RoBERTa-dep, which has the highest accuracy, precision, recall, and F1 scores, suggests the efficacy of combining dependency parsing tags with transformer-based models. This hybrid approach appears promising for enhancing emotion detection accuracy, particularly in datasets with closely related emotion categories.
The application of the proposed model to detect emotions in the Qur’an introduces a new dimension to the study. The alignment of results with the contextual themes of the Qur’anic verses, as illustrated in Figure 7, emphasizes the model's ability to capture and interpret emotions in a religious and spiritual context. This alignment with the scriptural context adds a layer of credibility to the model's interpretative capabilities, showcasing its potential utility in analyzing texts with deeper cultural and spiritual significance.
In summary, the evaluation of the proposed model on the ISEAR dataset, coupled with its application to Qur’anic text, highlights its potential for nuanced emotion detection. The integration of syntactic features, namely POS and dependency parsing tags, proves valuable in refining emotion recognition across a spectrum of emotions. These findings contribute to advancing the understanding and application of emotion detection models in diverse contexts, from psychological datasets to religious texts.
The primary objective of this paper was to advance the state of the art in emotion detection models applied to textual data, with a subsequent application of these models to discern emotions within Qur’anic verses. Among the various models considered, RoBERTa emerged as the most effective for this task. The incorporation of syntactic features, specifically part-of-speech (POS) and dependency parsing tags, played a pivotal role in refining the models. Through this approach, we achieved clear improvements over the baseline, culminating in a model with 77% accuracy on the ISEAR dataset.
Furthermore, the augmentation of deep learning models with external features, such as syntactic information, serves to enhance their generalizability. While deep learning excels at automatic feature extraction, its applicability is often closely tied to specific domains. The inclusion of external features mitigates this dependency, leading to increased accuracy and efficiency across diverse datasets.
The application of our proposed model to the English translation of the Qur’an yielded noteworthy and unexpected findings. In contrast to prevailing opinions characterizing the Qur’an as a somber text, the results revealed that joy and happiness are predominant emotions expressed in its verses. Importantly, this conclusion was derived systematically and autonomously, devoid of human bias. These findings underscore the potential of advanced emotion detection models, not only for enhancing accuracy within specific domains but also for challenging preconceived notions and for fostering a nuanced understanding of complex texts like the Qur’an.