Detecting emotional information usually begins with passive sensors that capture data about the user's physical state or behavior without interpreting the input.
A more practical approach, based on current technological capabilities, is the simulation of emotions in conversational agents in order to enrich and facilitate interactivity between human and machine.
[11] In psychology, cognitive science, and neuroscience, there have been two main approaches to describing how humans perceive and classify emotion: continuous and categorical.
Various changes in the autonomic nervous system can indirectly alter a person's speech, and affective technologies can leverage this information to recognize emotion.
Vocal parameters and prosodic features such as pitch variables and speech rate can be analyzed through pattern recognition techniques.
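A minimal sketch of such feature extraction, assuming the librosa library and a short recording; the specific features, parameter values, and file name are illustrative rather than those of any particular system:

```python
# Sketch: extracting simple prosodic features (pitch statistics, energy,
# and a rough speech-rate proxy) from a short recording using librosa.
import numpy as np
import librosa

def prosodic_features(path):
    y, sr = librosa.load(path, sr=16000)             # mono signal at 16 kHz
    f0 = librosa.yin(y, fmin=60, fmax=400, sr=sr)    # frame-wise pitch estimate (Hz)
    rms = librosa.feature.rms(y=y)[0]                # frame-wise energy
    onsets = librosa.onset.onset_detect(y=y, sr=sr)  # crude onset/syllable proxy
    duration = len(y) / sr
    return {
        "pitch_mean": float(np.mean(f0)),
        "pitch_std": float(np.std(f0)),              # pitch variability
        "energy_mean": float(np.mean(rms)),
        "speech_rate": len(onsets) / duration,       # onsets per second
    }

features = prosodic_features("utterance.wav")        # hypothetical input file
```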
[20] Detecting affect from speech or text requires a reliable database, knowledge base, or vector space model[21] broad enough to fit the needs of the application, as well as a classifier that allows quick and accurate emotion identification.
This creates one of the biggest challenges in detecting emotions from speech: choosing an appropriate database with which to train the classifier.
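A sketch of the classifier-training step, assuming a labeled emotional-speech corpus has already been converted into feature vectors (for example with the extraction above); the support-vector machine and the file names are illustrative choices, not those of any cited study:

```python
# Sketch: training and evaluating an emotion classifier on precomputed
# feature vectors from a labeled speech corpus.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.metrics import classification_report

X = np.load("speech_features.npy")   # hypothetical feature matrix (n_samples x n_features)
y = np.load("emotion_labels.npy")    # hypothetical labels such as "anger" or "joy"

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
clf.fit(X_train, y_train)
print(classification_report(y_test, clf.predict(X_test)))
```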
Moreover, data obtained in a natural context has lower signal quality, due to ambient noise and the distance of the subjects from the microphone.
[26][27] Likewise, producing one standard database for all emotional research would provide a method of evaluating and comparing different affect recognition systems.
It is therefore crucial to select only the most relevant features, both to ensure that the model can identify emotions successfully and to improve performance, which is particularly important for real-time detection.
[22] Identifying and removing redundant or undesirable features optimizes the system and increases the rate of correct emotion detection.
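A minimal sketch of one possible feature-selection step, continuing the classifier sketch above; the mutual-information criterion and the choice of k are illustrative assumptions:

```python
# Sketch: keeping only the k features most informative about the emotion
# labels, which shrinks the model and speeds up real-time classification.
from sklearn.feature_selection import SelectKBest, mutual_info_classif

selector = SelectKBest(score_func=mutual_info_classif, k=20)  # k chosen arbitrarily
X_train_sel = selector.fit_transform(X_train, y_train)        # fit on training data only
X_test_sel = selector.transform(X_test)                       # apply the same selection
kept = selector.get_support(indices=True)                     # indices of retained features
```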
Through cross-cultural research on members of the Fore tribe in Papua New Guinea at the end of the 1960s, Paul Ekman proposed the idea that facial expressions of emotion are not culturally determined but universal.
Psychologists have proposed a classification of six basic emotions according to their facial action units, where "+" means that the units occur together. As with every computational practice, affect detection by facial processing must overcome certain obstacles in order to fully unlock the potential of the algorithm or method employed.
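The six-emotion classification can be represented as a simple lookup. A minimal sketch using commonly cited action-unit prototypes follows; the exact combinations and intensity codes vary between sources, and the matching rule here is purely illustrative:

```python
# Sketch: commonly cited FACS action-unit prototypes for the basic emotions
# ("+" meaning the units occur together); combinations vary between sources.
AU_PROTOTYPES = {
    "happiness": {6, 12},
    "sadness":   {1, 4, 15},
    "surprise":  {1, 2, 5, 26},
    "fear":      {1, 2, 4, 5, 20, 26},
    "anger":     {4, 5, 7, 23},
    "disgust":   {9, 15, 16},
}

def match_emotion(detected_aus):
    """Return the prototype whose action units best overlap the detected set."""
    overlap = lambda proto: len(proto & detected_aus) / len(proto)
    return max(AU_PROTOTYPES, key=lambda emotion: overlap(AU_PROTOTYPES[emotion]))

print(match_emotion({6, 12, 25}))  # -> "happiness"
```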
As hardware evolves, more data are collected, and new discoveries are made and new practices introduced, the accuracy of facial affect detection improves, leaving noise as the remaining issue.
[35][36] Other challenges remain as well. Gestures can be used efficiently as a means of detecting a particular emotional state of the user, especially when used in conjunction with speech and face recognition.
Depending on the specific action, gestures can be simple reflexive responses, such as lifting the shoulders when one does not know the answer to a question, or complex and meaningful, as when communicating in sign language.
[39] The foremost method makes use of 3D information about key body parts in order to obtain several important parameters, such as palm position or joint angles.
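A minimal sketch of the kind of geometry involved, assuming a pose-estimation step has already produced 3D keypoint coordinates; the keypoint names and values are hypothetical:

```python
# Sketch: computing a joint angle (e.g., at the elbow) from three 3D keypoints
# such as shoulder, elbow, and wrist positions produced by pose estimation.
import numpy as np

def joint_angle(a, b, c):
    """Angle at point b (in degrees) formed by the segments b->a and b->c."""
    a, b, c = map(np.asarray, (a, b, c))
    u, v = a - b, c - b
    cos = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    return float(np.degrees(np.arccos(np.clip(cos, -1.0, 1.0))))

# Hypothetical shoulder, elbow, and wrist coordinates in metres.
print(joint_angle([0.0, 0.4, 0.0], [0.0, 0.1, 0.0], [0.2, 0.1, 0.2]))  # ~90 degrees
```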
If the subject experiences fear or is startled, their heart usually 'jumps' and beats quickly for some time, causing the amplitude of the cardiac cycle to increase.
As the subject calms down, and as the body's inner core expands, allowing more blood to flow back to the extremities, the cycle will return to normal.
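A rough sketch of how such a change might be read off a recorded cardiac signal, assuming a blood volume pulse trace sampled at a known rate; the peak-detection parameters and the "startle" rule are illustrative assumptions:

```python
# Sketch: estimating beat-to-beat heart rate from a blood volume pulse (BVP)
# trace so that a sudden rise can be flagged as a possible startle response.
import numpy as np
from scipy.signal import find_peaks

def heart_rate_bpm(bvp, fs):
    """fs: sampling rate in Hz; returns the instantaneous heart rate per beat."""
    peaks, _ = find_peaks(bvp, distance=int(0.4 * fs))  # refractory gap (~150 bpm cap)
    ibi = np.diff(peaks) / fs                           # inter-beat intervals in seconds
    return 60.0 / ibi

# rates = heart_rate_bpm(bvp_signal, fs=64)              # hypothetical 64 Hz wristband data
# startled = rates[-1] > rates.mean() + 2 * rates.std()  # crude "jump" detection
```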
As the sweat glands are activated, even before the skin feels sweaty, the level of the EDA can be captured (usually using conductance) and used to discern small changes in autonomic arousal.
To maximize comfort and reduce irritation, the electrodes can be placed on the wrist, legs, or feet, leaving the hands fully free for daily activity.
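A rough sketch of how such a signal might be processed, assuming a skin-conductance trace sampled at a few hertz; the smoothing window and amplitude threshold are illustrative assumptions rather than a validated decomposition:

```python
# Sketch: splitting a skin-conductance (EDA) trace into a slow tonic level and
# fast phasic fluctuations, then counting responses above a small threshold.
import numpy as np
from scipy.ndimage import uniform_filter1d
from scipy.signal import find_peaks

def phasic_responses(eda, fs, window_s=10.0, min_amplitude=0.02):
    tonic = uniform_filter1d(eda, size=int(window_s * fs))  # slow moving average
    phasic = eda - tonic                                    # fast fluctuations
    peaks, _ = find_peaks(phasic, height=min_amplitude)     # candidate skin conductance responses
    return tonic, phasic, peaks

# tonic, phasic, scr_peaks = phasic_responses(eda_signal, fs=4)  # hypothetical 4 Hz sensor data
```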
Computer scientists at Penn State treat the challenge of automatically inferring the aesthetic quality of pictures using their visual content as a machine learning problem, with a peer-rated on-line photo sharing website as a data source.
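A minimal sketch of that framing, assuming photo files paired with average peer ratings; the hand-picked features and the regressor are illustrative stand-ins, not the researchers' actual method:

```python
# Sketch: treating aesthetic quality prediction as supervised regression from
# simple global image features to average peer ratings.
import numpy as np
from PIL import Image
from sklearn.ensemble import RandomForestRegressor

def global_features(path):
    img = np.asarray(Image.open(path).convert("RGB").resize((128, 128))) / 255.0
    mean_rgb = img.mean(axis=(0, 1))   # crude colour summary
    contrast = img.std()               # overall contrast proxy
    return np.concatenate([mean_rgb, [contrast]])

# paths, ratings = ...                 # hypothetical photo files and peer scores
# X = np.stack([global_features(p) for p in paths])
# model = RandomForestRegressor().fit(X, ratings)
```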
In education, teachers can use such analysis to understand how well students are learning and absorbing material, and then formulate appropriate teaching plans.
This is especially relevant in distance education, where the separation in time and space removes the emotional incentive for two-way communication between teachers and students.
For example, a car can monitor the emotions of all occupants and engage additional safety measures, such as alerting other vehicles if it detects that the driver is angry.
[53] A particularly simple form of biofeedback is available through gamepads that measure the pressure with which a button is pressed: this has been shown to correlate strongly with the players' level of arousal;[54] at the other end of the scale are brain–computer interfaces.
[59] One idea put forth by the Romanian researcher Dr. Nicu Sebe in an interview is the analysis of a person's face while they are using a certain product (he mentioned ice cream as an example).
One could also use affective state recognition to judge the impact of a TV advertisement by recording the viewer on video in real time and subsequently studying his or her facial expression.
By averaging the results obtained from a large group of subjects, one can tell whether the commercial (or movie) has the desired effect and which elements interest the viewer most.
[4] In contrast, the interactional approach seeks to help "people to understand and experience their own emotions"[62] and to improve computer-mediated interpersonal communication.