Psychologists consider intrinsic motivation in humans to be the drive to perform an activity for its inherent satisfaction, just for the fun or challenge of it [6]. In contrast to the idea of optimal incongruity, Deci and Ryan identified in the mid-1980s an intrinsic motivation based on competence and self-determination.
Intrinsic motivation, in contrast to purely extrinsic reward, encourages the agent to first explore aspects of the environment that confer more information, that is, to seek out novelty.
Recent work unifying state-visit-count exploration and intrinsic motivation has shown faster learning in a video game setting [16].
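As a rough illustration of how a visit-count term can be folded into the reward signal, the following minimal Python sketch adds a count-based bonus to the extrinsic reward so that rarely visited states are explored first; the bonus weight beta and the 1/sqrt(N) form are illustrative assumptions, not the specific formulation of the cited work.

```python
# Illustrative sketch (not the exact method of [16]): a count-based
# exploration bonus added to the extrinsic reward, so that rarely
# visited states yield a larger combined reward and are explored first.
from collections import defaultdict
from math import sqrt

class CountBonus:
    def __init__(self, beta=0.1):
        self.beta = beta                # weight of the intrinsic term (assumed)
        self.counts = defaultdict(int)  # N(s): visits per (hashable) state

    def reward(self, state, extrinsic):
        self.counts[state] += 1
        bonus = self.beta / sqrt(self.counts[state])  # decays as N(s) grows
        return extrinsic + bonus
```

The bonus vanishes as a state's visit count grows, so the combined signal gradually reverts to the extrinsic reward once novelty is exhausted.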
The quantification of prediction and novelty to drive behaviour generally relies on information-theoretic models, in which the agent's state and strategy (policy) over time are represented by probability distributions describing a Markov decision process, and the cycle of perception and action is treated as an information channel.
The main criticism and difficulty of these models is the intractability of computing probability distributions over large discrete or continuous state spaces [33].
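To make the tractability issue concrete, here is a minimal tabular sketch in Python of one common information-theoretic signal, the surprisal -log p(s'|s,a) of each observed transition; the Laplace smoothing and the assumption of a known, finite state space are illustrative choices rather than details taken from the works cited above.

```python
# Minimal sketch (assumptions: tabular MDP with a known, finite state space;
# surprisal-style intrinsic reward): the agent keeps an empirical transition
# model p(s' | s, a) and is rewarded with the information content
# -log p(s' | s, a) of each observed transition.
from collections import defaultdict
from math import log

class SurpriseReward:
    def __init__(self, num_states):
        self.num_states = num_states  # assumed known and finite
        # counts[(s, a)][s'] -- one distribution per (state, action) pair;
        # the table grows with |S|^2 * |A|, which is exactly the tractability
        # problem for large discrete or continuous state spaces.
        self.counts = defaultdict(lambda: defaultdict(int))

    def reward(self, s, a, s_next):
        dist = self.counts[(s, a)]
        total = sum(dist.values())
        # Laplace-smoothed estimate of p(s' | s, a) before the update
        p = (dist[s_next] + 1) / (total + self.num_states)
        dist[s_next] += 1
        return -log(p)  # rare (novel) transitions carry more information
```

The explicit enumeration of states in this table is what breaks down when the state space is very large or continuous, motivating approximate density or representation-learning approaches.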
Despite the impressive success of deep learning in specific domains (e.g. AlphaGo), many in the field (e.g. Gary Marcus) have pointed out that the ability to generalise remains a fundamental challenge in artificial intelligence.
Intrinsically motivated learning, although promising in its ability to generate goals from the structure of the environment without externally imposed tasks, faces the same challenge of generalisation: how to reuse policies or action sequences, how to compress and represent continuous or complex state spaces, and how to retain and reuse the salient features that have been learnt.