The question does not have an obvious answer because it poses both theoretical and observational challenges similar to those that had arisen in constructing the foundations of quantum physics.
The lambda calculus of Alonzo Church can be viewed as the earliest message passing programming language (see Hewitt, Bishop, and Steiger 1973; Abelson and Sussman 1985).
For example, the lambda expression below implements a tree data structure when supplied with parameters for a leftSubTree and rightSubTree.
Simula 67 pioneered using message passing for computation, motivated by discrete event simulation applications.
Alan Kay was influenced by message passing in the pattern-directed invocation of Planner in developing Smalltalk-71.
In 1972 Kay visited MIT and discussed some of his ideas for Smalltalk-72 building on the Logo work of Seymour Papert and the "little person" model of computation used for teaching children to program.
Also, although the system was bootstrapped on itself, the language constructs were not formally defined as objects that respond to Eval messages (see discussion below).
Subsequent versions of the Smalltalk language largely followed the path of using the virtual methods of Simula in the message-passing structure of programs.
The authors of Simula had considered making such primitives into objects but refrained largely for efficiency reasons.
The Smalltalk system went on to become very influential, innovating in bitmap displays, personal computing, the class browser interface, and many other ways.
[1] Meanwhile, the Actor efforts at MIT remained focused on developing the science and engineering of higher level concurrency.
(See the paper by Jean-Pierre Briot for ideas that were developed later on how to incorporate some kinds of Actor concurrency into later versions of Smalltalk.)
Despite these apparent difficulties, Petri nets continue to be a popular approach to modelling concurrency, and are still the subject of active research.
Prior to the Actor model, concurrency was defined in low-level machine terms of threads, locks and buffers(channels).
Gerald Sussman and Guy Steele then took an interest in Actors and published a paper on their Scheme interpreter in which they concluded "we discovered that the 'actors' and the lambda expressions were identical in implementation."
Russ Atkinson and Carl Hewitt published a paper on specification and proof techniques for serializers providing an efficient solution to encapsulating shared resources for concurrency control.
Finally eight years after the first Actor publication, Will Clinger (building on the work of Irene Greif 1975, Gordon Plotkin 1976, Michael Smyth 1978, Henry Baker 1978, Francez, Hoare, Lehmann, and de Roever 1979, and Milne and Milnor 1979) published the first satisfactory mathematical denotational model incorporating unbounded nondeterminism using domain theory in his dissertation in 1981 (see Clinger's model).
Subsequently, Hewitt [2006] augmented the diagrams with arrival times to construct a technically simpler denotational model that is easier to understand.