From the point of view of light transport physics, however, this model is inaccurate, because a pixel on the sensor plane has non-zero area.
Note that this approach can also represent a lens-based camera, and thus depth-of-field effects, by using a cone whose cross-section decreases from the lens size to zero at the focal plane and then increases again beyond it.
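The shape of such a lens cone can be sketched as a function of distance. The following is a minimal illustration, assuming an idealized thin lens and a linear taper; the function name `cone_radius` and its parameters are hypothetical and chosen only for this example.

```python
def cone_radius(t, lens_radius, focus_distance):
    """Radius of the cone's cross-section at distance t from the lens.

    Assumption: idealized thin lens with a linear taper. The radius equals
    lens_radius at the lens (t = 0), shrinks to zero at the focal plane
    (t = focus_distance), and grows again beyond it.
    """
    if focus_distance <= 0:
        raise ValueError("focus_distance must be positive")
    return lens_radius * abs(1.0 - t / focus_distance)


# Example: a lens of radius 0.5 focused at distance 10.
radii = [cone_radius(t, 0.5, 10.0) for t in (0.0, 5.0, 10.0, 20.0)]
```

Objects near the focal plane are covered by a narrow cross-section and appear sharp, while objects far in front of or behind it are covered by a wide cross-section and appear blurred.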
From a signal processing point of view, ignoring the point spread function and approximating the integral of radiance with a single central sample (a ray with no thickness) can lead to strong aliasing, because the "projected geometric signal" contains very high frequencies that exceed the Nyquist frequency, the highest frequency that the uniform pixel sampling rate can represent.
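The effect can be illustrated in one dimension. The sketch below is an assumption-laden toy, not a renderer: the "scene" is a cosine stripe pattern at 0.9 cycles per pixel (above the Nyquist limit of 0.5), and integration over the pixel is approximated by a plain average of sub-samples, used here only as a simple stand-in for the radiance integral. All names are hypothetical.

```python
import math


def signal(x, freq):
    """Toy 1D scene: stripes with `freq` cycles per pixel, values in [0, 1]."""
    return 0.5 + 0.5 * math.cos(2.0 * math.pi * freq * x)


def point_sample(i, freq):
    """One sample at the centre of pixel i (a ray with no thickness)."""
    return signal(i + 0.5, freq)


def area_sample(i, freq, n=64):
    """Average of n evenly spaced sub-samples across pixel i."""
    return sum(signal(i + (k + 0.5) / n, freq) for k in range(n)) / n


FREQ = 0.9  # cycles per pixel, above the Nyquist limit of 0.5
pixels = range(20)
point = [point_sample(i, FREQ) for i in pixels]
area = [area_sample(i, FREQ) for i in pixels]

# Contrast (max - min) of the pattern that ends up in the image.
point_contrast = max(point) - min(point)
area_contrast = max(area) - min(area)
```

With central samples, the stripes reappear as a spurious pattern at 0.1 cycles per pixel with almost full contrast. Averaging over the pixel does not remove the alias, but it attenuates it strongly.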
[2] Conversely, the ideal sinc function is not practical: it has infinite support and takes negative values, which often creates ringing artifacts due to the Gibbs phenomenon.
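Both properties are easy to check numerically. The following is a small sketch using the normalized sinc, sin(πx)/(πx); the function name `sinc` is defined here for illustration.

```python
import math


def sinc(x):
    """Normalized sinc: sin(pi * x) / (pi * x), with sinc(0) = 1."""
    if x == 0.0:
        return 1.0
    return math.sin(math.pi * x) / (math.pi * x)


# Negative lobe: the weight 1.5 pixels from the centre is below zero.
first_negative_lobe = sinc(1.5)

# Slow decay: 100.5 pixels away the weight is small but still not zero,
# so the filter never has a finite footprint.
far_weight = sinc(100.5)
```

Negative weights mean that a bright feature can darken neighbouring pixels, which is what appears as ringing around sharp edges.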
[3] The early Cone and Beam papers rely on different simplifications: the first considers a circular cross-section and treats the intersection with various possible shapes.