Similar to other computed tomography methods, the sample is imaged at multiple view angles, and an inverse reconstruction algorithm based on the detection geometry (typically universal backprojection,[3] modified delay-and-sum,[4] or time reversal[5][6]) is then applied to recover the initial pressure distribution within the tissue.
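As a minimal illustration of the delay-and-sum idea, the sketch below backprojects recorded pressure traces onto an image grid in Python/NumPy. The ring geometry, sampling rate, sound speed, and array shapes are assumptions chosen for the example, not a standard PACT interface.

```python
import numpy as np

def delay_and_sum(sinogram, sensor_xy, grid_xy, fs, c=1500.0):
    """Backproject pressure traces onto an image grid (illustrative sketch).

    sinogram : (n_sensors, n_samples) recorded pressure traces
    sensor_xy: (n_sensors, 2) sensor coordinates in metres
    grid_xy  : (n_pixels, 2) pixel coordinates in metres
    fs       : sampling rate in Hz; c : assumed speed of sound in m/s
    """
    n_sensors, n_samples = sinogram.shape
    image = np.zeros(len(grid_xy))
    for s in range(n_sensors):
        # Time of flight from every pixel to this sensor, as sample indices.
        dist = np.linalg.norm(grid_xy - sensor_xy[s], axis=1)
        idx = np.clip(np.round(dist / c * fs).astype(int), 0, n_samples - 1)
        image += sinogram[s, idx]  # sum each trace at its geometric delay
    return image / n_sensors

# Toy usage with random data (only the shapes are meaningful):
angles = np.linspace(0, 2 * np.pi, 64, endpoint=False)
sensors = 0.05 * np.stack([np.cos(angles), np.sin(angles)], axis=1)  # 5 cm ring
xs = np.linspace(-0.02, 0.02, 64)
grid = np.stack(np.meshgrid(xs, xs), axis=-1).reshape(-1, 2)
rng = np.random.default_rng(0)
image = delay_and_sum(rng.standard_normal((64, 2000)), sensors, grid, fs=40e6)
```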
After this initial implementation, the applications of deep learning in PACT have branched out primarily into removing artifacts caused by acoustic reflections,[9] sparse sampling,[10][11][12] limited view,[13][14][15] and limited bandwidth.
In Reiter et al.,[8] a convolutional neural network (similar to a simple VGG-16-style architecture[21]) was used that took pre-beamformed photoacoustic data as input and output a classification result specifying the 2-D point-source location.
The results of the network show improvements over standard delay-and-sum and frequency-domain beamforming algorithms, and Johnstonbaugh[19] proposes that this technology could be used for optical wavefront shaping, circulating melanoma cell detection, and real-time vascular surgeries.
A convolutional neural network (CNN) is then trained to remove the artifacts and produce an artifact-free representation of the ground-truth initial pressure distribution.
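A minimal sketch of this supervised setup is shown below in PyTorch, assuming paired tensors of corrupted reconstructions and clean targets; the small three-layer CNN and the random stand-in data are illustrative assumptions, not the architecture of any cited work.

```python
import torch
import torch.nn as nn

# Illustrative three-layer CNN; published studies typically use deeper
# U-Net-style models for this task.
net = nn.Sequential(
    nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 1, 3, padding=1),
)
optimizer = torch.optim.Adam(net.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

# Random stand-ins for (artifact-laden reconstruction, ground-truth) pairs.
corrupted = torch.randn(4, 1, 128, 128)
ground_truth = torch.randn(4, 1, 128, 128)

for step in range(50):
    optimizer.zero_grad()
    loss = loss_fn(net(corrupted), ground_truth)  # pixel-wise regression to the clean image
    loss.backward()
    optimizer.step()
```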
Sparse sampling is typically used to keep production costs low and to improve image acquisition speed.[17]
When part of the solid angle around the sample is not captured, generally due to geometric limitations, the image acquisition is said to have a limited view.[24]
As illustrated by the experiments of Davoudi et al.,[12] limited-view corruptions can be directly observed as missing information in the frequency domain of the reconstructed image.
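This effect can be emulated in a few lines of NumPy, sketched below under simplifying assumptions (a disc phantom and an arbitrarily chosen unseen angular wedge); it is an illustration of the missing-frequency interpretation, not a reproduction of the cited experiments.

```python
import numpy as np

# Emulate limited view by zeroing an angular wedge of the 2-D spectrum.
n = 256
y, x = np.mgrid[-1:1:n * 1j, -1:1:n * 1j]
phantom = ((x ** 2 + y ** 2) < 0.3 ** 2).astype(float)  # simple disc phantom

spectrum = np.fft.fftshift(np.fft.fft2(phantom))
unseen = np.abs(np.arctan2(y, x)) > np.deg2rad(120)  # assumed unobserved directions
spectrum[unseen] = 0  # missing information in the frequency domain
corrupted = np.real(np.fft.ifft2(np.fft.ifftshift(spectrum)))  # streaked image
```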
Prior to deep learning, the limited-view problem was addressed with complex hardware such as acoustic deflectors[25] and full ring-shaped transducer arrays,[12][26] as well as computational solutions such as compressed sensing,[27][28][29][30][31] weighting factors,[32] and iterative filtered backprojection.[14]
Guan et al.[36] were able to apply an FD U-Net to remove artifacts from simulated limited-view reconstructed PA images.
The network was able to remove artifacts created in the time-reversal process from synthetic, mouse brain, fundus, and lung vasculature phantoms.[36]
This pixel-wise interpolation method was significantly faster than the computationally intensive iterative approach and achieved comparable PSNR and SSIM.[15][16]
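Both figures of merit are standard image-quality metrics and can be computed, for example, with scikit-image; the arrays below are random stand-ins for a reference image and a restored one.

```python
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

rng = np.random.default_rng(0)
reference = rng.random((128, 128))  # stand-in for a ground-truth image in [0, 1]
restored = np.clip(reference + 0.05 * rng.standard_normal((128, 128)), 0, 1)

psnr = peak_signal_noise_ratio(reference, restored, data_range=1.0)
ssim = structural_similarity(reference, restored, data_range=1.0)
print(f"PSNR = {psnr:.2f} dB, SSIM = {ssim:.3f}")
```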
Before deep learning, the typical method for removing artifacts from and denoising limited-bandwidth reconstructions was Wiener filtering, which helps to expand the PA signal's frequency spectrum.
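A common form of this idea is Wiener deconvolution against the transducer's band-limited response, sketched below with a made-up impulse response `h` and a constant noise-to-signal regulariser `k`; both are assumptions for the example.

```python
import numpy as np

def wiener_deconvolve(measured, h, k=1e-2):
    H = np.fft.fft(h, len(measured))         # band-limited system response
    G = np.conj(H) / (np.abs(H) ** 2 + k)    # Wiener inverse filter
    return np.real(np.fft.ifft(G * np.fft.fft(measured)))

rng = np.random.default_rng(0)
t = np.arange(512)
h = np.exp(-((t - 16) / 4.0) ** 2) * np.cos(0.5 * t)  # made-up impulse response
source = (np.abs(t - 256) < 2).astype(float)          # broadband source term
# Circular convolution keeps the sketch exactly invertible up to noise.
measured = np.real(np.fft.ifft(np.fft.fft(source) * np.fft.fft(h, 512)))
measured += 0.01 * rng.standard_normal(512)
restored = wiener_deconvolve(measured, h)  # broader spectrum than `measured`
```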
The trained network was able to increase the peak-to-background ratio by 4.19 dB and the penetration depth by 5.88% for images of an in vivo sheep brain acquired with a low-energy laser.
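The peak-to-background ratio is typically the ratio of the image peak to the mean background level, expressed in decibels; a sketch follows, in which the choice of background region and the amplitude (20 log10) convention are assumptions that vary between studies.

```python
import numpy as np

def peak_to_background_db(image, background_mask):
    # Amplitude convention (20 log10); power-based studies use 10 log10.
    peak = image.max()
    background = np.abs(image[background_mask]).mean()
    return 20.0 * np.log10(peak / background)

rng = np.random.default_rng(0)
img = 0.05 * np.abs(rng.standard_normal((128, 128)))
img[64, 64] = 1.0                          # bright vessel-like peak
mask = np.ones_like(img, dtype=bool)
mask[60:69, 60:69] = False                 # exclude the peak neighbourhood
print(f"PBR = {peak_to_background_db(img, mask):.2f} dB")
```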
The two primary motion artifact types addressed by deep learning in PAM are displacements in the vertical and tilted directions.[37]
Frequency-domain PAM is a powerful, cost-efficient imaging method that uses intensity-modulated laser beams from continuous-wave sources to excite single-frequency PA signals.[38]
Nevertheless, this approach generally yields signal-to-noise ratios (SNRs) that can be up to two orders of magnitude lower than those of conventional time-domain systems.[39]
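Conceptually, the single-frequency response is recovered by demodulating the detected trace at the modulation frequency, as in the lock-in-style sketch below; the sampling rate, modulation frequency, and stand-in trace are assumptions.

```python
import numpy as np

fs = 50e6     # assumed sampling rate, Hz
f_mod = 10e6  # assumed intensity-modulation frequency, Hz
t = np.arange(4096) / fs

# Stand-in detected trace: a weak single-frequency PA response plus noise.
rng = np.random.default_rng(0)
trace = 0.2 * np.cos(2 * np.pi * f_mod * t + 0.7)
trace += 0.05 * rng.standard_normal(t.size)

ref = np.exp(-2j * np.pi * f_mod * t)      # complex reference at f_mod
z = 2.0 * np.mean(trace * ref)             # lock-in style projection
amplitude, phase = np.abs(z), np.angle(z)  # recovers ~0.2 and ~0.7 rad
```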
To overcome the inherent SNR limitation of frequency-domain PAM, a U-Net neural network has been used to enhance the generated images without the need for excessive averaging or the application of high optical power to the sample.
In this context, PAM becomes more accessible, as the system's cost is dramatically reduced while image quality remains sufficiently high for demanding biological observations.
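For reference, a deliberately reduced U-Net-style network is sketched below in PyTorch: one encoder level, one decoder level, and a single skip connection. Published networks are deeper; the layer sizes here are assumptions chosen only to keep the example short.

```python
import torch
import torch.nn as nn

def block(c_in, c_out):
    # Two 3x3 convolutions with ReLU, the basic U-Net building block.
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, 3, padding=1), nn.ReLU(),
        nn.Conv2d(c_out, c_out, 3, padding=1), nn.ReLU(),
    )

class TinyUNet(nn.Module):
    # One-level encoder-decoder with a skip connection; real U-Nets
    # typically use four or five resolution levels.
    def __init__(self):
        super().__init__()
        self.enc = block(1, 32)
        self.down = nn.MaxPool2d(2)
        self.mid = block(32, 64)
        self.up = nn.ConvTranspose2d(64, 32, 2, stride=2)
        self.dec = block(64, 32)
        self.out = nn.Conv2d(32, 1, 1)

    def forward(self, x):
        e = self.enc(x)
        m = self.mid(self.down(e))
        u = self.up(m)
        # Skip connection: concatenate encoder features with upsampled ones.
        return self.out(self.dec(torch.cat([u, e], dim=1)))

net = TinyUNet()
noisy = torch.randn(1, 1, 64, 64)  # stand-in for a low-SNR FD-PAM image
enhanced = net(noisy)              # same spatial size as the input
```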