The WAIC penalty is then the variance of the pointwise log-likelihood across these posterior samples, computed for each datapoint and summed over the whole dataset.
This terminology stems from historical convention, as a similar penalty term appears in the Akaike information criterion.
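The penalty described above can be sketched numerically. The snippet below is a minimal illustration, assuming the pointwise log-likelihoods have already been evaluated and stored as an (S, N) matrix with one row per posterior draw and one column per datapoint; the function name `waic` and the deviance-scale (−2) convention are choices for this sketch, not a fixed standard.

```python
import numpy as np

def waic(log_lik):
    """WAIC from an (S, N) matrix of pointwise log-likelihoods
    log p(y_i | theta_s), one row per posterior draw."""
    # Log pointwise predictive density: log of the posterior-mean likelihood,
    # summed over datapoints.
    lppd = np.sum(np.log(np.mean(np.exp(log_lik), axis=0)))
    # Penalty: variance of the log-likelihood across draws, summed over datapoints.
    p_waic = np.sum(np.var(log_lik, axis=0, ddof=1))
    # Report on the deviance scale; lower is better.
    return -2.0 * (lppd - p_waic)

# Hypothetical example: 200 posterior draws, 30 datapoints.
rng = np.random.default_rng(1)
log_lik = rng.normal(-1.0, 0.1, size=(200, 30))
print(waic(log_lik))
```

In practice the matrix of pointwise log-likelihoods would come from an MCMC fit rather than being simulated as here.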
[3] In practice, Watanabe recommends calculating both WAIC and PSIS (Pareto smoothed importance sampling).
[3][4] Some Bayesian statistics textbooks recommend WAIC over other information criteria, especially for multilevel and mixture models.
[6] WBIC is the average log-likelihood function over the posterior distribution with inverse temperature β = 1/log n, where n is the sample size.
[6] Both WAIC and WBIC can be computed numerically without any knowledge of the true distribution.