In information theory, given an unknown stationary source π with alphabet A and a sample w from π, the Krichevsky–Trofimov (KT) estimator produces an estimate $p_i(w)$ of the probability of each symbol i ∈ A.
This estimator is optimal in the sense that it minimizes the worst-case regret asymptotically.
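For a binary alphabet and a string w with m zeroes and n ones, the KT estimator $p_i(w)$ is defined as:[1]

$$p_0(w) = \frac{m + 1/2}{m + n + 1}, \qquad p_1(w) = \frac{n + 1/2}{m + n + 1}.$$

This corresponds to the posterior mean of a Beta-Bernoulli posterior distribution with a Beta(1/2, 1/2) prior.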
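As a minimal sketch of the binary case, the estimate can be computed directly from the two counts. The function name `kt_binary` is hypothetical and chosen only for illustration; exact fractions are used so the "add one half" structure stays visible.

```python
from fractions import Fraction


def kt_binary(m, n):
    """Return (p0, p1), the KT estimates after observing m zeroes and n ones.

    Hypothetical helper for illustration: each count is increased by 1/2
    and divided by the total count plus 1.
    """
    half = Fraction(1, 2)
    total = m + n + 1
    return (m + half) / total, (n + half) / total


# With no observations, both symbols are estimated as equally likely.
p0, p1 = kt_binary(0, 0)

# After the string "001" (two zeroes, one one):
q0, q1 = kt_binary(2, 1)
```

For the string with two zeroes and one one, this gives 5/8 for a zero and 3/8 for a one, and the two estimates always sum to 1.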
For the general case the estimate is made using a Dirichlet-Categorical distribution.
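For an alphabet of size k, the analogous estimate is the posterior mean under a symmetric Dirichlet prior with all parameters equal to 1/2, that is, (count of symbol i + 1/2) divided by (sample length + k/2). The sketch below illustrates this under the assumption that every symbol in the sample belongs to the given alphabet; the name `kt_estimate` is hypothetical.

```python
from collections import Counter
from fractions import Fraction


def kt_estimate(sample, alphabet):
    """Return a dict mapping each symbol of `alphabet` to its KT estimate.

    Hypothetical helper for illustration. Assumes every symbol in `sample`
    is a member of `alphabet`; symbols outside it are rejected.
    """
    symbols = list(alphabet)
    counts = Counter(sample)
    unknown = set(counts) - set(symbols)
    if unknown:
        raise ValueError(f"symbols not in alphabet: {sorted(unknown)}")

    half = Fraction(1, 2)
    # Denominator: sample length plus half the alphabet size.
    denom = len(sample) + half * len(symbols)
    return {a: (counts[a] + half) / denom for a in symbols}


# Three-symbol alphabet, sample "aab": the unseen symbol "c" still
# receives a nonzero probability.
estimates = kt_estimate("aab", "abc")
```

With a two-symbol alphabet the denominator becomes the sample length plus 1, so the function reduces to the binary formula.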