It is a statistical modeling approach that uses similarity and error-based measures as descriptors in addition to the usual structural and physicochemical descriptors, and it has been shown to enhance the external predictivity of QSAR/QSPR models.
This approach utilizes similarity-based considerations yet can generate simple, interpretable, and transferable models.
This approach may be used with any type of structural and physicochemical descriptors and with any modeling algorithm.[2][3][4][5]
Among the different RASAR descriptors, the RA function, Average Similarity, and gm (the Banerjee-Roy concordance coefficient) have shown high importance in modeling in some studies.[5]
In 2023, the Banerjee-Roy similarity coefficients sm1 and sm2 were also proposed to identify potential activity cliffs in a data set.
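As a rough illustration of how similarity-based measures can be appended to ordinary descriptors, the following Python sketch computes three such quantities for each query compound from its k most similar source compounds: a similarity-weighted read-across prediction (standing in for the RA function), the average similarity to those neighbours, and a weighted standard deviation of their responses. The function name `rasar_descriptors`, the Gaussian kernel similarity, the parameters `k` and `sigma`, and the toy data are all assumptions made for this example; they are not taken from the cited studies, and the Banerjee-Roy coefficients gm, sm1 and sm2 are not implemented here.

```python
import numpy as np


def rasar_descriptors(X_source, y_source, X_query, k=3, sigma=1.0, exclude_self=False):
    """Illustrative similarity-based descriptors (hypothetical helper).

    Returns one row per query compound with three columns:
    weighted read-across prediction, average similarity, weighted SD of
    the neighbours' responses.
    """
    X_source = np.asarray(X_source, dtype=float)
    y_source = np.asarray(y_source, dtype=float)
    X_query = np.asarray(X_query, dtype=float)

    # Pairwise squared Euclidean distances, shape (n_query, n_source).
    diff = X_query[:, None, :] - X_source[None, :, :]
    d2 = np.sum(diff ** 2, axis=2)

    # Gaussian kernel similarity in (0, 1]; the kernel choice is an assumption.
    sim = np.exp(-d2 / (2.0 * sigma ** 2))

    if exclude_self:
        # Assumes X_query is the source set itself (same row order), so a
        # compound is never counted as its own neighbour.
        np.fill_diagonal(sim, -np.inf)

    # Indices and similarities of the k most similar source compounds.
    idx = np.argsort(-sim, axis=1)[:, :k]
    top_sim = np.take_along_axis(sim, idx, axis=1)
    top_y = y_source[idx]

    # Similarity-normalised weights for the neighbours.
    w = top_sim / top_sim.sum(axis=1, keepdims=True)

    ra = np.sum(w * top_y, axis=1)                       # weighted prediction
    avg_sim = top_sim.mean(axis=1)                       # average similarity
    sd = np.sqrt(np.sum(w * (top_y - ra[:, None]) ** 2, axis=1))  # weighted SD

    return np.column_stack([ra, avg_sim, sd])


# Toy data, made up purely for illustration.
X_train = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [5.0, 5.0]])
y_train = np.array([1.0, 2.0, 3.0, 10.0])
X_test = np.array([[0.0, 0.0]])

# Training compounds use leave-one-out neighbours; test compounds use all
# training compounds as the source set.
train_desc = rasar_descriptors(X_train, y_train, X_train, k=2, exclude_self=True)
test_desc = rasar_descriptors(X_train, y_train, X_test, k=1)

# Append the similarity-based columns to the original descriptor matrices.
X_train_aug = np.hstack([X_train, train_desc])
X_test_aug = np.hstack([X_test, test_desc])
```

The augmented matrices `X_train_aug` and `X_test_aug` could then be passed to any regression or classification algorithm, which reflects the point made above that the approach is not tied to a particular descriptor type or modeling method.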