Combinations of graph invariants and attributes of simplified molecular input-line entry system (SMILES) to build up models for sweetness

Achary, P.G.R., Toropova, A.P., Toropov, A.A.
Food research international 2019 v.122 pp. 40-46
computer software, data collection, models, quantitative structure-activity relationships, sweetness
The quantitative structure-activity relationships (QSARs) for sweetness value (log S) were built with a dataset of 315 molecules, following the novel criterion of the ‘Index of Ideality of Correlation’ (IIC). This criterion is available in the latest version of the CORAL software ( The descriptor used to build the model for log S is a hybrid optimal descriptor obtained by combining two descriptors: (i) a molecular-graph-based descriptor derived from correlation weights of molecular features and (ii) a descriptor derived from the simplified molecular input-line entry system (SMILES) code of the sweetener molecule. The data set of 315 molecules was divided into four random splits. The four QSAR models built for log S using the IIC criterion were compared with four similar models built with the “traditional protocol” described elsewhere. The comparison revealed that the models built using the IIC had better statistical performance.
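
The abstract names the IIC without defining it. As a rough illustration only, the Python sketch below assumes the form in which the index is commonly stated in the CORAL literature: the correlation coefficient between observed and predicted values, multiplied by the ratio of the smaller to the larger of two mean absolute errors, one taken over negative residuals and one over non-negative residuals. The function names `pearson_r` and `iic`, the sign convention for residuals (observed minus predicted), and the handling of one-sided residuals are assumptions made for this sketch and are not taken from the paper or from the CORAL software.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def iic(observed, predicted):
    """Index of Ideality of Correlation (assumed form, see lead-in).

    IIC = r * min(MAE_neg, MAE_pos) / max(MAE_neg, MAE_pos), where
    MAE_neg averages |residual| over residuals < 0 and
    MAE_pos averages |residual| over residuals >= 0.
    """
    residuals = [o - p for o, p in zip(observed, predicted)]
    negative = [abs(d) for d in residuals if d < 0]
    non_negative = [abs(d) for d in residuals if d >= 0]
    if not negative or not non_negative:
        # Assumption of this sketch: the index is left undefined when all
        # residuals fall on one side, so we refuse to return a value.
        raise ValueError("IIC needs both negative and non-negative residuals")
    mae_neg = sum(negative) / len(negative)
    mae_pos = sum(non_negative) / len(non_negative)
    return pearson_r(observed, predicted) * min(mae_neg, mae_pos) / max(mae_neg, mae_pos)


if __name__ == "__main__":
    # Made-up log S values, for illustration only.
    obs = [1.0, 2.0, 3.0, 4.0]
    pred = [0.9, 2.3, 2.9, 4.3]
    print(round(iic(obs, pred), 4))
```

Under this assumed definition, a model whose over-predictions and under-predictions are of similar average size keeps an IIC close to its correlation coefficient, while a model with lopsided errors is penalized even when the correlation is high.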