Reactive oxygen species (ROS) play many roles in the body, such as cell signaling, homeostasis, or protection from harmful bacteria. However, an excess of ROS in the body will damage lipids, proteins, and DNA. Many studies have shown that various environmental factors increase the amount of ROS produced in the body. Antioxidant proteins are responsible for neutralizing these ROS or free radicals. Although the amount of data on protein sequences has increased over the last two decades, we still lack bioinformatics tools to be able to accurately identify antioxidant protein sequences. Furthermore, biochemical methods to determine antioxidant proteins are very expensive and time-consuming. Therefore, a machine learning approach must be used to speed up the computation. In this study, we propose a new method that combines a convolutional neural network and Random Forest using two features, the normalized PSSM and the best-selected feature of the ProtBert output. Our model gave very good results on the independent test dataset with 97.3% sensitivity and 95.9% specificity. Comparison with current state-of-the-art models shows that our model is superior.

Click here for Demo