An influent responsive control strategy with machine learning: Q-learning based optimization method for a biological phosphorus removal system

Pang, Ji-Wei, Yang, Shan-Shan, He, Lei, Chen, Yi-Di, Cao, Guang-Li, Zhao, Lei, Wang, Xin-Yu, Ren, Nan-Qi
Chemosphere 2019 v.234 pp. 893-901
activated sludge, algorithms, artificial intelligence, chemical oxygen demand, models, phosphorus, system optimization, wastewater, wastewater treatment
Biological phosphorus removal (BPR) is an economical and sustainable process for removing phosphorus (P) from wastewater, achieved by recirculating activated sludge through anaerobic and aerobic (An/Ae) processes. However, few studies have systematically analyzed the optimal hydraulic retention times (HRTs) in the anaerobic and aerobic reactions, or whether these are the most appropriate control strategies. In this study, a novel optimization methodology using an improved Q-learning (QL) algorithm was developed to optimize the An/Ae HRTs in a BPR system. A framework for QL-based BPR control strategies was established and the improved Q function, $Q_{t+1}(s_t, s_{t+1}) = Q_t(s_t, s_{t+1}) + k \cdot [R(s_t, s_{t+1}) + \gamma \cdot \max_{a_t} Q_t(s_t, s_{t+1}) - Q_t(s_t, s_{t+1})]$, was derived. Based on the improved Q function and the state transition matrices obtained under different HRT step-lengths, the optimum combinations of HRTs in the An/Ae processes of any BPR system could be obtained, in terms of the ordered pair combinations of the <current state-transition state>. Model verification was performed by applying six different influent chemical oxygen demand (COD) concentrations, varying from 150 to 600 mg L$^{-1}$, and influent P concentrations, varying from 12 to 30 mg L$^{-1}$. Superior and stable effluent qualities were observed with the optimal control strategies. This indicates that the proposed QL-based BPR model performed properly and that the derived Q functions successfully realized real-time modelling, with stable optimal control strategies under fluctuating influent loads during wastewater treatment.
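The update rule quoted in the abstract can be illustrated with a minimal tabular sketch. This is not the authors' implementation. It assumes that Q is indexed by the ordered pair (current state, transition state), that k acts as a learning rate and γ as a discount factor, and that the max term is taken over the transitions leaving the transition state, as in standard Q-learning; the abstract's notation does not settle that last point. The function name `q_update`, the three-state reward table, and the parameter values are hypothetical.

```python
def q_update(Q, R, s, s_next, k=0.5, gamma=0.8):
    """Apply one update to Q[s][s_next] in place and return the new value.

    Q and R are square lists of lists indexed as [current][transition].
    k (learning rate) and gamma (discount factor) are illustrative values.
    """
    # Assumption: best value reachable from the transition state,
    # following the standard Q-learning reading of the max term.
    best_next = max(Q[s_next])
    Q[s][s_next] += k * (R[s][s_next] + gamma * best_next - Q[s][s_next])
    return Q[s][s_next]


# Hypothetical 3-state example; the rewards are made up for illustration
# and do not come from the paper.
R = [[0.0, 1.0, 0.0],
     [0.0, 0.0, 2.0],
     [0.0, 0.0, 0.0]]
Q = [[0.0] * 3 for _ in range(3)]

first = q_update(Q, R, s=1, s_next=2)   # 0.5 * (2.0 + 0.8 * 0.0 - 0.0)
second = q_update(Q, R, s=0, s_next=1)  # 0.5 * (1.0 + 0.8 * 1.0 - 0.0)
```

In a BPR setting, each state would presumably correspond to one An/Ae HRT combination and the reward to some measure of effluent quality, but the abstract does not specify either, so the table above is only a placeholder.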