Ming-Sheng Ying, Yuan Feng, Sheng-Gang Ying. Optimal Policies for Quantum Markov Decision Processes[J]. Machine Intelligence Research, 2021, 18(3): 410-421. DOI: 10.1007/s11633-021-1278-z

Optimal Policies for Quantum Markov Decision Processes

  • The Markov decision process (MDP) offers a general framework for modelling sequential decision making where outcomes are random. In particular, it serves as a mathematical framework for reinforcement learning. This paper introduces an extension of the MDP, namely the quantum MDP (qMDP), that can serve as a mathematical model of decision making about quantum systems. We develop dynamic programming algorithms for policy evaluation and for finding optimal policies for qMDPs in the finite-horizon case. The results provide mathematical tools for applying reinforcement learning techniques to the quantum world.
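
For orientation, the finite-horizon dynamic programming mentioned in the abstract can be illustrated in the classical setting. The sketch below is backward induction for an ordinary (non-quantum) MDP and is not the qMDP algorithm of the paper. The toy model `P_TOY`, the function name `backward_induction`, and the reward convention (reward attached to each transition, no discounting) are all assumptions made only for this illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical encoding (not from the paper): P[s][a] is a list of
# (probability, next_state, reward) triples for taking action a in state s.
Transitions = Dict[int, Dict[str, List[Tuple[float, int, float]]]]


def backward_induction(P: Transitions, horizon: int):
    """Classical finite-horizon dynamic programming (undiscounted).

    Returns the optimal expected total reward per state over `horizon`
    steps, and a list of decision rules where policy[t][s] is the action
    chosen in state s at time step t.
    """
    states = list(P)
    value = {s: 0.0 for s in states}  # value with zero steps remaining
    policy: List[Dict[int, str]] = []
    for _ in range(horizon):
        new_value: Dict[int, float] = {}
        decision: Dict[int, str] = {}
        for s in states:
            best_action, best_q = None, float("-inf")
            for action, outcomes in P[s].items():
                # Expected immediate reward plus value of the successor state.
                q = sum(p * (r + value[s2]) for p, s2, r in outcomes)
                if q > best_q:
                    best_action, best_q = action, q
            new_value[s] = best_q
            decision[s] = best_action
        value = new_value
        # Rules are computed from the last step backwards, so prepend.
        policy.insert(0, decision)
    return value, policy


# Toy two-state model, invented purely for illustration.
P_TOY: Transitions = {
    0: {"stay": [(1.0, 0, 0.0)], "go": [(0.5, 1, 1.0), (0.5, 0, 0.0)]},
    1: {"stay": [(1.0, 1, 2.0)], "go": [(1.0, 0, 0.0)]},
}

value, policy = backward_induction(P_TOY, horizon=2)
print(value)   # optimal expected total reward from each start state
print(policy)  # one decision rule per time step
```

Each pass of the outer loop extends the horizon by one step, so the cost grows linearly in the horizon and in the number of (state, action, outcome) triples.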