Domestic Academic Electronic Journal System (Electronic Journal System of STPI)

Authors: 樊國楨; 張小鳳
Author affiliations: Industrial Technology Research Institute (工業技術研究院); Tamkang University (淡江大學)
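Abstract (translated from Chinese): A Markov decision process performs decision analysis on a sequential, discrete stochastic process over a finite state space and a finite decision space. This paper describes the properties of the discounted single-objective Markov decision process and of the single-objective completely ergodic process, together with their solution methods under the β-discounted optimality criterion and the average optimality criterion, respectively. It proves that the set of efficient stationary strategies is a connected set. It then extends the solution methods for the discounted single-objective Markov decision process and the single-objective completely ergodic process to the discounted vector-maximization Markov decision process and the vector-maximization completely ergodic process, yielding one method for finding the set of efficient stationary strategies under discounting and another for finding it under the average criterion. A worked example further illustrates the efficiency of these methods.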
English abstract: A Markovian decision process is used to make decisions for a sequential discrete stochastic process under a finite state space and decision space. This thesis describes the characteristics and algorithms of both the Discounted Single Objective Markovian Decision Process and the Single Objective Completely Ergodic Process. We also prove that the set of average-efficient pure stationary strategies and the set of β-efficient pure stationary strategies are both connected sets. We can therefore extend the Discounted Single Objective Markovian Decision Process to the Discounted Vector Maximum Markovian Decision Process, and the Single Objective Completely Ergodic Process to the Vector Maximum Completely Ergodic Process. Algorithms are developed for finding the β-efficient pure stationary strategies and the efficient pure stationary strategies under the average criterion. In the last part of the thesis, an example demonstrates the advantages of the algorithms.
Chinese keywords: --
English keywords: --
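
For readers unfamiliar with the β-discounted criterion mentioned in the abstract, the following is a minimal sketch of how a discounted single-objective Markov decision process with finite state and decision spaces can be solved. It uses value iteration, a standard textbook method. The abstract does not say which solution method the authors use, so this should not be read as their algorithm. The function name `value_iteration`, the array layout `P[a][s][t]` and `R[a][s]`, and the two-state example data are all assumptions made for illustration.

```python
def value_iteration(P, R, beta, tol=1e-10, max_iter=10_000):
    """Approximate the beta-discounted optimal values of a finite MDP.

    Assumed (hypothetical) layout, not taken from the source:
      P[a][s][t] = probability of moving from state s to state t under action a
      R[a][s]    = expected one-step reward in state s under action a
      beta       = discount factor, 0 <= beta < 1
    """
    n_actions = len(P)
    n_states = len(P[0])

    def q_values(v):
        # q[s][a] = one-step reward plus discounted expected future value
        return [
            [R[a][s] + beta * sum(P[a][s][t] * v[t] for t in range(n_states))
             for a in range(n_actions)]
            for s in range(n_states)
        ]

    v = [0.0] * n_states
    for _ in range(max_iter):
        new_v = [max(row) for row in q_values(v)]
        diff = max(abs(x - y) for x, y in zip(new_v, v))
        v = new_v
        if diff < tol:
            break

    # Greedy pure stationary strategy with respect to the final values
    q = q_values(v)
    policy = [max(range(n_actions), key=lambda a: q[s][a]) for s in range(n_states)]
    return v, policy


# Illustrative two-state example (made-up data):
# action 0 stays in place (reward 1 in state 0, reward 2 in state 1),
# action 1 switches to the other state with reward 0.
P = [
    [[1.0, 0.0], [0.0, 1.0]],  # action 0: stay
    [[0.0, 1.0], [1.0, 0.0]],  # action 1: switch
]
R = [
    [1.0, 2.0],  # action 0
    [0.0, 0.0],  # action 1
]
values, policy = value_iteration(P, R, beta=0.9)
```

In this example the greedy strategy switches out of state 0 and stays in state 1, because staying in state 1 forever is worth 2 / (1 - 0.9) = 20, while staying in state 0 forever is worth only 10. The vector-maximization setting in the abstract replaces the single reward with a vector of rewards and looks for efficient (non-dominated) strategies, which this single-objective sketch does not cover.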