Course summary

  • Type: MOOC course
  • Period: Always open
  • Learning time: Study freely
  • Course approval method: Automatic approval
  • Certificate: Issued online
http://kaist.edwith.org/reinforcement

Instructor Introduction

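  • Professor Hayong Shin (신하용), Department of Industrial and Systems Engineering, KAIST

    Instructor: Hayong Shin

    2001-present: Professor, Department of Industrial and Systems Engineering, KAIST
    1991-2001: Researcher at LG Electronics, Cubictek Co., Ltd. (㈜큐빅테크), and Chrysler (USA)
    Vice President (Journal), Korean Institute of Industrial Engineers; recipient of the Jeongheon Academic Grand Award (정헌학술대상), 2021
    Senior Vice President, Korean Society for CDE (한국CDE학회); recipient of the Gaheon Academic Award (가헌학술상), 2002, 2005, 2009
    Editorial board member, Computer-Aided Design journal (since 2005)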

Lecture plan

Lectures
  1. Introduction
    1. What is reinforcement learning?
    2. Characteristics and examples of reinforcement learning
    3. Dynamic systems
    4. Components of reinforcement learning
    5. Quiz 1
  2. Markov Decision Process
    1. Markov Chain
    2. Markov Reward Process
    3. Markov Decision Process
    4. Quiz 2
  3. Dynamic Programming
    1. Dynamic programming?
    2. Policy evaluation
    3. Optimal policy
    4. Asynchronous DP
    5. Quiz 3
  4. Monte Carlo methods
    1. Overview of Monte Carlo methods
    2. Stochastic approximation
    3. MC policy evaluation
    4. MC control
    5. Quiz 4
  5. Temporal difference methods
    1. Overview of TD learning
    2. TD control
    3. Q learning
    4. Double Q learning
    5. Quiz 5
  6. n-Step TD methods
    1. n-step return
    2. TD(λ) policy evaluation
    3. Eligibility traces and TD control
    4. Q(λ) algorithm
    5. Quiz 6
  7. Value function approximation
    1. Overview of value function approximation
    2. Features for VFA
    3. Application of VFA: Cartpole
    4. Linear VFA for Cartpole
    5. Quiz 7