In a predictive action determination apparatus (10), a state observation
section (12) observes a state with respect to an environment (11) and
obtains state data s(t). An environment prediction section (13) predicts,
based on the state data s(t), a future state change in the environment. A
target state determination section (15) determines, as a target state, a
future state suitable for action determination with reference to a state
value storage section (14). A prediction-based action determination
section (16) determines an action based on the determined target state.
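
The flow from state data s(t) to an action can be illustrated with a minimal sketch. Everything concrete in it is an assumption made for illustration and is not taken from the apparatus itself: the state is a single integer, the environment model is a caller-supplied `step` function, the state value storage section (14) is modeled as a dictionary, and "suitable for action determination" is interpreted as "the earliest predicted state whose stored value reaches a threshold". The function names, the `action_table`, and the default action are likewise hypothetical.

```python
from typing import Callable, Dict, List, Optional, Tuple

State = int  # hypothetical toy state: one integer position


def predict_future_states(s_t: State, horizon: int,
                          step: Callable[[State], State]) -> List[State]:
    """Environment prediction section (13): roll s(t) forward `horizon` steps."""
    predicted = []
    s = s_t
    for _ in range(horizon):
        s = step(s)  # assumed environment model, supplied by the caller
        predicted.append(s)
    return predicted


def determine_target_state(predicted: List[State],
                           state_values: Dict[State, float],
                           threshold: float) -> Optional[Tuple[int, State]]:
    """Target state determination section (15).

    Assumed rule: return (steps_ahead, state) for the earliest predicted
    state whose stored value is at least `threshold`, or None if no
    predicted state qualifies. Unknown states are given value 0.0.
    """
    for steps_ahead, s in enumerate(predicted, start=1):
        if state_values.get(s, 0.0) >= threshold:
            return steps_ahead, s
    return None


def determine_action(target: Optional[Tuple[int, State]],
                     action_table: Dict[State, str],
                     default_action: str) -> str:
    """Prediction-based action determination section (16)."""
    if target is None:
        return default_action
    _, target_state = target
    return action_table.get(target_state, default_action)


def decide(s_t: State, horizon: int, step: Callable[[State], State],
           state_values: Dict[State, float], threshold: float,
           action_table: Dict[State, str], default_action: str = "wait") -> str:
    """Chain sections (13), (15), and (16) for one observed state s(t)."""
    predicted = predict_future_states(s_t, horizon, step)
    target = determine_target_state(predicted, state_values, threshold)
    return determine_action(target, action_table, default_action)


if __name__ == "__main__":
    # Toy environment: the state advances by one each time step.
    action = decide(
        s_t=1, horizon=5, step=lambda s: s + 1,
        state_values={3: 0.2, 5: 0.9},  # stand-in for storage section (14)
        threshold=0.5,
        action_table={5: "catch"},
    )
    print(action)
```

In this toy run the predicted states are 2 through 6, state 5 is the first whose stored value reaches the threshold, and the action associated with it is selected. Other selection rules, such as taking the highest-valued state within the horizon, would fit the same structure.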