A general-purpose knowledge discovery method that finds knowledge efficiently by
selectively sampling, from a database, only data having a large information
amount. Learning means
104 causes a lower-order learning algorithm, input via an input unit
107, to learn from a plurality of partial samples generated by sampling
the data stored in a high-speed main memory 120, thereby obtaining a plurality of hypotheses.
Data selection means 105 uses the hypotheses to estimate the information amount
of each candidate datum read from a large-capacity data storage device 130,
and additionally stores in the high-speed main memory 120 only those data having
a large information amount. A control unit 106 repeats this learning and data
selection a predetermined
number of times and stores the final hypotheses thus obtained. A prediction unit 102
uses the final hypotheses to predict the label value of data with an unknown label
input via the input unit 107, and an output unit 101 outputs the predicted value.
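
By way of illustration only, the repeated learning and data selection described
above might be sketched in Python as follows. The abstract does not specify the
lower-order learning algorithm, the sampling scheme, or how the information
amount is estimated. The sketch therefore assumes a one-dimensional threshold
classifier (train_stump), bootstrap resampling of the main memory, and vote
disagreement among the hypotheses as the information amount. These choices, and
all function names, parameters, and default values, are hypothetical.

```python
import random
from collections import Counter


def train_stump(sample):
    """Assumed lower-order learner: best 1-D threshold on (x, label) pairs, label in {0, 1}."""
    xs = sorted({x for x, _ in sample})
    # Candidate thresholds: below all points, between neighbours, above all points.
    candidates = [xs[0] - 1.0] + [(a + b) / 2 for a, b in zip(xs, xs[1:])] + [xs[-1] + 1.0]
    best = None
    for t in candidates:
        for polarity in (0, 1):
            errors = sum((int(x > t) ^ polarity) != y for x, y in sample)
            if best is None or errors < best[0]:
                best = (errors, t, polarity)
    _, t, polarity = best
    return lambda x: int(x > t) ^ polarity


def learn_hypotheses(memory, n_hypotheses, rng):
    """Learning means 104: one hypothesis per partial sample (bootstrap is an assumption)."""
    hypotheses = []
    for _ in range(n_hypotheses):
        sample = [rng.choice(memory) for _ in memory]
        hypotheses.append(train_stump(sample))
    return hypotheses


def information_amount(hypotheses, x):
    """Assumed estimate: fraction of votes against the majority (0.0 means unanimous)."""
    votes = Counter(h(x) for h in hypotheses)
    return 1.0 - max(votes.values()) / len(hypotheses)


def select_data(hypotheses, candidates, threshold):
    """Data selection means 105: keep only candidates with a large information amount."""
    return [(x, y) for x, y in candidates
            if information_amount(hypotheses, x) >= threshold]


def discover(storage, memory, rounds, n_hypotheses=11, batch=50, threshold=0.2, seed=0):
    """Control unit 106: repeat learning and selection a predetermined number of times."""
    rng = random.Random(seed)
    memory = list(memory)          # stands in for high-speed main memory 120
    hypotheses = []
    for _ in range(rounds):
        hypotheses = learn_hypotheses(memory, n_hypotheses, rng)
        # Candidate data read from large-capacity storage 130 (random reads assumed).
        candidates = [rng.choice(storage) for _ in range(batch)]
        memory.extend(select_data(hypotheses, candidates, threshold))
    return hypotheses, memory      # hypotheses of the last round are the final ones


def predict(hypotheses, x):
    """Prediction unit 102: majority vote of the final hypotheses (voting is an assumption)."""
    return Counter(h(x) for h in hypotheses).most_common(1)[0][0]
```

In this sketch, a call such as discover(storage, memory, rounds=5) returns the
final hypotheses together with the enlarged main memory, and predict(hypotheses, x)
then yields the label value for a datum x with an unknown label. Vote disagreement
is used only because it is simple to compute from plural hypotheses; any other
estimate of the information amount could be substituted in information_amount.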