An apparatus and method are provided for efficiently constructing the learning data required by statistical methods used in information retrieval, information extraction, translation, natural language processing, and similar fields. The method includes the steps of: generating learning models by performing machine learning on learning data; automatically attaching tags to a raw corpus using the generated learning models, thereby generating learning data candidates; calculating confidence scores for the generated learning data candidates and selecting a learning data candidate based on those scores; and allowing a user to correct errors in the selected learning data candidate through an interface and adding the error-corrected candidate to the learning data, so that new learning models are generated incrementally.
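
The steps of the method can be illustrated with a minimal sketch of one round of the loop. Everything in it is an assumption made for illustration and is not taken from the patent: the "learning model" is a toy per-word tag-frequency table, the confidence score is the relative frequency of the most frequent tag, candidates with the lowest confidence are the ones selected for review, and the user interface is stood in for by a hypothetical `correct` callback. The names `train`, `tag_with_confidence`, and `bootstrap_round` are likewise hypothetical.

```python
from collections import Counter, defaultdict


def train(learning_data):
    """Build a toy model: per-word tag counts from (word, tag) pairs."""
    model = defaultdict(Counter)
    for word, tag in learning_data:
        model[word][tag] += 1
    return model


def tag_with_confidence(model, word, default_tag="O"):
    """Return (tag, confidence). Unseen words get default_tag with confidence 0.0."""
    counts = model.get(word)
    if not counts:
        return default_tag, 0.0
    tag, n = counts.most_common(1)[0]
    return tag, n / sum(counts.values())


def bootstrap_round(learning_data, raw_corpus, correct, batch_size=2):
    """Run one round: train, auto-tag, score, select, correct, and extend the data."""
    # Step 1: generate a learning model from the current learning data.
    model = train(learning_data)
    # Step 2: tag the raw corpus automatically to produce candidates.
    candidates = [(w, *tag_with_confidence(model, w)) for w in raw_corpus]
    # Step 3: select by confidence (assumption: least confident first).
    candidates.sort(key=lambda c: c[2])
    selected = candidates[:batch_size]
    # Step 4: the user corrects the selected candidates (simulated by `correct`).
    corrected = [(w, correct(w, t)) for w, t, _ in selected]
    # The corrected candidates join the learning data for the next round.
    return learning_data + corrected


if __name__ == "__main__":
    seed = [("Paris", "LOC"), ("Paris", "LOC"), ("Apple", "ORG"), ("Apple", "FRUIT")]
    raw = ["Paris", "Apple", "Seoul"]
    gold = {"Paris": "LOC", "Apple": "ORG", "Seoul": "LOC"}  # stands in for the user
    updated = bootstrap_round(seed, raw, lambda w, t: gold[w])
    print(updated[-2:])
```

Calling `bootstrap_round` repeatedly, each time passing in the data returned by the previous call, gives the incremental behavior described above: every round retrains on a slightly larger, human-verified set. Selecting the least confident candidates is only one possible policy; the abstract states that selection uses the confidence scores without fixing the direction.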

 