A by-line extraction method detects a set of potential headlines from the title meta-tag of a crawled document, selects a candidate headline from that set, and extracts by-line information from the document using the location of the selected candidate headline. The method constructs the set of potential headlines from the title meta-tag and selects the candidate headline by evaluating the potential headlines in order of their lengths. It then uses the location of the selected candidate headline to extract a string representing a date, a name, or a source located within a minimum distance of that location.
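
The three steps described above can be illustrated with a minimal Python sketch. It is not the patented implementation: the separator characters used to split the title, the longest-first evaluation order, the character window standing in for the "minimum distance", and the date and name patterns are all assumptions made for illustration, and every function name is hypothetical. Source extraction is left out for brevity.

```python
import re

# Assumption: titles often join a headline and a site name with one of these separators.
SEPARATOR_RE = re.compile(r"\s+[|\-–:»]\s+")
# Assumption: simple patterns for an English date and a "By First Last" name.
DATE_RE = re.compile(
    r"\b(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)[a-z]*\.? \d{1,2}, \d{4}\b"
)
NAME_RE = re.compile(r"\bBy ([A-Z][a-z]+(?: [A-Z][a-z]+)+)")


def potential_headlines(title):
    """Build the set of potential headlines: the full title plus its separator-split parts."""
    parts = [p.strip() for p in SEPARATOR_RE.split(title) if p.strip()]
    return set(parts) | {title.strip()}


def select_candidate(candidates, body_text):
    """Evaluate candidates in order of length (assumed longest first) and
    return the first one found in the body, with its character offset."""
    for cand in sorted(candidates, key=lambda c: (-len(c), c)):
        pos = body_text.find(cand)
        if pos != -1:
            return cand, pos
    return None


def extract_byline(body_text, headline, pos, max_distance=80):
    """Search a window of max_distance characters on either side of the headline
    and return the first date and name matched inside that window."""
    start = max(0, pos - max_distance)
    end = pos + len(headline) + max_distance
    window = body_text[start:end]
    date = DATE_RE.search(window)
    name = NAME_RE.search(window)
    return {
        "date": date.group(0) if date else None,
        "name": name.group(1) if name else None,
    }


title = "Storm hits coast | Example News"
body = (
    "Example News\nHome > Weather\nStorm hits coast\n"
    "By Jane Doe\nMarch 3, 2006\nA storm reached the coast overnight."
)
headline, offset = select_candidate(potential_headlines(title), body)
byline = extract_byline(body, headline, offset)
```

In this example the full title does not appear in the body, so the next-longest potential headline, "Storm hits coast", is selected, and the name and date that follow it fall inside the search window.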

 