In the American legal system, judges and lawyers are continually researching
an ever-expanding body of past judicial opinions, or case law, for the
ones most relevant to the resolution of new disputes. To facilitate these
searches, some companies collect and publish the judicial opinions of
courts across the United States in both paper and electronic forms, with
some of the cases containing references to prior cases from other courts
that have previously ruled on all or part of the same dispute.
Identifying these prior cases is difficult because, for example,
conventional computer text matching both suggests too many non-prior
cases and misses too many actual prior cases. Accordingly, the
present inventors devised systems, methods, and software that generally
facilitate identification of one or more documents that are related to a
given document, and particularly facilitate identification of prior cases
for a given case. One specific embodiment retrieves prior-case candidates
based on information extracted from an input case, and then uses a
support vector machine to determine which of the prior-case candidates
are most probably prior cases for the input case.
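
The two-stage flow of that embodiment (retrieve candidates, then score them with a support vector machine) might be sketched roughly as follows. This is only an illustration under stated assumptions, not the inventors' implementation: the Case record, the three pair features (party-name overlap, docket match, date order), the tiny training set, and the hand-rolled linear SVM trained by sub-gradient descent are all hypothetical choices made to keep the example self-contained.

```python
from dataclasses import dataclass

@dataclass
class Case:
    title: str
    docket: str
    year: int

def party_tokens(title):
    # Hypothetical normalization: lowercase words, dropping the "v." separator.
    return {t for t in title.lower().split() if t != "v."}

def retrieve_candidates(input_case, corpus):
    # Stage 1 (assumed rule): keep cases sharing at least one party-name token.
    names = party_tokens(input_case.title)
    return [c for c in corpus if names & party_tokens(c.title)]

def features(input_case, candidate):
    # Hypothetical pair features; the source does not specify which are used.
    a, b = party_tokens(input_case.title), party_tokens(candidate.title)
    overlap = len(a & b) / len(a | b) if a | b else 0.0
    same_docket = 1.0 if input_case.docket == candidate.docket else 0.0
    earlier = 1.0 if candidate.year <= input_case.year else 0.0
    return [overlap, same_docket, earlier]

def train_linear_svm(X, y, lam=0.01, eta=0.05, epochs=500):
    # Sub-gradient descent on the L2-regularized hinge loss; labels are +1/-1.
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            w = [wj * (1.0 - eta * lam) for wj in w]  # regularization shrink
            if margin < 1.0:  # inside the margin: take a hinge-loss step
                w = [wj + eta * yi * xj for wj, xj in zip(w, xi)]
                b += eta * yi
    return w, b

def score(w, b, x):
    # Signed distance-like score; larger means "more probably a prior case".
    return sum(wj * xj for wj, xj in zip(w, x)) + b

# Made-up labeled pairs: +1 = prior case, -1 = not a prior case.
train_X = [[1.0, 1.0, 1.0], [0.6, 1.0, 1.0], [0.0, 0.0, 1.0], [0.2, 0.0, 0.0]]
train_y = [1, 1, -1, -1]
w, b = train_linear_svm(train_X, train_y)

# Made-up input case and corpus.
input_case = Case("Smith v. Jones", "98-1234", 2001)
corpus = [
    Case("Smith v. Jones", "98-1234", 1999),
    Case("Smith v. Acme", "95-0777", 1997),
    Case("Doe v. Roe", "97-0001", 1998),
]

# Stage 2: score each retrieved candidate and rank, best first.
candidates = retrieve_candidates(input_case, corpus)
ranked = sorted(
    ((score(w, b, features(input_case, c)), c) for c in candidates),
    key=lambda pair: pair[0],
    reverse=True,
)
for s, c in ranked:
    print(f"{s:+.2f}  {c.title} ({c.docket}, {c.year})")
```

In a real system the classifier would presumably be trained on many labeled case pairs with richer features, and a library implementation of the support vector machine would likely replace the toy trainer shown here.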