My current research interests lie in the area of Information Extraction from the WWW. To extract information from the Web, it is necessary to locate and identify entities of information on Web pages. Check out the online
VENTex and the older
VENTrec System to test our Java-based table-detection algorithm. After tables have been detected (without looking at the implementation), the next step is table recognition: of all the tables returned, only those that may contain "important" information are selected.