Information extraction is the task of finding and extracting predetermined information from Web pages and other text documents. For example, the constantly changing contact information of thousands of companies (primary and secondary business phone numbers, email addresses, names and titles of executives, and so on) could be obtained with information extraction software that has been trained to find these specific facts and store them in a relational database. The same software could be trained to extract specific employment and educational information (employers and dates employed, universities, degrees, dates attended, and so on) from millions of job applicants' text resumes.

Information extraction software is an important technology for unlocking and associating valuable information embedded in Web pages and text documents. Information held purely as text is valuable, but in that form it is difficult to analyze, sort, search, or filter, or to find patterns and trends in the key facts embedded in the content. Information extraction products let their users wade through massive amounts of unstructured text as often as necessary, whether it is found on the Internet, on company intranets and extranets, or in legacy document collections, and extract just the specific embedded information that is required. This makes it possible to create and maintain fresh databases of dynamically changing, topic-specific facts that have been extracted and associated from multiple documents, in multiple formats, in multiple locations.

I welcome any coders' questions.
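
To make the contact-information example above concrete, here is a minimal sketch of the extract-then-store idea. It is only an illustration. It uses hand-written regular expressions rather than a trained extractor, the patterns cover only simple email addresses and US-style phone numbers, and the `facts` table, the function names, and the sample text are all hypothetical.

```python
import re
import sqlite3

# Illustrative patterns only (assumption): real phone and email formats vary widely.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\(?\d{3}\)?[\s.-]?\d{3}[\s.-]?\d{4}")


def extract_contacts(text):
    """Return the email addresses and phone numbers found in the text."""
    return {
        "email": EMAIL_RE.findall(text),
        "phone": PHONE_RE.findall(text),
    }


def store_contacts(conn, source, contacts):
    """Insert one row per extracted fact into a hypothetical 'facts' table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS facts (source TEXT, kind TEXT, value TEXT)"
    )
    rows = [
        (source, kind, value)
        for kind, values in contacts.items()
        for value in values
    ]
    conn.executemany("INSERT INTO facts VALUES (?, ?, ?)", rows)
    conn.commit()


# Made-up sample text standing in for a company Web page.
page = "Contact Jane Doe, CEO, at jane.doe@example.com or (555) 123-4567."

conn = sqlite3.connect(":memory:")  # in-memory relational database for the demo
store_contacts(conn, "example.com/about", extract_contacts(page))
print(conn.execute("SELECT kind, value FROM facts ORDER BY kind").fetchall())
```

Once the facts are rows in a table, they can be sorted, searched, and filtered with ordinary SQL, which is the benefit described above. The actual project would likely need a far more robust extractor than these two patterns.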
## Deliverables
Complete source code of all programming work done