From Visual Web Scraping to Semantic Annotations

Student name: 
Stefan Prugger

This thesis is intended for a master's student with good knowledge of, and interest in, web programming. It begins with a survey of tools that collect data from web pages so that it can be reused in other applications, a task known as web scraping. Beautiful Soup and equivalent libraries parse a web page and capture the fields of interest.
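
As an illustration of what "capturing fields of interest" means in practice, the following is a minimal sketch that uses only Python's standard-library html.parser as a stand-in for Beautiful Soup or a similar library. The sample page, the class names "name" and "price", and the helper names FieldExtractor and extract_fields are assumptions made for this example; they are not part of the thesis topic.

```python
from html.parser import HTMLParser


class FieldExtractor(HTMLParser):
    """Collects the text of every element carrying a given CSS class.

    Simplifying assumption: the targeted elements are leaves, i.e. they
    contain text only and no nested tags.
    """

    def __init__(self, target_class):
        super().__init__()
        self.target_class = target_class
        self.values = []
        self._capturing = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value may be None.
        classes = (dict(attrs).get("class") or "").split()
        if self.target_class in classes:
            self._capturing = True
            self._buffer = []

    def handle_data(self, data):
        if self._capturing:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        # The first closing tag after a match ends the capture.
        if self._capturing:
            self.values.append("".join(self._buffer).strip())
            self._capturing = False


def extract_fields(html, target_class):
    """Return the text of all elements whose class list contains target_class."""
    parser = FieldExtractor(target_class)
    parser.feed(html)
    return parser.values


# Hypothetical page fragment, invented for this example.
SAMPLE_PAGE = """
<html><body>
  <div class="product">
    <span class="name">Espresso machine</span>
    <span class="price">129.00</span>
  </div>
  <div class="product">
    <span class="name">Coffee grinder</span>
    <span class="price">45.50</span>
  </div>
</body></html>
"""

names = extract_fields(SAMPLE_PAGE, "name")
prices = extract_fields(SAMPLE_PAGE, "price")
```

In a scraper like this the selection rule (here, a class name) is written by hand in code; a visual scraper would presumably let the user produce such a rule by pointing at the element on the rendered page.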
The goal of the thesis is then to define a visual web scraper, possibly built on existing programmable tools, that lets the user visually select the data to capture on web pages, and also in forms. The captured data should then be linked to an RDF document specifying its meaning, thus providing semantic annotations.
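
One possible reading of "semantic annotation" is sketched below: each captured value is attached to a property from an RDF vocabulary and serialised as N-Triples. This is only a sketch under stated assumptions. The subject IRI under example.org, the choice of the schema.org properties name and price, and the helper to_ntriples are illustrative, and a real implementation would more likely rely on an RDF library than on string formatting.

```python
def escape_literal(value):
    """Escape the characters that may not appear raw in an N-Triples literal."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def to_ntriples(subject_iri, annotations):
    """Serialise (predicate IRI, plain string value) pairs as N-Triples lines.

    Every value is written as a plain string literal; datatypes and language
    tags are deliberately left out of this sketch.
    """
    lines = []
    for predicate_iri, value in annotations:
        literal = escape_literal(value)
        lines.append(f'<{subject_iri}> <{predicate_iri}> "{literal}" .')
    return "\n".join(lines)


# Hypothetical captured fields, paired with the vocabulary terms that
# state what each value means.
captured = [
    ("https://schema.org/name", "Espresso machine"),
    ("https://schema.org/price", "129.00"),
]

document = to_ntriples("http://example.org/product/1", captured)
print(document)
```

The pairing of a captured field with a vocabulary term is the annotation step itself; in the tool envisaged here, that pairing would be chosen by the user rather than hard-coded.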