Spreading excellence and disseminating the cutting-edge results of our research and development efforts is crucial to our institute. Check out our educational offerings for Bachelor, Master and PhD studies at the University of Innsbruck!
This thesis is for a master's student with good knowledge of, and interest in, web programming. It begins by surveying tools that collect data from web pages for reuse in other applications, a task known as web scraping. Beautiful Soup and equivalent libraries parse a web page and capture fields of interest.
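To make the idea of capturing fields of interest concrete, here is a minimal sketch of the parsing step. It deliberately uses Python's standard-library html.parser as a stand-in, so that it runs without extra installs; Beautiful Soup offers a higher-level interface for the same task. The sample page, the class names "title" and "price", and the FieldScraper class are assumptions made for illustration only.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag; skipped so nesting depth stays balanced.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class FieldScraper(HTMLParser):
    """Collect the text inside elements whose class names a field of interest."""

    def __init__(self, wanted_classes):
        super().__init__()
        self.wanted = set(wanted_classes)
        self.fields = {}      # field name -> captured text
        self._current = None  # field being captured, if any
        self._depth = 0       # nesting depth inside the captured element

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self._current is not None:
            self._depth += 1
            return
        classes = (dict(attrs).get("class") or "").split()
        match = next((c for c in classes if c in self.wanted), None)
        if match is not None:
            self._current = match
            self._depth = 1
            self.fields.setdefault(match, "")

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or self._current is None:
            return
        self._depth -= 1
        if self._depth == 0:
            self._current = None

    def handle_data(self, data):
        if self._current is not None:
            self.fields[self._current] += data


# Hypothetical page used only for this example.
SAMPLE_PAGE = """
<html><body>
  <div class="product">
    <h1 class="title">Espresso Machine</h1>
    <span class="price">199.00</span>
    <p class="blurb">Compact and <b>fast</b>.</p>
  </div>
</body></html>
"""

scraper = FieldScraper(["title", "price"])
scraper.feed(SAMPLE_PAGE)
# Normalize whitespace in each captured field.
captured = {name: " ".join(text.split()) for name, text in scraper.fields.items()}
print(captured)
```

The fields are selected here by hand-written class names; that manual selection step is what a scraping library makes more convenient.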
The goal of the thesis is then to define a visual web scraper, possibly built on programmable tools, that lets the user visually capture data from web pages and also from forms. The captured data should then be linked to an RDF document specifying its meaning, thus providing semantic annotations.
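As a rough illustration of what linking captured data to RDF could look like, the sketch below turns a dictionary of captured fields into N-Triples statements. It is one possible reading of the annotation step, not a prescribed design: the subject URI, the choice of schema.org properties, and the helper names literal and annotate are all assumptions introduced for this example.

```python
def literal(value):
    """Serialize a plain string as an N-Triples literal with basic escaping."""
    escaped = (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
    )
    return f'"{escaped}"'


def annotate(subject_uri, captured, field_to_property):
    """Return one N-Triples line per captured field that has a mapped property."""
    triples = []
    for field, value in captured.items():
        prop = field_to_property.get(field)
        if prop is None:
            continue  # no known meaning for this field, so it is left unannotated
        triples.append(f"<{subject_uri}> <{prop}> {literal(value)} .")
    return triples


# Hypothetical captured data and vocabulary mapping (assumptions for illustration).
captured_fields = {
    "title": "Espresso Machine",
    "price": "199.00",
    "blurb": "Compact and fast.",
}
FIELD_TO_PROPERTY = {
    "title": "https://schema.org/name",
    "price": "https://schema.org/price",
}

triples = annotate("https://example.org/products/1", captured_fields, FIELD_TO_PROPERTY)
for line in triples:
    print(line)
```

In a visual tool, the mapping from field to property would presumably be chosen interactively by the user rather than hard-coded as it is here.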