Individual tools, services and demos
LODatio+ – Searching the Content of the Web of Data. LODatio+ is a search engine for locating data sources in the Web of Data that contain resources of specific types with specific properties. Users formulate their information need as a SPARQL query that uses only types and properties. The example queries provided help users get familiar with such queries. For each query, LODatio+ returns a ranked list of data sources in the Web of Data that contain data matching the types and properties specified in the query. In addition to the matching data sources, it provides query recommendations to generalize or narrow down the information need.
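To illustrate what a query "using only types and properties" can look like, here is a minimal sketch that assembles one in Python. The helper name `build_schema_query` is hypothetical, and the BIBO and Dublin Core terms are chosen only as an example. The exact query shape that LODatio+ accepts is an assumption here, so the example queries in the prototype remain the authoritative reference.

```python
def build_schema_query(types, properties):
    """Build a SPARQL query constraining one resource by rdf:type and by properties only.

    Hypothetical helper for illustration; not part of LODatio+.
    """
    lines = ["SELECT ?x WHERE {"]
    for t in types:
        # "a" is the SPARQL shorthand for rdf:type
        lines.append(f"  ?x a <{t}> .")
    for i, p in enumerate(properties):
        # Property values stay unconstrained variables: only the schema matters
        lines.append(f"  ?x <{p}> ?v{i} .")
    lines.append("}")
    return "\n".join(lines)


# Example information need: documents that have a title and a creator
query = build_schema_query(
    types=["http://purl.org/ontology/bibo/Document"],
    properties=[
        "http://purl.org/dc/terms/title",
        "http://purl.org/dc/terms/creator",
    ],
)
print(query)
```

Dropping a property from the list generalizes the information need and adding one narrows it, which is the kind of variation the query recommendations address.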
LODatio+ runs as part of the MOVING platform and is maintained by ZBW – Leibniz Information Centre for Economics in Kiel. Based on LODatio+, we developed the Data Integration Service for the Web of Data, which enhances the scientific content available on the MOVING platform with freely available linked open metadata from various data sources. First, we use LODatio+ to discover data sources that provide data with certain attributes, for example bibliographic metadata with title, authors, concepts or keywords, an abstract, and a link to the full text. Then, we access these data sources, retrieve the appropriate metadata, automatically convert it into our internal data format, and integrate it into the MOVING platform. For MOVING, we focused on bibliographic metadata. However, both LODatio+ and the Data Integration Service are designed to find, harvest, and convert any kind of data on the Web.
Please find more information in our short video tutorial below or try the public prototype directly.
WevQuery – A scalable system for testing hypotheses about web interaction patterns. Remotely stored user interaction logs give access to a wealth of data generated by large numbers of users, which can be used to understand whether interactive systems meet the expectations of their designers. Unfortunately, gaining detailed insight into users’ interaction behaviour still requires a high degree of expertise. WevQuery allows designers to test their hypotheses about users’ behaviour by using a graphical notation to define the interaction patterns they are looking for. WevQuery is scalable because the resulting queries are executed against large user interaction datasets using the MapReduce paradigm. In this way, WevQuery lets designers retrieve users’ interaction patterns without the burden of low-level interaction data analysis. You can find more information about the system and download it at: https://github.com/aapaolaza/WevQuery
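The idea of matching an interaction pattern with MapReduce can be sketched in a few lines of plain Python. This is an illustrative toy, not WevQuery's implementation. The log format, the event names, and the function names are all assumptions, and the pattern is deliberately simple: one event directly followed by another within a time limit.

```python
from collections import defaultdict

# Hypothetical log records: (session_id, timestamp_ms, event_name)
LOG = [
    ("s1", 0, "mouseover"),
    ("s1", 400, "click"),
    ("s2", 0, "mouseover"),
    ("s2", 5000, "click"),
    ("s1", 9000, "mouseover"),
    ("s1", 9300, "click"),
]


def map_phase(records):
    """Map: key every event by its session so sessions can be processed independently."""
    for session_id, ts, event in records:
        yield session_id, (ts, event)


def shuffle(pairs):
    """Group mapped values by key, as a MapReduce framework would do between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(events, first, second, max_gap_ms):
    """Reduce: count `first` directly followed by `second` within max_gap_ms."""
    ordered = sorted(events)  # order by timestamp
    matches = 0
    for (t1, e1), (t2, e2) in zip(ordered, ordered[1:]):
        if e1 == first and e2 == second and t2 - t1 <= max_gap_ms:
            matches += 1
    return matches


def count_pattern(records, first, second, max_gap_ms):
    groups = shuffle(map_phase(records))
    return {
        sid: reduce_phase(evts, first, second, max_gap_ms)
        for sid, evts in groups.items()
    }


result = count_pattern(LOG, "mouseover", "click", max_gap_ms=1000)
print(result)
```

Because each session is reduced independently, the reduce step can be distributed across many workers, which is what makes this style of query suitable for large logs.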
SciFiS – A Search Engine for Scientific Figures. Scientific figures such as bar charts, pie charts, maps, scatter plots, or similar infographics often include valuable textual information that is not present in the surrounding text. A new tool developed by the ZBW Knowledge Discovery Research Group makes such infographics searchable and thus offers new ways to access publications. A public prototype supports searching for infographics in open access publications taken from EconBiz. The prototype can be accessed at http://broca.informatik.uni-kiel.de:20080/ and more information about this research can be found at http://www.kd.informatik.uni-kiel.de/en/research/software/text-extraction.
Interactive online demo with audio and video analysis results in lecture and non-lecture videos. CERTH released an interactive online demo linking lecture videos, using general purpose concepts that were produced from textual analysis of their transcripts, with non-lecture videos, using their visual analysis results such as automatically detected shots, scenes, and visual concepts. You can access the demo at: http://multimedia2.iti.gr/moving-project/lecture-video-linking-demo/results.html (best viewed with Firefox).
Scientific Paper Recommendation using Sparse Title Data. The system recommends scientific papers in economics based on what a social media user has tweeted. It profiles papers as well as tweets using our novel method HCF-IDF (Hierarchical Concept Frequency Inverse Document Frequency). HCF-IDF extracts semantic concepts from texts and applies spreading activation based on a hierarchical thesaurus; such thesauri are freely available in many different domains. Spreading activation makes it possible to extract relevant semantic concepts that are not mentioned in the text, which mitigates the shortness and sparseness of the texts. HCF-IDF demonstrated the best performance in a larger user experiment published at JCDL’16. In this demo, you can compare two configurations: HCF-IDF using only the titles of papers, and HCF-IDF using both titles and full texts. Unlike traditional methods, HCF-IDF can provide competitive recommendations using titles alone.
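The following minimal sketch shows how spreading activation over a hierarchical thesaurus can surface concepts that a short text never mentions. It is a simplification and not the published HCF-IDF formula. The toy thesaurus, the fixed `decay` factor, the document frequencies, and all function names are assumptions made for this example.

```python
import math

# Hypothetical toy thesaurus: concept -> its broader (parent) concept
BROADER = {
    "monetary policy": "economic policy",
    "fiscal policy": "economic policy",
    "economic policy": "economics",
}


def concept_counts(text, concepts):
    """Count occurrences of each concept label in the lower-cased text."""
    text = text.lower()
    counts = {c: text.count(c) for c in concepts}
    return {c: n for c, n in counts.items() if n > 0}


def spread(counts, broader, decay=0.5):
    """Propagate each concept's count to its broader concepts, damped per level."""
    scores = dict(counts)
    for concept, count in counts.items():
        weight = count * decay
        parent = broader.get(concept)
        while parent is not None:
            scores[parent] = scores.get(parent, 0.0) + weight
            weight *= decay
            parent = broader.get(parent)
    return scores


def idf_weight(scores, doc_freq, n_docs):
    """Weight activated concept scores by inverse document frequency."""
    return {c: s * math.log(n_docs / doc_freq[c]) for c, s in scores.items()}


title = "Monetary policy and fiscal policy after the crisis"
all_concepts = set(BROADER) | set(BROADER.values())
counts = concept_counts(title, all_concepts)
scores = spread(counts, BROADER)

# Hypothetical corpus statistics: 100 documents in total
doc_freq = {
    "monetary policy": 5,
    "fiscal policy": 5,
    "economic policy": 20,
    "economics": 50,
}
profile = idf_weight(scores, doc_freq, n_docs=100)
print(sorted(profile.items(), key=lambda kv: -kv[1]))
```

In this example, "economic policy" and "economics" receive a score although neither appears in the title, which is the effect that helps with short and sparse texts.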
Support Vector Machine with Gaussian Sample Uncertainty (SVM-GSU). SVM-GSU is a maximum margin classifier that deals with uncertainty in data input. More specifically, the SVM framework is reformulated such that each training example can be modeled by a multi-dimensional Gaussian distribution described by its mean vector and its covariance matrix – the latter modeling the uncertainty. We address the classification problem and define a cost function that is the expected value of the classical SVM cost when data samples are drawn from the multi-dimensional Gaussian distributions that form the set of the training examples. Our formulation approximates the classical SVM formulation when the training examples are isotropic Gaussians with variance tending to zero. We arrive at a convex optimization problem, which we solve efficiently in the primal form using a stochastic gradient descent approach. The resulting classifier, which we name SVM with Gaussian Sample Uncertainty (SVM-GSU), is tested on synthetic data and five publicly available and popular datasets; namely, the MNIST, WDBC, DEAP, TV News Channel Commercial Detection, and TRECVID MED datasets. Experimental results verify the effectiveness of the proposed method.
C. Tzelepis, V. Mezaris, I. Patras, “Linear Maximum Margin Classifier for Learning from Uncertain Data”, IEEE Transactions on Pattern Analysis and Machine Intelligence, accepted for publication. DOI:10.1109/TPAMI.2017.2772235. An alternate preprint version is available here.
Code available at: https://github.com/chi0tzp/svm-gsu
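The SVM-GSU cost described above is the expected hinge loss under a Gaussian input, and a small sketch can make it concrete. The code below is not taken from the linked repository. It relies on the standard closed form for E[max(0, Z)] of a univariate Gaussian Z, and the function name and argument layout are assumptions made for illustration. For x ~ N(mean, cov), the quantity 1 - y(w·x + b) is Gaussian with mean d = 1 - y(w·mean + b) and variance s² = wᵀ cov w, so the expectation is d·Φ(d/s) + s·φ(d/s).

```python
import math


def expected_hinge_loss(w, b, mean, cov, y):
    """Expected value of max(0, 1 - y(w.x + b)) for x ~ N(mean, cov).

    Illustrative sketch; w and mean are lists, cov is a list of lists, y is +1 or -1.
    """
    n = len(w)
    # Mean of the Gaussian variable 1 - y(w.x + b)
    d = 1.0 - y * (sum(wi * mi for wi, mi in zip(w, mean)) + b)
    # Its variance is w^T cov w (y^2 = 1, so the label does not affect it)
    var = sum(w[i] * cov[i][j] * w[j] for i in range(n) for j in range(n))
    s = math.sqrt(var)
    if s == 0.0:
        # No uncertainty: the classical hinge loss
        return max(0.0, d)
    z = d / s
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))            # Phi(z)
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)     # phi(z)
    return d * cdf + s * pdf


w, b, y = [1.0, 0.0], 0.0, 1
mean = [0.5, 0.0]
identity = [[1.0, 0.0], [0.0, 1.0]]
tiny = [[1e-12, 0.0], [0.0, 1e-12]]

loss_uncertain = expected_hinge_loss(w, b, mean, identity, y)
loss_near_certain = expected_hinge_loss(w, b, mean, tiny, y)
print(loss_uncertain, loss_near_certain)
```

With the near-zero isotropic covariance, the value approaches the classical hinge loss of 0.5 for this point, consistent with the limiting behaviour stated in the description. With larger uncertainty the expected loss is higher.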
Related tools and demos by the MOVING partners
Text Extraction from Scholarly Figures. Scholarly figures are data visualizations such as bar charts, pie charts, line graphs, maps, scatter plots, or similar figures. Text extraction from scholarly figures is useful in many application scenarios, since the text in these figures often contains information that is not present in the surrounding text. From an analysis of the broad body of research on text extraction from figures, we derived a generic pipeline and implemented more than 20 methods in total for its six sequential steps.
Interactive on-line video analysis service. The service lets you upload videos via a web interface and performs shot/scene segmentation and visual concept detection (several times faster than real time, using our new concept detection engine). Results are displayed in an interactive user interface, which allows navigating through the video structure (shots, scenes), viewing the concept detection results for each shot, and searching by concepts within the video. Try this service now!