Lecture video segmentation dataset

Created by: D. Galanopoulos (CERTH); V. Mezaris (CERTH)

Description: A large-scale lecture video dataset, consisting of artificially-generated lectures and the corresponding ground-truth fragmentation, is published under an open-source license on GitHub for the purpose of evaluating lecture video fragmentation techniques. The dataset is also available in the Zenodo repository. To create this dataset, 1498 speech transcript files (generated automatically by ASR software) were used from VideoLectures.NET, the world’s largest academic online video repository. These transcripts correspond to lectures from various fields of science, such as computer science, mathematics, medicine and politics.

Available at: @Zenodo, @GitHub

 

Initial evaluation of the MOVING platform

Created by: A. Apaolaza (UMAN)

Description: This resource contains information about the laboratory study carried out as an initial evaluation of the MOVING platform. It comprises the information gathered from the questionnaires and the qualitative notes taken during the study for two different use cases. The design of the study and the results of the analysis can be found in deliverable D1.3: Initial evaluation, updated requirements and specifications.

Available at: @Zenodo

 

Concept detection scores for the MED16train dataset (TRECVID MED task)

Created by: D. Galanopoulos (CERTH), F. Markatopoulou (CERTH), V. Mezaris (CERTH)

Description: We provide concept detection scores for the MED16train dataset, which is used in the TRECVID Multimedia Event Detection (MED) task. First, each video was decoded into a set of keyframes at fixed temporal intervals (2 keyframes per second). Then, concept detection scores were calculated for the following two concept sets: i) 487 sport-related concepts from the YouTube Sports-1M dataset [1], and ii) 345 TRECVID SIN concepts [3]. The scores were generated as follows:
1) For the 487 Sports-1M concepts, a GoogLeNet network [4] originally trained on 5055 ImageNet concepts was fine-tuned, following the extension strategy of [2] with one extension layer of dimension 128.
2) For the 345 TRECVID SIN concepts, a GoogLeNet network [4] pre-trained on the same 5055 ImageNet concepts was fine-tuned on these concepts, again following the extension strategy of [2] with one extension layer of dimension 1024.
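The fixed-interval keyframe sampling described above (2 keyframes per second) can be sketched as follows. This is an illustrative reimplementation, not the authors' code; the function name is hypothetical, and in practice the frame indices would be used to pull frames from a video decoder such as OpenCV.

```python
def keyframe_indices(num_frames, video_fps, sample_rate=2.0):
    """Return the indices of frames sampled at `sample_rate` keyframes
    per second of video time, for a video with `num_frames` frames
    decoded at `video_fps` frames per second."""
    # Grab every `step`-th decoded frame; guard against step < 1
    # when the requested rate exceeds the video's frame rate.
    step = max(int(round(video_fps / sample_rate)), 1)
    return list(range(0, num_frames, step))
```

For example, a 50-frame clip at 30 fps sampled at 2 keyframes per second yields frames 0, 15, 30 and 45; each sampled frame would then be scored against the 487 or 345 concepts.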

After unpacking the compressed file, two folders can be found, “Prob_sports_MED16train” and “Prob_SIN_MED16train”, one for each concept set. For each concept set we provide one file per video of the MED16train dataset. Each file consists of N columns (N = 345 for TRECVID SIN, N = 487 for Sports-1M) and M rows (where M is the number of keyframes extracted from the corresponding video). Each column corresponds to a different concept, and all scores lie in the range [0,1]; the higher the score, the more likely it is that the corresponding concept appears in the keyframe. Two additional files, “sports_487_Classes.txt” and “SIN_345_Classes.txt”, indicate the order of the concepts used in the score files.
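A score file with this M-rows-by-N-columns layout can be consumed with a few lines of code. The sketch below is hypothetical: the function names are ours, and whitespace-separated values are an assumption, not something stated in the dataset documentation.

```python
def load_scores(path):
    """Parse one score file (assumed whitespace-separated text) into a
    list of M rows, each a list of N per-concept scores in [0,1]."""
    with open(path) as f:
        return [[float(v) for v in line.split()] for line in f if line.strip()]

def top_concept(row, class_names):
    """Return (name, score) of the highest-scoring concept for one
    keyframe, using the concept order from the *_Classes.txt file."""
    best = max(range(len(row)), key=row.__getitem__)
    return class_names[best], row[best]
```

Pairing each row of a file from “Prob_SIN_MED16train” with the names in “SIN_345_Classes.txt” (read as one name per line, in order) would then give the most likely SIN concept for each keyframe.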

[1] A. Karpathy, G. Toderici, S. Shetty, T. Leung, R. Sukthankar and L. Fei-Fei, “Large-scale video classification with convolutional neural networks”, In Proceedings of the IEEE conference on Computer Vision and Pattern Recognition, pp. 1725-1732, 2014.

[2] N. Pittaras, F. Markatopoulou, V. Mezaris and I. Patras, “Comparison of Fine-tuning and Extension Strategies for Deep Convolutional Neural Networks”, Proc. 23rd Int. Conf. on MultiMedia Modeling (MMM’17), Reykjavik, Iceland, Springer LNCS vol. 10132, pp. 102-114, Jan. 2017.

[3] G. Awad, C. Snoek, A. Smeaton, and G. Quénot, “TRECVid semantic indexing of video: a 6-year retrospective”, ITE Transactions on Media Technology and Applications, 4(3), pp. 187-208, 2016.

[4] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke and A. Rabinovich, “Going deeper with convolutions”, In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1-9, 2015.

Available at: @Zenodo

 

IMPULSE: Integrate Public Metadata Underneath professional Library SErvices

Created by: S. Lüdeke, T. Blume, A. Scherp (ZBW)

Description: This repository contains the results of our experimental evaluation conducted with IMPULSE. The datasets are subsets of the Billion Triple Challenge Dataset 2014 containing only bibliographic metadata.

Available at: @Zenodo