%0 Conference Proceedings
%T Automatic In-the-wild Dataset Annotation with Deep Generalized Multiple Instance Learning
%A Correia, Joana
%A Trancoso, Isabel
%A Raj, Bhiksha
%Y Calzolari, Nicoletta
%Y Béchet, Frédéric
%Y Blache, Philippe
%Y Choukri, Khalid
%Y Cieri, Christopher
%Y Declerck, Thierry
%Y Goggi, Sara
%Y Isahara, Hitoshi
%Y Maegaard, Bente
%Y Mariani, Joseph
%Y Mazo, Hélène
%Y Moreno, Asuncion
%Y Odijk, Jan
%Y Piperidis, Stelios
%S Proceedings of the Twelfth Language Resources and Evaluation Conference
%D 2020
%8 May
%I European Language Resources Association
%C Marseille, France
%@ 979-10-95546-34-4
%G English
%F correia-etal-2020-automatic
%X The automation of the diagnosis and monitoring of speech-affecting diseases, such as Depression or Parkinson’s disease, in real-life situations depends on the existence of rich and large datasets that resemble real-life conditions, such as those collected from in-the-wild multimedia repositories like YouTube. However, the cost of manually labeling these large datasets can be prohibitive. In this work, we propose to overcome this problem by automating the annotation process, without requiring any human intervention. We formulate the annotation problem as a Multiple Instance Learning (MIL) problem, and propose a novel solution that is based on end-to-end differentiable neural networks. Our solution has the additional advantage of generalizing the MIL framework to more scenarios where the data is still organized in bags but does not meet the MIL bag label conditions. We demonstrate the performance of the proposed method in labeling the in-the-Wild Speech Medical (WSM) Corpus, using simple textual cues extracted from videos and their metadata. Furthermore, we show the contribution of each type of textual cue to the final model performance, and study the influence of the size of the bags of instances on the difficulty of the learning problem.
%U https://aclanthology.org/2020.lrec-1.435
%P 3542-3550