Abstract
This paper describes the VGG-Seq2Seq system for the Medical Domain Visual Question Answering (VQA-Med) Task of ImageCLEF 2018. The proposed system follows the encoder-decoder architecture: the encoder fuses a pretrained VGG network with an LSTM network that has a pretrained word embedding layer to encode the input, and a second LSTM network decodes the output. With the pretrained VGG network, the VGG-Seq2Seq model achieved reasonable results, scoring 0.06, 0.12, and 0.03 on BLEU, WBSS, and CBSS, respectively. Moreover, the VGG-Seq2Seq model is inexpensive to train.
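
To make the BLEU figure above easier to interpret, the following is a minimal sketch of an unsmoothed sentence-level BLEU score: the geometric mean of clipped n-gram precisions (n = 1 to 4) multiplied by a brevity penalty. It is an illustration only, not the official ImageCLEF evaluation script. The function names `ngrams` and `sentence_bleu`, the whitespace tokenisation, the lowercasing, and the absence of smoothing are all assumptions made for this example.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all contiguous n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu(reference, candidate, max_n=4):
    """Unsmoothed BLEU for one candidate against one reference.

    Assumption: tokens are lowercased, whitespace-separated words.
    Returns 0.0 as soon as any n-gram order has no match.
    """
    ref = reference.lower().split()
    cand = candidate.lower().split()
    if not cand:
        return 0.0

    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngrams(cand, n)
        ref_counts = ngrams(ref, n)
        total = sum(cand_counts.values())
        # Clip each candidate n-gram count by its count in the reference.
        clipped = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        if total == 0 or clipped == 0:
            return 0.0
        log_precisions.append(math.log(clipped / total))

    # Brevity penalty: penalise candidates shorter than the reference.
    if len(cand) >= len(ref):
        bp = 1.0
    else:
        bp = math.exp(1 - len(ref) / len(cand))
    return bp * math.exp(sum(log_precisions) / max_n)


if __name__ == "__main__":
    # Made-up example strings, not taken from the VQA-Med data.
    print(sentence_bleu("the x ray shows a fracture",
                        "the x ray shows a tumor"))
```

In this unsmoothed form, a candidate that shares no 4-gram with the reference scores 0, so short or loosely worded answers are penalised heavily.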
| Original language | English |
|---|---|
| Journal | CEUR Workshop Proceedings |
| Volume | 2125 |
| State | Published - 2018 |
| Externally published | Yes |
| Event | 19th Working Notes of CLEF Conference and Labs of the Evaluation Forum, CLEF 2018 - Avignon, France; Duration: 10 Sep 2018 → 14 Sep 2018 |
Keywords
- Global Vectors for Word Representation
- Sequence to sequence
- VGG Network
Fingerprint
Dive into the research topics of 'JUST at VQA-Med: A VGG-Seq2Seq model'. Together they form a unique fingerprint.