MERCon - 2022

Permanent URI for this collection: http://192.248.9.226/handle/123/18494

item: Conference-Full-text
Adversarial learning to improve question image embedding in medical visual question answering
(IEEE, 2022-07) Silva, K; Maheepala, T; Tharaka, K; Ambegoda, TD; Rathnayake, M; Adhikariwatte, V; Hemachandra, K
Visual Question Answering (VQA) is a computer vision task in which a system produces an accurate answer given an image and a question relevant to that image. Medical VQA is a subfield of general VQA that focuses on images and questions in the medical domain. The most crucial task of a VQA model is to learn a question-image joint representation that reflects the information related to the correct answer. Despite significant recent progress in general VQA models, medical VQA remains difficult because its question-image embeddings are ineffective. To address this problem, we propose a new method for training VQA models that uses adversarial learning to improve the question-image embedding, and we illustrate how this embedding can be used as the ideal embedding for answer inference. For adversarial learning, we use two embedding generators (a question-image embedding generator and a question-answer embedding generator) and a discriminator that differentiates between the two embeddings. The question-answer embedding serves as the ideal embedding, and the question-image embedding is improved with reference to it. The experimental results indicate that pre-training the question-image embedding generation module using adversarial learning improves overall performance, implying the effectiveness of the proposed method.
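The adversarial setup described in the abstract can be illustrated with a toy sketch. It is not the authors' implementation: the abstract does not specify architectures or losses, so this assumes a standard binary cross-entropy adversarial objective, a linear discriminator, and a question-image (QI) embedding that is updated directly in place of a trained neural generator. All function names and the two-dimensional embeddings are hypothetical.

```python
import math

EPS = 1e-12  # guards against log(0)


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def discriminate(w, b, emb):
    """Probability that `emb` is a question-answer (reference) embedding."""
    return sigmoid(sum(wi * ei for wi, ei in zip(w, emb)) + b)


def discriminator_loss(p_qa, p_qi):
    """Binary cross-entropy with QA embeddings labelled 1 and QI embeddings labelled 0."""
    return -(math.log(p_qa + EPS) + math.log(1.0 - p_qi + EPS))


def generator_loss(p_qi):
    """Low when the discriminator mistakes the QI embedding for a QA embedding."""
    return -math.log(p_qi + EPS)


def discriminator_step(w, b, qa_emb, qi_emb, lr):
    """One gradient-descent step on the discriminator loss."""
    p_qa = discriminate(w, b, qa_emb)
    p_qi = discriminate(w, b, qi_emb)
    new_w = [wi - lr * (-(1.0 - p_qa) * a + p_qi * q)
             for wi, a, q in zip(w, qa_emb, qi_emb)]
    new_b = b - lr * (-(1.0 - p_qa) + p_qi)
    return new_w, new_b


def generator_step(w, b, qi_emb, lr):
    """One gradient-descent step on the generator loss.

    Assumption: the QI embedding itself is the trainable quantity; in a real
    model the gradient would flow back into the QI generator's weights.
    """
    p_qi = discriminate(w, b, qi_emb)
    return [q + lr * (1.0 - p_qi) * wi for q, wi in zip(qi_emb, w)]


def train(qa_emb, qi_emb, steps=200, lr=0.1):
    """Alternate discriminator and generator updates on toy embeddings."""
    w, b = [0.0] * len(qa_emb), 0.0
    for _ in range(steps):
        w, b = discriminator_step(w, b, qa_emb, qi_emb, lr)
        qi_emb = generator_step(w, b, qi_emb, lr)
    return w, b, qi_emb
```

Calling `train([1.0, 0.0], [0.0, 1.0])` alternates the two updates on a hypothetical pair of embeddings. Each generator step moves the QI embedding in the direction that raises the discriminator's score for it, which is the sense in which the QI embedding is "improved with reference to" the QA embedding.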
