Researchers evaluated the bots’ performance on EyeQuiz, a platform containing ophthalmology board certification examination practice questions.
They examined the bots’ accuracy, response length, response time, and provision of explanations, as well as subspecialty-specific performance.
Overall, Google Gemini and Bard each achieved 71% accuracy across the 150 text-based multiple-choice questions.
Both bots performed acceptably on the exam questions, but they also tended to offer confident explanations even when their answers were incorrect.
