Exploring Large Language Models and the Metaverse for Urologic Applications: Potential, Challenges, and the Path Forward
Abstract
The metaverse, a 3-dimensional digital platform that enables users to interact and engage in realistic virtual activities beyond time and space limitations, has garnered significant investment across industries, particularly in healthcare. In the medical field, the metaverse shows promise as a digital therapeutic platform to enhance interaction between medical professionals and patients. Concurrently, generative artificial intelligence, especially large language models (LLMs), is being integrated into healthcare for applications in data analysis, image recognition, and natural language processing. In urology, LLMs are increasingly used for tasks such as image diagnosis, data processing, patient education, and treatment assistance, providing significant support in clinical settings. By combining LLMs with the immersive capabilities of the metaverse, new possibilities emerge to improve urologic treatment in areas that require consistent treatments, habit formation, and long-term management. This paper reviews current research and applications of LLMs in urology, discusses the challenges associated with their use, including data quality, bias, security, and ethical issues, and explores the need for regulatory standards. Furthermore, it highlights the potential of a metaverse-based digital platform to improve urologic care and streamline information exchange to maximize the benefits of this integrated approach in future healthcare applications.
INTRODUCTION
The metaverse, a term combining ‘meta,’ meaning virtual or transcendent, and ‘universe,’ refers to a digital platform that enables users to engage in various real-world activities through their avatars within a 3-dimensional virtual environment [1,2]. Digital therapeutics, on the other hand, is a term used to describe the treatment, prevention, and management of chronic diseases or symptoms requiring ongoing care through digital software, as opposed to traditional treatments like medication or surgery. Integrating digital therapeutics into the virtual metaverse environment enables the delivery of diverse medical services within this immersive space [3,4]. Above all, metaverse-based digital therapeutics are recognized as a groundbreaking technology that offers immersive, interactive, and highly personalized learning environments that can enhance patient engagement and improve treatment outcomes [3]. Table 1 summarizes healthcare applications of digital therapeutics within metaverse platforms [2,3].
The integration of artificial intelligence (AI) into metaverse-based digital therapeutics has significantly expanded its applications in the medical field. AI is now widely utilized for a variety of tasks, including diagnosing and treating diseases, developing medical robots, interpreting and generating medical images, predicting surgical outcomes, and analyzing treatment data [5]. A primary advantage of AI in healthcare is its capacity to process large volumes of medical data with speed and accuracy. In the field of urology, AI has been actively implemented for disease diagnosis, risk analysis, prevention, treatment, and patient follow-up [5,6]. Table 2 summarizes the use cases and types of AI applications within the field of urology [4,5,7-12].
Generative AI technologies, particularly large language models (LLMs), are increasingly being adopted across the healthcare sector to address complex challenges such as image recognition and natural language processing (NLP). Using deep learning and multilayer neural networks to analyze extensive patient data, LLMs make medical resources and information more accessible to patients. They also enhance the efficiency of healthcare providers by automating the analysis of medical imaging, pathology, and clinical text data and by providing diagnostic support, ultimately reducing treatment costs and associated risks. This, in turn, improves patient quality of life and satisfaction [6].
In urology, LLMs support healthcare providers during the treatment process, assist in patient education by answering questions about urological conditions, treatments, and procedures, and facilitate informed decision-making for patients [13-15]. Despite their advantages, these AI-driven tools raise important challenges and highlight the urgent need for policies that ensure the safe and ethical use of generative AI in medical and scientific contexts [13,14,16]. Thus, discussions are essential to explore how generative AI can create tangible business value in healthcare [16].
This paper reviews current research and applications of LLMs, a crucial aspect of generative AI, within the field of urology. It also examines the potential and need for digital platforms like the metaverse to provide enhanced urological care and support efficient, accessible information exchange services.
CONCEPT AND CURRENT STATUS OF LLMS
LLMs are sophisticated AI systems designed for natural language processing (NLP) tasks with the capability to generate coherent text by learning language patterns. Generally, LLMs are characterized by their large scale, often exceeding 10 billion parameters, which equips them to handle and generate complex language outputs. Fig. 1 illustrates the training process of an LLM, which involves several key steps: pretraining, followed by supervised fine-tuning, and then reinforcement learning with reward modeling. Once these stages are completed, the trained model is ready for deployment [6,17].
In the medical field, the extensive amount of treatment and clinical data provides a valuable foundation for developing specialized LLMs tailored to healthcare needs. As a result, these models are being increasingly applied across medical activities including clinical assistance, medical education, and research. Table 3 summarizes examples of LLM applications in medicine [6,13,14,18].
While LLMs hold considerable promise across fields such as paramedical care, medical education, data cleansing, and clinical research, significant challenges need to be addressed to ensure their successful integration [6,19]. For instance, IBM Watson, an early pioneer in AI, was trained with thousands of cues and classified responses to analyze patterns within large, complex datasets. Watson was capable of responding to natural language queries and was intended to serve as a therapeutic assistant that offers treatment recommendations based on extensive medical literature, including patient records, journals, textbooks, and drug datasets. However, IBM Watson faced substantial implementation and maintenance issues that ultimately limited its business impact. Difficulties arose from handling unstructured medical data through NLP, inconsistencies in training data due to regional variations, and the gap between the model’s capabilities and marketing expectations [10].
CHALLENGES OF APPLYING LLMS IN THE MEDICAL FIELD
Data Quality Issues
The medical field is generally well-suited for LLMs because its vast stores of patient records and treatment data are available in electronic form [6]. While structured data like lab results and vital signs can be efficiently processed in deep learning applications, unstructured text such as physician notes, admission summaries, and discharge summaries poses significant data quality challenges. Variability in documentation styles, terminology, and data entry practices can lead to inconsistencies that affect the accuracy and effectiveness of LLMs [6,10,16,20]. For example, as noted above for IBM Watson, a purely statistical approach often reinforces its pre-existing conclusions without incorporating meaningful insights from smaller, nuanced data subsets, which eventually leads to inaccuracies [10]. Also, if LLMs are trained on biased or inaccurate data, they may produce skewed responses on sensitive topics such as discrimination, politics, and gender, which could reinforce misinformation, a phenomenon often referred to as AI bias. Furthermore, LLMs may experience AI hallucinations, where they confidently provide incorrect answers to questions outside their training scope. Both bias and hallucinations may introduce a distorted perspective in the AI’s responses and potentially mislead users [1,6,13-15,18,21].
Data Privacy and Security
Training of LLMs requires vast datasets that include sensitive patient information such as medical records, genomic data, and personal identifiers protected under the Health Insurance Portability and Accountability Act, along with licensed third-party, public, and synthetic data. This extensive use of data raises security risks, including breaches, unauthorized access, data misuse, and intellectual property infringement [6,13,16-20,22,23]. Moreover, because LLMs use complex neural networks to derive insights through data integration and machine learning, the internal decision-making processes are often opaque, which complicates the verification of their responses. This lack of transparency can impede trust in AI-generated recommendations, as the inability to fully understand or explain these decisions raises concerns about the reliability of AI in clinical settings [1,5,6,21]. As shown in Fig. 2, a survey of 2,770 participants globally indicated that a significant percentage of respondents had “high” or “very high” concerns about data management and security in generative AI. These concerns align with those voiced by professionals and business leaders regarding data security in the generative AI field [16,23].
Other Challenges
Additional challenges further complicate LLM integration in healthcare. First, there is a strong consensus (78% of respondents) supporting the need for governments and organizations to establish regulatory standards to guide the development and adoption of generative AI. However, such regulatory frameworks could introduce uncertainties in terms of policy directions and enforcement practices, potentially affecting the adaptability and innovation of these technologies [16]. Second, the lack of standardized frameworks for defining and measuring the impact of generative AI hampers the ability to evaluate its effectiveness accurately. To sustain interest and investment, clear evaluation criteria are essential for assessing the value of generative AI applications [16]. Third, the computational demands of LLMs add another layer of complexity, as they require high-performance hardware with multiple parallel processing units and extensive memory capacity for data processing and model training. These requirements can make the implementation of LLMs costly, presenting barriers to their wide-scale adoption [1,13]. Finally, because LLMs produce sophisticated outputs, they may evade plagiarism detection and raise ethical concerns. An over-reliance on AI-generated content may discourage independent analysis and creativity, as users may become overly dependent on these models rather than fostering critical thinking skills [14].
These combined factors underscore the need for balanced regulatory oversight, rigorous impact assessment, robust computational infrastructure, and a cautious approach to LLM usage to ensure ethical, effective, and sustainable adoption in the healthcare field.
APPLICABILITY AND UTILITY OF THE LLMS IN UROLOGY
LLMs are increasingly utilized in urology where they serve as tools for both clinical support and education. A variety of urology-specific LLMs and general-purpose models have shown promising results in improving diagnostic accuracy, educational efficacy, and patient engagement within this specialty.
Uro_Chat
Uro_Chat, a generative AI model based on GPT-3.5-Turbo, is trained with information from the European Urology Guidelines and has demonstrated high accuracy in answering In-Service Assessment questions from the European Board of Urology (EBU). Uro_Chat was tested with a mix of oncology and nononcology questions, and the model demonstrated stronger performance on oncology-specific questions. It achieved a correct answer rate of 80% on oncology questions versus 52.3% on nononcology questions in the first round. In the second round, it scored 74.3% on oncology questions and 55.4% on nononcology questions, while in the third round, it maintained a score of 74.3% for oncology compared to 56.9% for nononcology questions. These results suggest Uro_Chat’s potential as an educational tool for trainees and a decision-support resource for specialists. The model’s utility is enhanced when it is continually trained on accurate, targeted data, which enables personalized consultations based on specific patient symptoms and clinical history [17,24].
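The per-round figures above can be condensed into a single comparison. The short Python sketch below averages the three reported rounds for each question type; it assumes the rounds are equally weighted (the text does not give the number of questions per round), so the means are illustrative rather than results reported by the cited studies.

```python
# Correct-answer rates (%) per round, as quoted in the paragraph above.
oncology = [80.0, 74.3, 74.3]
non_oncology = [52.3, 55.4, 56.9]


def mean(values):
    """Unweighted arithmetic mean (assumption: each round counts equally)."""
    return sum(values) / len(values)


onc_mean = mean(oncology)
non_onc_mean = mean(non_oncology)
gap = onc_mean - non_onc_mean  # difference in percentage points

print(f"oncology mean:     {onc_mean:.1f}%")
print(f"non-oncology mean: {non_onc_mean:.1f}%")
print(f"gap:               {gap:.1f} percentage points")
```

Under this assumption, the oncology advantage is roughly 21 percentage points across rounds, which is consistent with the observation that the model performs better on oncology-specific questions.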
ChatGPT
Chat generative pre-trained transformer (ChatGPT)-3.5 has shown impressive results in the medical field, notably by demonstrating the reasoning skills needed to pass the United States Medical Licensing Examination [17]. In urology, ChatGPT offers substantial support for administrative tasks such as clinical note-taking, chart organization, and discharge summaries, optimizing practice efficiency and allowing urologists to devote more time to patient care. It also assists specialists by summarizing clinical histories, generating differential diagnoses, recommending preliminary tests, and documenting consultations. While it does not replace clinical expertise, ChatGPT provides evidence-based recommendations aligned with the patient’s symptoms.
A notable advantage of ChatGPT is its multilingual capability, with support for languages such as English, Korean, Chinese, and Spanish. Thus, it can broaden access to medically accurate knowledge globally. In test scenarios related to urinary stone diagnosis and treatment, ChatGPT demonstrated a 95% accuracy rate, suggesting that patients and their family members can use it to inquire about urological conditions, treatments, and medications. This functionality improves patient understanding and fosters more informed, active engagement during face-to-face consultations [25-27].
UroBot-4o
UroBot-4o is a urology-specific chatbot developed in collaboration with the German Cancer Research Center (Deutsches Krebsforschungszentrum, DKFZ) and the University Hospital Mannheim. It is based on OpenAI’s GPT-4o model. By incorporating the latest 2023 European Association of Urology guidelines, UroBot-4o enhances its accuracy with a Retrieval-Augmented Generation (RAG) method that draws information from trusted external databases. This hybrid approach enabled UroBot-4o to achieve an average correct answer rate of 88.4% on 200 In-Service Assessment questions from the EBU. It outperformed the GPT-4o model by 10.8% and far exceeded the average performance of practicing urologists, which was 68.7%. These results indicate UroBot-4o’s potential as a reliable clinical support tool, as it can effectively address complex urological inquiries [28].
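To make the RAG idea concrete, the following minimal Python sketch shows the two steps it involves: retrieving the most relevant passage for a question, then placing that passage in the prompt handed to a language model. It is a generic illustration, not UroBot-4o’s implementation; the passages are placeholders rather than guideline text, the bag-of-words retriever stands in for whatever retrieval system a real deployment would use, and the names `GUIDELINE_PASSAGES`, `retrieve`, and `build_prompt` are hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical placeholder passages; NOT quotations from any real guideline.
GUIDELINE_PASSAGES = [
    "Placeholder passage A about kidney stone imaging and follow-up.",
    "Placeholder passage B about prostate cancer screening intervals.",
    "Placeholder passage C about bladder training and hydration habits.",
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(question, passages, k=1):
    """Return the k passages most similar to the question."""
    q = Counter(tokenize(question))
    ranked = sorted(
        passages, key=lambda p: cosine(q, Counter(tokenize(p))), reverse=True
    )
    return ranked[:k]


def build_prompt(question, passages, k=1):
    """Assemble a prompt that grounds the model in the retrieved context."""
    context = "\n".join(f"- {p}" for p in retrieve(question, passages, k))
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {question}\n"
    )


question = "How often should prostate cancer screening be repeated?"
prompt = build_prompt(question, GUIDELINE_PASSAGES)
print(prompt)  # this string would then be sent to the language model
```

The design point is that the model answers from retrieved, trusted text rather than from its parameters alone, which is why the quality and currency of the external source largely determine the reliability of the answer.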
Pca-MLM
Pca-MLM is a domain-specific LLM developed for research in prostate cancer diagnostics. In contrast to broader models like GPT or those tailored to general medical and biological fields such as MedPALM, MedAlpaca, and BioGPT, Pca-MLM employs self-supervised learning within a focused domain. With 124 million parameters, Pca-MLM uses a 12-layer masked self-attention decoder to process clinical records, radiology data, and pathology reports specific to prostate cancer. In trials, Pca-MLM outperformed similarly scaled models like GPT-2 and even BioGPT, which is three times larger, in accurately answering prostate cancer-related questions and handling masked clinical data. Moreover, Pca-MLM has shown a lower risk of the AI biases and hallucinations that are more common in general-purpose LLMs. This result underscores the benefits of smaller, specialized LLMs in high-risk clinical areas and suggests that targeted models can offer greater precision and safety [17,22].
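For readers unfamiliar with the term, the Python sketch below illustrates what "masked self-attention" means in a decoder: each token may attend only to itself and to earlier tokens. This is a generic, single-head illustration and not Pca-MLM’s code; for brevity it assumes no learned projections (queries, keys, and values are all the raw input vectors), and the function name `causal_self_attention` is hypothetical.

```python
import math


def softmax(row):
    """Numerically stable softmax; entries of -inf receive weight 0."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def causal_self_attention(x):
    """Single-head self-attention with a causal (look-back-only) mask.

    Assumption for brevity: Q = K = V = x, with no learned weight matrices.
    """
    n, d = len(x), len(x[0])
    weights = []
    for i in range(n):
        # Scaled dot-product scores; positions after i are masked out.
        scores = [
            sum(x[i][c] * x[j][c] for c in range(d)) / math.sqrt(d)
            if j <= i
            else float("-inf")
            for j in range(n)
        ]
        weights.append(softmax(scores))
    # Each output is the attention-weighted average of the visible tokens.
    outputs = [
        [sum(weights[i][j] * x[j][c] for j in range(n)) for c in range(d)]
        for i in range(n)
    ]
    return weights, outputs


tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # three toy token vectors
weights, outputs = causal_self_attention(tokens)
for row in weights:
    print([round(w, 3) for w in row])
```

The printed attention matrix is lower triangular: the first token attends only to itself, and no token places weight on a later position. A full decoder stacks several such layers (12 in the model described above) together with learned projections and feed-forward blocks.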
THE NEED TO INTRODUCE LLMS IN A METAVERSE FOR UROLOGY
The Need to Integrate Urology into the Metaverse Environment
Urology, especially in cases involving long-term management such as pediatric urology, demands consistent monitoring and lifestyle adjustments, including managing urination habits, hydration, and exercise [29]. These ongoing demands can impact patients’ quality of life, eventually affecting their self-esteem, social interactions, and body image. Even after treatment, patients often require continuous counseling and follow-up care to manage prognosis and prevent recurrence, which can be a physical and psychological strain for both patients and their families [22,30].
In this regard, a metaverse-based platform provides a promising solution by offering an immersive space where treatment, patient education, and engagement are enhanced. Integrating interactive 3-dimensional (3D) graphics and gamified elements in a metaverse environment could improve patient motivation, learning outcomes, and treatment adherence [31]. The utility of immersive environments in medical education has been demonstrated, such as in a 2021 surgical training session where over 200 cardiac and thoracic surgeons used head-mounted displays to participate in virtual training sessions alongside live operations at Seoul National University Bundang Hospital in South Korea. Applying similar experiences to urology could allow global collaboration among specialists, trainees, and patients in a fully interactive virtual setting [2,32].
Further studies underscore the potential impact of metaverse-based learning. Approximately half of patients typically struggle to recall or understand the information discussed during a medical visit, but cancer patients who receive 3D, 360° virtual reality representations of their anatomy report higher satisfaction and comprehension. Furthermore, the metaverse enables patients’ and clinicians’ avatars to interact across spatial and temporal divides, supporting virtual engagement in which 3D models aid in treatment planning. For example, 3D virtual models are used in robotic surgical systems to improve precision in procedures such as prostate and kidney surgeries by overlaying virtual anatomical data onto the patient’s body during surgery [23,27,32,33].
Effect of LLM Application in a Urologic Metaverse Environment
Patients with urological conditions, such as prostate cancer, often experience feelings of embarrassment and may hesitate to openly discuss their symptoms with physicians. This reluctance sometimes drives them to seek information online, where they risk encountering unreliable or inaccurate sources that could negatively affect their treatment outcomes [22]. In a metaverse environment, patients can use avatars to participate in social and medical interactions while preserving anonymity. This setting allows for private, controlled information sharing so that patients can consult with healthcare providers without the exposure of sensitive personal data. These interactions also generate urology-specific data, which can be utilized to develop specialized LLMs. Training LLMs on such targeted data can help reduce bias and mitigate AI hallucinations that often occur in broader, generalized models, thus enhancing their accuracy and clinical utility [3,22].
For example, in South Korea, Polytechnic University has implemented ChatGPT within a metaverse platform where patients interact with nonplayer characters to enhance learning. To minimize data bias and hallucinations, a RAG method is used to connect the AI model to reliable external sources like Wikipedia. This approach, also used by UroBot-4o, improves AI reliability by generating evidence-based responses [1,28]. Similarly, models like Pca-MLM, which are trained on specific, domain-focused data, reduce the likelihood of errors often associated with generalized LLMs and further improve accuracy for urology-specific applications [22].
Integrating LLMs into a metaverse environment offers urology patients on-demand access to essential information and facilitates education and training free from time and location constraints. This approach leverages the capabilities of virtual digital twin environments, where urologists can gather and analyze data from patient consultations, which can lead to personalized treatment and proactive disease management [1,3,33,34]. The metaverse also opens opportunities for knowledge exchange, as urologists can participate in virtual conferences, research collaborations, and educational initiatives, further enhancing collective expertise and feeding valuable data back into LLMs for continuous refinement. This synergy between LLMs and the metaverse promises to drive innovation and improve patient outcomes in the field of urology [35].
CONCLUSION
The metaverse has received significant investment across various sectors including healthcare, where it is being explored as a potential digital treatment platform to enhance interaction between medical professionals and patients. In parallel, generative AI, particularly LLMs, is being actively integrated into healthcare due to its capabilities in deep learning-based data analysis, image recognition, and NLP. LLMs, which analyze linguistic patterns to provide accurate responses to queries, are increasingly applied in urology to support tasks such as image diagnosis, data processing, medical education, and patient care.
The integration of the metaverse with LLMs marks a promising area of advancement as it merges immersive virtual environments with the capabilities of generative AI to enhance urologic treatment and counseling. This synergy is particularly beneficial in areas that require regular treatments, habit formation, and long-term management, as it opens new possibilities for improving patient engagement and adherence. In this regard, this paper reviews current applications and research on LLMs in the field of urology and addresses essential considerations for their safe integration into medical practice. Also, it highlights the potential and necessity of a metaverse-based digital platform to improve urologic care and facilitate robust information exchange with the aim of harnessing the full potential of this combined approach in future medical applications.
Notes
Grant/Fund Support
This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (NRF-2022R1F1A1066602) and by the Culture, Sports, and Tourism R&D Program through the Korea Creative Content Agency grant funded by the Ministry of Culture, Sports, and Tourism in 2023 (Project Name: Cultural Technology Specialist Training and Project for Metaverse Game, Project Number: RS-2023-00227648, Contribution Rate: 100%).
Conflict of Interest
No potential conflict of interest relevant to this article was reported.
AUTHOR CONTRIBUTION STATEMENT
· Conceptualization: HJP, JYK
· Data curation: HJP
· Formal analysis: HJP, EJK
· Funding acquisition: JYK
· Methodology: HJP, EJK, JYK
· Project administration: JYK
· Visualization: HJP, EJK
· Writing - original draft: HJP, EJK
· Writing - review & editing: HJP, EJK, JYK