Novel Development of LLM Driven mCODE Data Model for Improved Clinical
Trial Matching to Enable Standardization and Interoperability in Oncology
Research
Each year, the lack of efficient data standardization and interoperability in
cancer care contributes to a severe lack of timely and effective diagnosis
while steadily adding to the cost burden, with cancer costs nationally
reaching over $208 billion in 2023 alone. Traditional methods regarding
clinical trial enrollment and clinical care in oncology are often manual,
time-consuming, and lack a data-driven approach. This paper presents a novel
framework to streamline standardization, interoperability, and exchange of
cancer domains and enhance the integration of oncology-based EHRs across
disparate healthcare systems. This paper utilizes advanced LLMs and Computer
Engineering to streamline cancer clinical trials and discovery. By utilizing
FHIR's resource-based approach and LLM-generated mCODE profiles, we ensure
timely, accurate, and efficient sharing of patient information across
disparate healthcare systems. Our methodology involves transforming
unstructured patient treatment data, PDFs, free-text information, and
progress notes into enriched mCODE profiles, facilitating seamless
integration with our novel AI and ML-based clinical trial matching engine.
The results of this study show a significant improvement in data
standardization, with accuracy rates of our trained LLM peaking at over 92%
[...] patient data. Additionally, our LLM demonstrated an accuracy rate of
87% [...] SNOMED-CT, 90% [...] status quo, with LLMs such as GPT-4 and
Claude's 3.5 peaking at an average of 77% [...] and interoperability
framework, paving the way for more efficient and personalized cancer
treatment.
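
To make the target representation concrete, the following is a minimal sketch of what one structured output record might look like: a FHIR Condition resource that declares the mCODE Primary Cancer Condition profile and carries a SNOMED-CT code. It is an illustration under stated assumptions, not the paper's implementation. The helper name `build_primary_cancer_condition`, the patient identifier, and the example diagnosis are all hypothetical, and the resource omits elements (such as category and clinical status) that a complete, validated mCODE instance would need.

```python
import json

# Canonical URL of the mCODE Primary Cancer Condition profile (HL7 mCODE IG).
MCODE_PRIMARY_CANCER = (
    "http://hl7.org/fhir/us/mcode/StructureDefinition/mcode-primary-cancer-condition"
)
# Standard FHIR code system URI for SNOMED-CT.
SNOMED_SYSTEM = "http://snomed.info/sct"


def build_primary_cancer_condition(patient_id, snomed_code, display):
    """Hypothetical helper: wrap one extracted diagnosis in a FHIR Condition."""
    return {
        "resourceType": "Condition",
        "meta": {"profile": [MCODE_PRIMARY_CANCER]},
        "subject": {"reference": f"Patient/{patient_id}"},
        "code": {
            "coding": [
                {"system": SNOMED_SYSTEM, "code": snomed_code, "display": display}
            ]
        },
    }


# Illustrative values only; in the described pipeline an LLM would supply
# the code and display text extracted from unstructured notes.
condition = build_primary_cancer_condition(
    "example-patient", "254837009", "Malignant neoplasm of breast"
)
print(json.dumps(condition, indent=2))
```

Keeping the resource as a plain dictionary makes it easy to serialize to JSON for exchange; a production pipeline would presumably validate each resource against the mCODE profiles before sharing it.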