In this paper, we study mid-cap companies, i.e. publicly traded companies
with less than US $10 billion in market capitalisation. Using a large
dataset of US mid-cap companies observed over 30 years, we look to predict
the default probability term structure over the medium term and understand
which data sources (i.e. fundamental, market or pricing data) contribute
most to the default risk. Whereas existing methods typically require that
data from different time periods are first aggregated and turned into
cross-sectional features, we frame the problem as a multi-label time-series
classification problem. We adapt transformer models, a state-of-the-art deep
learning model emanating from the natural language processing domain, to the
credit risk modelling setting. We also interpret the predictions of these
models using attention heat maps. To optimise the model further, we present
a custom loss function for multi-label classification and a novel
multi-channel architecture with differential training that gives the model
the ability to use all input data efficiently. Our results show the proposed
deep learning architecture's superior performance, resulting in a 13%
improvement in AUC (Area Under the receiver operating characteristic Curve)
over traditional models. We also demonstrate how to produce an importance
ranking for the different data sources and the temporal relationships using
a Shapley approach specific to these models.
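
To make the multi-label framing concrete, the sketch below shows one way a default term structure could be encoded as a label vector, with one binary label per prediction horizon. It is only an illustration under stated assumptions: the horizons, the function name `term_structure_labels`, and the cumulative "default within horizon" convention are hypothetical and are not taken from the model described here.

```python
from typing import List, Optional

# Hypothetical prediction horizons in months (an assumption for illustration only).
HORIZONS_MONTHS = [3, 6, 12, 24, 36]


def term_structure_labels(
    months_to_default: Optional[int],
    horizons: List[int] = HORIZONS_MONTHS,
) -> List[int]:
    """Return one binary label per horizon.

    A label is 1 if the company defaults within that horizon, else 0.
    `months_to_default` is None for a company with no observed default.
    """
    if months_to_default is None:
        return [0] * len(horizons)
    return [1 if months_to_default <= h else 0 for h in horizons]


if __name__ == "__main__":
    # A company that defaults 10 months after the observation date.
    print(term_structure_labels(10))
    # A company with no observed default.
    print(term_structure_labels(None))
```

Under this encoding, a classifier emits one probability per horizon, so the predicted vector can be read as a cumulative default probability term structure.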