Hybrid Machine Learning Model for Detecting Bangla Smishing Text Using BERT and Character-Level CNN

3 February 2025
Gazi Tanbhir
Md. Farhan Shahriyar
Khandker Shahed
Abdullah Md Raihan Chy
Md Al Adnan
Abstract

Smishing is a social engineering attack using SMS containing malicious content to deceive individuals into disclosing sensitive information or transferring money to cybercriminals. Smishing attacks have surged by 328%, posing a major threat to mobile users, with losses exceeding $54.2 million in 2019. Despite its growing prevalence, the issue remains significantly under-addressed. This paper presents a novel hybrid machine learning model for detecting Bangla smishing texts, combining Bidirectional Encoder Representations from Transformers (BERT) with Convolutional Neural Networks (CNNs) for enhanced character-level analysis. Our model addresses multi-class classification by distinguishing between Normal, Promotional, and Smishing SMS. Unlike traditional binary classification methods, our approach integrates BERT's contextual embeddings with CNN's character-level features, improving detection accuracy. Enhanced by an attention mechanism, the model effectively prioritizes crucial text segments. Our model achieves 98.47% accuracy, outperforming traditional classifiers, with high precision and recall in Smishing detection, and strong performance across all categories.
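
To make the fusion idea in the abstract concrete, here is a minimal, stdlib-only Python sketch. It is not the authors' implementation: the abstract does not give layer sizes, the character encoding, or where the attention mechanism sits, so everything below is an assumption. The token vectors are random stand-ins for BERT outputs, the weights are untrained, and the names `attention_pool`, `char_cnn_features`, and `classify` are hypothetical. It only illustrates the data flow: attention-weighted pooling of contextual vectors, a character-level 1D convolution with max-over-time pooling, concatenation of the two feature sets, and a softmax over the three classes.

```python
import math
import random

LABELS = ["Normal", "Promotional", "Smishing"]  # the three classes named in the abstract


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(token_vecs, query):
    """Dot-product attention: weight each token vector by its similarity to `query`."""
    scores = [sum(q * t for q, t in zip(query, vec)) for vec in token_vecs]
    weights = softmax(scores)
    dim = len(token_vecs[0])
    return [sum(w * vec[d] for w, vec in zip(weights, token_vecs)) for d in range(dim)]


def char_cnn_features(text, filters, width=3):
    """Character-level 1D convolution, ReLU, then max-over-time pooling per filter."""
    # Assumption: each character is encoded as one scaled code point (a real model
    # would use a learned character embedding instead).
    codes = [ord(c) / 0x10FFFF for c in text]
    if len(codes) < width:
        codes = codes + [0.0] * (width - len(codes))  # zero-pad short messages
    feats = []
    for f in filters:
        best = 0.0  # starting at 0 acts as the ReLU floor
        for i in range(len(codes) - width + 1):
            act = sum(w * x for w, x in zip(f, codes[i:i + width]))
            best = max(best, act)
        feats.append(best)
    return feats


def classify(context_vec, text, filters, weights, bias):
    """Concatenate contextual and character-level features, then apply a linear layer."""
    fused = list(context_vec) + char_cnn_features(text, filters)
    logits = [sum(w * x for w, x in zip(row, fused)) + b for row, b in zip(weights, bias)]
    probs = softmax(logits)
    return LABELS[probs.index(max(probs))], probs


# Usage with random (untrained) parameters, so the predicted label is meaningless.
rng = random.Random(0)
token_vecs = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(5)]  # stand-in for BERT outputs
query = [rng.uniform(-1, 1) for _ in range(4)]
filters = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(2)]
weights = [[rng.uniform(-1, 1) for _ in range(6)] for _ in range(3)]  # 4 context + 2 char features
bias = [0.0, 0.0, 0.0]

context_vec = attention_pool(token_vecs, query)
label, probs = classify(context_vec, "আপনার অ্যাকাউন্ট বন্ধ", filters, weights, bias)
print(label, probs)
```

In a trained system the stand-in vectors would come from a BERT encoder and all weights would be learned jointly; the sketch only shows how the two feature streams can be combined before the three-way decision.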

@article{tanbhir2025_2502.01518,
  title={Hybrid Machine Learning Model for Detecting Bangla Smishing Text Using BERT and Character-Level CNN},
  author={Gazi Tanbhir and Md. Farhan Shahriyar and Khandker Shahed and Abdullah Md Raihan Chy and Md Al Adnan},
  journal={arXiv preprint arXiv:2502.01518},
  year={2025}
}