ALF: Advertiser Large Foundation Model for Multi-Modal Advertiser Understanding

Abstract

We present ALF (Advertiser Large Foundation model), a multi-modal transformer architecture for understanding advertiser behavior and intent across text, image, video, and structured data. Through contrastive learning and multi-task optimization, ALF learns unified advertiser representations that capture both content and behavioral patterns. The model achieves state-of-the-art performance on critical tasks including fraud detection, policy violation identification, and advertiser similarity matching. In production deployment, ALF reduces false positives by 90% while maintaining 99.8% precision on abuse detection tasks. The architecture's effectiveness stems from its novel combination of multi-modal transformations, an inter-sample attention mechanism, spectrally normalized projections, and calibrated probabilistic outputs.
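
The abstract names contrastive learning but does not state the objective ALF uses. As a rough illustration only, the following is a minimal sketch of a generic InfoNCE-style contrastive loss, assuming each anchor embedding has one matching positive embedding, every other row in the batch acts as a negative, and all vectors are non-zero. The function names and the temperature value are hypothetical and are not taken from the paper.

```python
import math


def l2_normalize(v):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def info_nce_loss(anchors, positives, temperature=0.1):
    """Mean InfoNCE loss over a batch (illustrative, not ALF's objective).

    anchors[i] is paired with positives[i]; all other rows of
    `positives` serve as in-batch negatives for anchor i.
    """
    a = [l2_normalize(v) for v in anchors]
    p = [l2_normalize(v) for v in positives]
    total = 0.0
    for i, ai in enumerate(a):
        # Cosine similarity of anchor i to every candidate, scaled by temperature.
        logits = [sum(x * y for x, y in zip(ai, pj)) / temperature for pj in p]
        # Numerically stable log-sum-exp for the softmax denominator.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
        # Negative log-probability of picking the true positive.
        total += log_denominator - logits[i]
    return total / len(a)
```

Under these assumptions, a batch whose pairs are correctly aligned yields a loss near zero, while a batch whose pairs are swapped yields a much larger loss, which is the property a contrastive objective relies on to pull matching representations together.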

@article{rajagopalan2025_2504.18785,
  title={ALF: Advertiser Large Foundation Model for Multi-Modal Advertiser Understanding},
  author={Santosh Rajagopalan and Jonathan Vronsky and Songbai Yan and S. Alireza Golestaneh and Shubhra Chandra and Min Zhou},
  journal={arXiv preprint arXiv:2504.18785},
  year={2025}
}