Image retargeting aims to change the aspect ratio of an image while preserving its content and structure with minimal visual artifacts. Existing methods still produce many artifacts or fail to preserve the original content or structure. To address this, we introduce HALO, an end-to-end trainable solution for image retargeting. Since humans are more sensitive to distortions in salient areas than in non-salient areas of an image, HALO decomposes the input image into salient and non-salient layers and applies different warping fields to each layer. To further reduce structure distortion in the output images, we propose a perceptual structure similarity loss that measures the structural similarity between input and output images and aligns with human perception. Both quantitative results and a user study on the RetargetMe dataset show that HALO achieves state-of-the-art performance. Notably, our method obtains an 18.4% higher user preference than the baselines on average.
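As a rough illustration of the layered formulation described above (not the authors' actual implementation), a minimal PyTorch-style sketch might warp the image with two separate flow fields and alpha-composite the results using a soft saliency mask. The function names, the bilinear grid_sample warp, and the compositing scheme are assumptions, and for simplicity the output keeps the input resolution rather than a new aspect ratio.

import torch
import torch.nn.functional as F

def warp(image, flow):
    # image: (B, C, H, W); flow: (B, H, W, 2) offsets in normalized [-1, 1] coordinates.
    B, _, H, W = image.shape
    ys, xs = torch.meshgrid(
        torch.linspace(-1, 1, H), torch.linspace(-1, 1, W), indexing="ij")
    base_grid = torch.stack((xs, ys), dim=-1).to(image).expand(B, H, W, 2)
    return F.grid_sample(image, base_grid + flow, align_corners=True)

def retarget_layered(image, saliency, flow_salient, flow_background):
    # saliency: (B, 1, H, W) soft mask in [0, 1] separating salient from non-salient content.
    warped_salient = warp(image, flow_salient)          # salient layer gets its own warp field
    warped_background = warp(image, flow_background)    # non-salient layer can deform more freely
    warped_mask = warp(saliency, flow_salient)          # carry the mask along with the salient warp
    # Alpha-composite: salient content on top of the separately warped background.
    return warped_mask * warped_salient + (1.0 - warped_mask) * warped_background

In practice, the two flow fields and the saliency mask would be predicted by trainable networks and supervised end-to-end, e.g. with the perceptual structure similarity loss mentioned in the abstract.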
@article{xu2025_2504.03026,
  title   = {HALO: Human-Aligned End-to-end Image Retargeting with Layered Transformations},
  author  = {Yiran Xu and Siqi Xie and Zhuofang Li and Harris Shadmany and Yinxiao Li and Luciano Sbaiz and Miaosen Wang and Junjie Ke and Jose Lezama and Hang Qi and Han Zhang and Jesse Berent and Ming-Hsuan Yang and Irfan Essa and Jia-Bin Huang and Feng Yang},
  journal = {arXiv preprint arXiv:2504.03026},
  year    = {2025}
}