Machine learning techniques play an important role in analyzing spectral data. The spectral data of fish biomass is useful in fish production, as it carries many important chemistry properties of fish meat. However, it is challenging for existing machine learning techniques to comprehensively discover hidden patterns from fish biomass spectral data since the spectral data often have a lot of noises while the training data are quite limited. To better analyze fish biomass spectral data, this paper models it as a symbolic regression problem and solves it by a linear genetic programming method with newly proposed tunable primitives. In the symbolic regression problem, linear genetic programming automatically synthesizes regression models based on the given primitives and training data. The tunable primitives further improve the approximation ability of the regression models by tuning their inherent coefficients. Our empirical results over ten fish biomass targets show that the proposed method improves the overall performance of fish biomass composition prediction. The synthesized regression models are compact and have good interpretability, which allow us to highlight useful features over the spectrum. Our further investigation also verifies the good generality of the proposed method across various spectral data treatments and other symbolic regression problems.
View on arXiv@article{huang2025_2505.21901, title={ Symbolically Regressing Fish Biomass Spectral Data: A Linear Genetic Programming Method with Tunable Primitives }, author={ Zhixing Huang and Bing Xue and Mengjie Zhang and Jeremy S. Ronney and Keith C. Gordon and Daniel P. Killeen }, journal={arXiv preprint arXiv:2505.21901}, year={ 2025 } }