Enabling Population-Level Parallelism in Tree-Based Genetic Programming for GPU Acceleration

21 January 2025
Zhihong Wu
Lishuang Wang
Kebin Sun
Zhuozhao Li
Ran Cheng
Main: 12 pages
Bibliography: 3 pages
Appendix: 4 pages
13 figures
16 tables
Abstract

Tree-based Genetic Programming (TGP) is a widely used evolutionary algorithm for tasks such as symbolic regression, classification, and robotic control. Due to the intensive computational demands of running TGP, GPU acceleration is crucial for achieving scalable performance. However, efficient GPU-based execution of TGP remains challenging, primarily due to three core issues: (1) the structural heterogeneity of program individuals, (2) the complexity of integrating multiple levels of parallelism, and (3) the incompatibility between high-performance CUDA execution and flexible Python-based environments. To address these issues, we propose EvoGP, a high-performance framework tailored for GPU acceleration of TGP via population-level parallel execution. First, EvoGP introduces a tensorized representation that encodes variable-sized trees into fixed-shape, memory-aligned arrays, enabling uniform memory access and parallel computation across diverse individuals. Second, EvoGP adopts an adaptive parallelism strategy that dynamically combines intra- and inter-individual parallelism based on dataset size, ensuring high GPU utilization across a broad spectrum of tasks. Third, EvoGP embeds custom CUDA kernels into the PyTorch runtime, achieving seamless integration with Python-based environments such as Gym, MuJoCo, Brax, and Genesis. Experiments show that EvoGP attains a peak throughput exceeding $10^{11}$ GPops/s, with speedups of up to $528\times$ over GPU-based TGP implementations and $18\times$ over the fastest CPU-based libraries, while maintaining comparable accuracy and improved scalability across large population sizes. EvoGP is open source and accessible at: this https URL.
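
To make the idea of a tensorized representation concrete, the following is a minimal pure-Python sketch, not EvoGP's actual data layout or API. It assumes a prefix (pre-order) encoding, a hypothetical set of node-type codes, and a hypothetical `max_len` padding length; the abstract does not specify any of these. Each variable-sized tree becomes one fixed-length row, so a whole population forms rectangular arrays that could, in principle, be processed one row per GPU thread or block.

```python
# Hypothetical node-type codes for illustration; not taken from EvoGP.
PAD, CONST, VAR, ADD, SUB, MUL = range(6)
OPS = {"add": ADD, "sub": SUB, "mul": MUL}


def flatten(tree, types, values):
    """Append the nodes of a nested-tuple tree in prefix (pre-order) order."""
    kind = tree[0]
    if kind == "const":
        types.append(CONST)
        values.append(float(tree[1]))
    elif kind == "var":
        types.append(VAR)
        values.append(float(tree[1]))  # stores the input-variable index
    else:
        types.append(OPS[kind])
        values.append(0.0)  # operators carry no value
        flatten(tree[1], types, values)
        flatten(tree[2], types, values)


def encode_population(trees, max_len):
    """Encode trees of different sizes into rows of identical length."""
    type_rows, value_rows, sizes = [], [], []
    for tree in trees:
        types, values = [], []
        flatten(tree, types, values)
        if len(types) > max_len:
            raise ValueError("tree exceeds max_len")
        pad = max_len - len(types)
        type_rows.append(types + [PAD] * pad)
        value_rows.append(values + [0.0] * pad)
        sizes.append(len(types))
    return type_rows, value_rows, sizes


def evaluate(types, values, size, x):
    """Evaluate one encoded tree by scanning its prefix row right to left."""
    stack = []
    for i in range(size - 1, -1, -1):
        t = types[i]
        if t == CONST:
            stack.append(values[i])
        elif t == VAR:
            stack.append(x[int(values[i])])
        else:
            left = stack.pop()   # left operand is on top in a reverse scan
            right = stack.pop()
            if t == ADD:
                stack.append(left + right)
            elif t == SUB:
                stack.append(left - right)
            else:
                stack.append(left * right)
    return stack[0]


population = [
    ("add", ("var", 0), ("mul", ("const", 2.0), ("var", 1))),  # x0 + 2 * x1
    ("sub", ("var", 1), ("const", 1.0)),                       # x1 - 1
]
type_rows, value_rows, sizes = encode_population(population, max_len=8)
outputs = [
    evaluate(type_rows[i], value_rows[i], sizes[i], [3.0, 4.0])
    for i in range(len(population))
]
print(outputs)  # [11.0, 3.0]
```

Because every row has the same length, the evaluation loop is identical for all individuals regardless of tree shape; only the `size` entry differs. This uniformity is the property the abstract attributes to the tensorized representation, although the real framework implements it with CUDA kernels rather than Python lists.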
