Graph-based Retrieval Augmented Generation for Dynamic Few-shot Text Classification

Text classification is a fundamental task in data mining, pivotal to applications such as tabular understanding and recommendation. Although neural network-based models such as CNNs and BERT have demonstrated remarkable performance in text classification, their effectiveness heavily relies on abundant labeled training data. This dependency makes them less effective in dynamic few-shot text classification, where labeled data is scarce and new target labels frequently appear as application needs evolve. Recently, large language models (LLMs) have shown promise due to their extensive pretraining and contextual understanding. Current approaches provide LLMs with text inputs, candidate labels, and additional side information (e.g., descriptions) to classify texts. However, their effectiveness is hindered by the increased input size and the noise introduced when processing side information. To address these limitations, we propose GORAG, a graph-based online retrieval-augmented generation framework for dynamic few-shot text classification. Rather than treating each input independently, GORAG constructs and maintains a weighted graph by extracting side information across all target texts. In this graph, text keywords and labels are represented as nodes, and edges indicate the correlations between them. To model these correlations, GORAG employs an edge weighting mechanism that prioritizes the importance and reliability of the extracted information, and it dynamically retrieves relevant context by computing a minimum-cost spanning tree tailored to each text input. Empirical evaluations demonstrate that GORAG outperforms existing approaches by providing more comprehensive and precise contextual information.
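As a concrete illustration of the pipeline the abstract describes, below is a minimal sketch using the networkx library. The function names (build_graph, retrieve_context), the inverse-count edge weighting, and the use of an approximate Steiner tree as the per-input minimum-cost tree are illustrative assumptions, not the authors' implementation; keyword extraction by the LLM is likewise not shown.

import networkx as nx
from networkx.algorithms.approximation import steiner_tree


def build_graph(corpus):
    """Build a weighted graph over extracted keywords and candidate labels.

    `corpus` is a list of (keywords, label) pairs, e.g. keywords extracted
    from each training text by an LLM (extraction itself is not shown).
    """
    G = nx.Graph()
    for keywords, label in corpus:
        G.add_node(label, kind="label")
        for kw in keywords:
            G.add_node(kw, kind="keyword")
            if G.has_edge(kw, label):
                G[kw][label]["count"] += 1
            else:
                G.add_edge(kw, label, count=1)
    # Lower weight = stronger correlation, so frequently co-occurring
    # keyword-label pairs are cheap to include in the retrieval tree.
    # This inverse-count rule is a placeholder for GORAG's actual edge
    # weighting mechanism.
    for u, v, data in G.edges(data=True):
        data["weight"] = 1.0 / data["count"]
    return G


def retrieve_context(G, input_keywords):
    """Retrieve a compact context subgraph for a single text input.

    Connecting the input's keywords with a minimum-cost tree (approximated
    here by a Steiner tree, since only the keywords are terminals) pulls in
    the labels and intermediate keywords most strongly tied to the input.
    """
    terminals = [kw for kw in input_keywords if kw in G]
    if len(terminals) < 2:
        return terminals
    # Restrict to one connected component; a sketch-level simplification.
    component = nx.node_connected_component(G, terminals[0])
    terminals = [t for t in terminals if t in component]
    if len(terminals) < 2:
        return terminals
    tree = steiner_tree(G.subgraph(component), terminals, weight="weight")
    return sorted(tree.nodes)


# Toy usage: index two labeled texts, then retrieve context for a new input.
corpus = [
    (["touchdown", "quarterback"], "sports"),
    (["ballot", "senate"], "politics"),
]
G = build_graph(corpus)
print(retrieve_context(G, ["touchdown", "quarterback"]))
# ['quarterback', 'sports', 'touchdown']

The retrieved nodes would then be serialized into the LLM prompt as contextual information. This mirrors the abstract's point: instead of appending every candidate label and its side information to each prompt, only the tree-selected nodes are supplied, keeping the input small and filtering out noisy context.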
@article{wang2025_2501.02844,
  title={Graph-based Retrieval Augmented Generation for Dynamic Few-shot Text Classification},
  author={Yubo Wang and Haoyang Li and Fei Teng and Lei Chen},
  journal={arXiv preprint arXiv:2501.02844},
  year={2025}
}