Large Language Models (LLMs) have significantly advanced natural language processing, demonstrating strong capabilities in tasks such as text generation, summarization, and reasoning. Recently, their potential for automating precise text editing tasks across specialized domains, such as programming code, LaTeX, and structured database languages, has gained attention. However, current state-of-the-art LLMs still struggle with executing precise, instruction-driven edits, particularly when structural accuracy and strict adherence to domain conventions are required. To address these challenges, we introduce InstrEditBench, an automated benchmark dataset comprising over 30,000 structured editing tasks spanning diverse domains, including Wikipedia articles, LaTeX documents, source code, and database languages. Using this benchmark, we develop FineEdit, a specialized editing model explicitly trained for accurate, context-aware text modifications. Experimental evaluations demonstrate that FineEdit outperforms state-of-the-art models, achieving improvements of approximately 10% over Gemini models on single-turn edits, up to 30% over Llama-3.2-3B, and exceeding Mistral-7B-OpenOrca performance by over 40% on direct editing tasks. FineEdit also effectively generalizes to realistic multi-turn editing scenarios, highlighting its practical applicability.
