User Misconceptions of LLM-Based Conversational Programming Assistants
Programming assistants powered by large language models (LLMs) have become widely available, with conversational assistants like ChatGPT particularly accessible to novice programmers. However, varied tool capabilities and inconsistent availability of extensions (web search, code execution, retrieval-augmented generation) create opportunities for user misconceptions that may lead to over-reliance, unproductive practices, or insufficient quality control. We characterize misconceptions that users of conversational LLM-based assistants may hold in programming contexts through a two-phase approach: first brainstorming and cataloging potential misconceptions, then conducting a qualitative analysis of Python-programming conversations from the WildChat dataset. We find evidence that users hold misplaced expectations about features such as web access, code execution, and non-text outputs. We also note the potential for deeper conceptual issues around the information required for debugging, validation, and optimization. Our findings reinforce the need for LLM-based tools to communicate their capabilities to users more clearly, and they empirically ground which aspects of those capabilities require clarification in programming contexts.