The Impact of Large Language Models on Programming Education and Student Learning Outcomes
Abstract
Recent advances in Large Language Models (LLMs) such as ChatGPT and Copilot have led to their integration into various educational domains, including software development education. However, the effects of regular LLM use on the learning process remain under-researched; this paper aims to fill that gap. The paper explores the nuanced impact of informal LLM usage on undergraduate students’ learning outcomes in software development education, focusing on React applications. We designed an experiment involving thirty-two participants over ten weeks, in which we examined unrestricted, but not specifically encouraged, LLM use and its correlation with student performance. Our results reveal a significant negative correlation between reliance on LLMs for critical-thinking-intensive tasks, such as code generation and debugging, and final grades: students who relied more heavily on LLMs for these tasks earned lower grades. Furthermore, final grades trend downward as average LLM use across all tasks increases. However, the correlation between using LLMs to seek additional explanations and final grades was weaker, suggesting that LLMs may serve better as a supplementary learning tool. These findings highlight the importance of balancing LLM integration with the cultivation of independent problem-solving skills in programming education.