Abstract

ChatGPT is a conversational AI platform that can produce code to solve problems when provided with a natural language prompt. Prior work on similar AI models has shown that they perform well on typical intro-level Computer Science problems. However, little is known about the performance of such tools on Data Science (DS) problems. In this work, we assess the performance of ChatGPT on assignments from three DS courses of varying difficulty. First, we apply the raw assignment prompts provided to the students and find that ChatGPT performs well on assignments that include dataset descriptions and progressive question prompts, which divide the programming requirements into sub-problems. Then, we perform prompt engineering on the assignments on which ChatGPT performed poorly. We find that the following prompt engineering techniques significantly improve ChatGPT's performance: breaking down abstract questions into steps, breaking down steps into multiple prompts, providing descriptions of the dataset(s), including algorithmic details, adding explicit instructions to elicit specific actions, and removing extraneous information. Finally, we discuss how our findings suggest potential changes to the curriculum design of DS courses.
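For illustration, the following is a minimal sketch of two of these techniques, providing a dataset description up front and splitting an abstract task into sequential sub-prompts. It assumes the `openai` Python package; the model name, dataset description, and sub-prompts are hypothetical examples, not taken from the studied courses.

```python
# Minimal sketch of two prompt engineering techniques discussed above:
# (1) describe the dataset up front, (2) split the task into sub-prompts.
# Assumes the `openai` package; dataset and prompts are illustrative only.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Technique: provide a description of the dataset before asking for code.
dataset_description = (
    "The CSV file `penguins.csv` has columns: species (str), "
    "bill_length_mm (float), body_mass_g (float)."
)

# Technique: break the abstract task into concrete steps, one prompt each.
sub_prompts = [
    "Write Python code to load penguins.csv into a pandas DataFrame "
    "and drop rows with missing values.",
    "Extend the code to compute the mean body_mass_g per species.",
    "Extend the code to plot the per-species means as a bar chart "
    "with matplotlib.",
]

messages = [{"role": "system", "content": dataset_description}]
for prompt in sub_prompts:
    messages.append({"role": "user", "content": prompt})
    reply = client.chat.completions.create(
        model="gpt-3.5-turbo", messages=messages
    )
    answer = reply.choices[0].message.content
    # Keep the assistant's answer in the history so each sub-prompt
    # builds on the previous one, as in a ChatGPT conversation.
    messages.append({"role": "assistant", "content": answer})
    print(answer)
```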