UPenn Shows that High Schoolers Who Use AI Incorrectly Underperform in Math

IBL News | New York

A study from the University of Pennsylvania’s Wharton School found that high school students who used generative AI to prepare for math exams performed worse on the tests than those who did not use the tools.

UPenn’s report, which involved nearly a thousand students, found that access to generative AI tutors can improve student performance on practice math problems. However, students who simply copy and paste answers engage less with the material.

Experts say that the ideal scenario would be a personal tutor for every student, but AI-driven learning still faces many hurdles, and many educators have struggled to find the best ways to incorporate AI into the classroom.

“A key remaining question is how generative AI affects learning, namely, how humans acquire new skills as they perform tasks,” the researchers wrote.

This is UPenn’s explanation:

“In a field experiment, we deployed and evaluated two GPT-based tutors, one that mimics a standard ChatGPT interface (called GPT Base) and one with prompts designed to safeguard learning (called GPT Tutor).

These tutors comprise about 15% of the curriculum in each of the three grades. Consistent with prior work, our results show that access to GPT-4 significantly improves performance (48% improvement for GPT Base and 127% for GPT Tutor).

However, we additionally find that when access is subsequently taken away, students actually perform worse than those who never had access (17% reduction for GPT Base).

That is, access to GPT-4 can harm educational outcomes. These negative learning effects are largely mitigated by the safeguards included in GPT Tutor.

Our results suggest that students attempt to use GPT-4 as a “crutch” during practice problem sessions, and when successful, perform worse on their own. Thus, to maintain long-term productivity, we must be cautious when deploying generative AI to ensure humans continue to learn critical skills.”