When AI Helps Students Learn & When It Doesn’t: An Emory Prof’s Groundbreaking Study

Emory Goizueta

Q&A WITH EMORY GOIZUETA PROFESSOR RAJIV GARG

Poets&Quants: This is some fascinating and fun and interesting stuff. Tell us about your experiment and why you began it. 

Rajiv Garg: There is all this talk about online courses and people keep saying universities are not going to survive the way they are and all that stuff, so we need to disrupt or use technology to discover how we educate people. So when I started off, the idea was: Can I do hyper-personalization with students? A lot of times, we are watching videos. So for example, in a Coursera video, I should know something about you as a student. I know your name, I know what industry you are in, and when I’m speaking with you, I should customize each lecture to you. Now, in a real-world setting, that is not possible. With AI, that is feasible.

So that’s what I was trying to get to, to hyper-personalization. Does it help improve the learning experiences of the students? But the first step that we took is: Is an AI-generated course or content beneficial for students in some ways? Can online courses replace in-person education?

I’m a little old-school. I think online is necessary for certain things for learning, but the in-person is also necessary for more interactive learning experiences.

So we started with the idea that I’m going to hire a person. I’m not going to have a super-expert create content for a class, because then I’m comparing an expert with AI, right? I want to know how AI compares with an average instructor.

So I found a student who graduated with a master’s degree, who had knowledge in the subject matter, which is SQL. And I said, “If the student creates a course on introduction to SQL, and I make AI create the same course, how would students’ learning be different across two courses?” I defined the learning as students’ performance in a test after every module. So every 10 minutes, we quizzed students in both the courses and we said, “Is your performance here different than the performance here?” So that’s how I started this whole thing.

So we had this one person with the master’s degree in data science and said, “You create an introduction to SQL course, total of 60 minutes, with six modules, 10 minutes each.” And so this person created that course. Then we asked the AI to create sort of a plan for a course that is about 60 minutes for introduction to SQL. Then for each module, we gave specific prompts like “Create so many slides to cover this module, then write the script on what you would talk about during this module.”

So we had ChatGPT create every single thing, from the slide content to what needs to be taught or spoken — the whole plan for the course. And the data sciences person did the whole thing herself: creating the slides, creating content, the speech on what she was going to talk about during each of these slides. And so then we had both of these courses. So the human is very easy. The person recorded on a camera, we removed the background from a green screen, we superimposed the person on a slide video. The person delivered six 10-minute modules on introduction to SQL with a quiz after every 10-minute module.

For AI, because now we had the script that ChatGPT created for every slide, we then trained a model with the video of an avatar, which looked like a person. And this person is now speaking with a person-like voice, but generated by AI. The slide graphics were created using AI, using DALL-E, Midjourney and Beautiful.ai. So we used only AI for this course.

We had no human creativity in the AI course. We had no AI creativity in the human course. We delivered both the courses and we said, “We’re going to figure out what is better.”

The one challenge in the beginning, before I executed for a larger population of students, is, what if the person that I hired has a little accent? Because she’s Chinese, originally from China — she got her higher education in America, so she can speak, but she still has that accent. Whereas the AI is not very good with the accent. I mean, we picked a Chinese avatar, as well, the same female Chinese instructor for both of them and said, “Look, I can’t have the perfect voice, but I need as close as possible to this person without the accent.” To make sure it’s not just the voice, that the content is also playing a role, we then created two more courses: One where the whole human material that we created, we now then created the delivery by AI; and then similarly, for the AI-generated content, we had the same person deliver it, so speak the whole thing.

So four different iterations, right?

Yes. So then we did a study. I recruited students across the university. I said, “It doesn’t have to be business school, it doesn’t have to be computer science. Anyone who doesn’t know SQL, I want to hire them.” I got about a hundred applicants, but we were able to do it for 70 students, which is still a decent number. And they all did it for a $10 Starbucks gift card. So each student got one hour for $10 Starbucks.

Students will do anything for a $10 Starbucks card.

So I’m happy, they were happy, right? And they were more excited because they’re learning something about SQL, which is a really popular tool, right?

So anyway, what I found from the students’ performance was that, you probably have guessed, in the human-generated course, the student performance is better than the AI-generated course, right? So if I were to say the grade on the quiz is out of 10 points, the human-generated course had on an average about two points higher grades for students compared to the AI-generated course.

So that’s for the pure-AI to pure-human. What was interesting was, the voice thing that we were talking about, that the human content with the AI voice actually had another 0.7 point score higher than pure human.

That was the most successful one?

That was the most successful. So human-generated content with the AI delivery is the most successful.

Wow. Did that surprise you? Did you expect something else?

I was thinking that the AI voice is a little bit more flat. So if you think about when we are talking, there’s certain words that have more emphasis. My tone changes from word to word, sentence to sentence. So my hypothesis that I started with was that the pure human will do better than human-plus-AI, and the human-plus-AI will be the next level. Then will be the AI-plus-human-voice and then the AI. That’s what I thought.

Human and AI was fine, but the human-plus-AI-voice was highest, and AI-plus-human-voice was the lowest.
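As a reading aid, the two-by-two design described above (who wrote the content, who delivered it) can be sketched in a few lines of Python. This is only an illustrative summary, not the study's analysis code: the names are hypothetical, the offsets are the approximate point differences quoted in the interview relative to the pure-AI course, and the AI-content, human-voice course is left without a number because the interview describes it only as the lowest.

```python
from itertools import product

# Two factors from the interview: who wrote the content, who delivered it.
CONTENT = ("human", "ai")
DELIVERY = ("human", "ai")

# Approximate quiz-score offsets (points out of 10) relative to the pure-AI
# course, as quoted in the interview. Keys are (content, delivery).
# The AI-content / human-voice course is described only as "the lowest",
# so its value is left as None rather than guessed.
OFFSET_VS_PURE_AI = {
    ("human", "human"): 2.0,      # pure human: about two points higher
    ("human", "ai"): 2.0 + 0.7,   # human content, AI voice: another 0.7
    ("ai", "ai"): 0.0,            # pure AI: the reference point
    ("ai", "human"): None,        # not quantified in the interview
}


def conditions():
    """Return all four (content, delivery) combinations of the 2x2 design."""
    return list(product(CONTENT, DELIVERY))


def ranked_known():
    """Rank the conditions that have a quoted offset, best first."""
    known = {k: v for k, v in OFFSET_VS_PURE_AI.items() if v is not None}
    return sorted(known, key=known.get, reverse=True)


if __name__ == "__main__":
    for content, delivery in ranked_known():
        offset = OFFSET_VS_PURE_AI[(content, delivery)]
        print(f"content={content:5s} delivery={delivery:5s} offset={offset:+.1f}")
```

Read this way, the human-written course delivered by the AI voice sits roughly 2.7 points above the pure-AI course on a 10-point quiz.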

So I started looking into why. I have research already on voice personalization. You talk to Alexa and Siri in a certain voice; in one of my research papers, we customized the voice and identified what voice works for what person.

So I then started looking into why a human voice was not as good. And one of the elements was the accent. For some people, their learning is reduced because of the accent to the voice. They needed something that is a little bit easier.

The second thing I learned is that some students, they actually watched the lectures on 1.5X speed. Now, when you are watching content at 1.5X speed and your voice has modulation, it sounds worse. But if your voice is flat, it’s easy to still comprehend.

You could adjust for that though, right?

Yeah. So I gave them the freedom to change the speed, but I did not think they would be doing it. So in the post-survey, when I spoke with some students, I was like, “Why do you think this AI voice was better for your learning?” And they said that they listened at the 1.5X speed and it was easy to understand what it says. I was like, “I never thought about that before.”

It’s obvious that this technology would save time for instructors, that you can deliver more customized content faster. This is pretty clear. But whether students can actually learn better with it, in the right circumstances, is really a central question, and it seems that you found that they can.

Yes. So I think the role of professors in the future is that they will evolve to be content creators. I mean, that’s what professors are known for. For the research — we’re just creating something new. So in future, professors can create the content, but the AI will be responsible for delivering everything.

When it comes to voices, now you can personalize voices for better learning. So there’s a certain voice that would be better for you versus for me for just learning experiences.

There were two limitations in the study so far. One, the accent in the voice compared to the non-accent voice. What I couldn’t capture is whether the modulation in the voice is helping with the learning part. The second: the content we covered for SQL is relatively technical content, right? So it actually precludes a lot of students who maybe are not technical, because I tried to recruit from all over the university. So right now, as we speak, we are developing a course that is a more general business course. It’s a technology strategy course — how do companies adopt technology, what do they adopt? The innovator’s dilemma. And so we are using those kinds of things to create a non-technical course, and I have hired a person who is going to create the content under my supervision. Because again, I don’t want to create the whole content — we do not want an expert creating the whole thing — so I have an MBA who helped create the content.

And then I hired sort of an actor who is going to deliver the content. So then I will have the same kind of voice, without the accent, for both the courses; but now in this case, the voice actor or the instructor can modulate the voice and we can see if that actually helps in changing the learning. And we are going back to the same sort of hypothesis, that the human voice with human content is better than the AI voice with human content.

I’m hoping I will have results for that study by the end of March. Because it takes time.

This is all in the English language?

Yes, in the English language.

As we saw from Dean Gareth James’ video, you can have the dean or anybody speak another language — Chinese, anything. A whole other world is, How does this Chinese sound to a native Chinese speaker?

That’s true. And once we establish this as a baseline, then we are opening up so many opportunities to explore in education innovation.

And knocking down one of the great barriers to education: language. That would be monumental.

That’s true, that’s true. Already a third project is in the works right now, but it’s in very early stages. Essentially, have you seen Apple’s Vision Pro? So what we are doing here is, these courses are delivered on a two-dimensional screen, right? Now, we are actually taking it to the next level because in the Vision Pro you can have these instructors in three dimensions. So we will be doing the same kind of thing, saying, “Look, if we have an AI-generated avatar in three dimensions versus a human in three dimensions, the avatar has smaller or more limited body movement whereas the human instructor will be more animated. How much of that impacts the learning of these students who are using Vision Pro for learning purposes?”

Do you have any concerns about this technology? Dean James has mentioned that there are always concerns when new tech comes along, and then you address them and you move on and you keep pushing and creating interesting new technology. What concerns would you say are foremost in your mind?

I think that the two biggest concerns are both of a social nature. Humans are social animals. We are not staying a social animal moving forward. If you are so immersed in these technologies and we only know to connect with people in this digital world, is it the same as my connection with people in real life? I mean, this is nonstop.

People are becoming more lonely. Now, are we creating the kind of a setup where the moment you put on a Vision Pro, you are alive, and the moment you take it off, you are like, “I am alone in this space. I’m like a dead man walking”?

That’s going to be the case for some people.

True, true. And so the second part is, the consequence of that is our mental health. How does it impact our mental health in these situations? Are we becoming more stressed, more depressed, and maybe less creative?

The key element is creativity. We can say AI is creativity, but it’s a creativity with no meaning right now. But for humans, creativity has some meaning. Our thoughts, they’re articulated in some way in that creativity. And that is something I worry about: Are we going to be losing that to some extent, if we are too immersed?

But if I am able to show that the human content is better than the AI, I think we need to start seeing, “Yes, we need to hire instructors to create this creative content. We need those people.” Now, the value of the technology is a massive, large-scale delivery with personalized information for improved learning. When I learn, I could be more creative, and that should be the goal — not just that I have a checkbox on my resume that I took the SQL course and now give me a job to run SQL course.