Quiz time. Which of these four learning environments involving the use — or non-use — of artificial intelligence results in the best academic outcomes for college students?
- AI-generated content and AI delivery
- AI-generated content and human delivery
- Human-generated content and human delivery
- Human-generated content and AI delivery
To find the answer, Emory Goizueta Business School Professor Rajiv Garg enlisted nearly six dozen undergraduate students across multiple disciplines for an in-class experiment. And what he discovered may surprise you and change your mind about the potential of AI in the classroom — for MBAs, undergraduate B-school students, and everyone else.
FINDINGS SHOW HUGE UPSIDE TO AI IN CLASSROOMS
Beginning last semester and continuing through last week, Garg, an associate professor of information systems and operations management at Emory Goizueta, conducted a pilot study comparing AI-generated courses and human-generated courses. First, he had a data sciences master’s graduate create and deliver an online course on the database query language SQL. Then he asked ChatGPT to create one as well. Finally, he had the graduate deliver the AI-created course, and an AI avatar deliver the human-created course.
In the end, there were four versions of the course: one that was pure AI, one that was pure human, one that was AI content with a human voice/avatar, and one that was human content with an AI voice/avatar.
Garg and his team ran the experiment with 40 undergraduate students last semester and another 30 this spring. For the courses that had AI-delivered content, the team created an avatar (via HeyGen) and fed it a script. Each group in Garg’s study completed the course and took a short exam at the end. The result: Garg and his team found that student performance was highest for the course that had human-generated content delivered by AI voice/avatar. The second best was a purely human-generated course, followed by a purely AI-generated course — and the worst was an AI-generated course with a human voice/avatar.
The main takeaway: Course content generated by people — experts and even those who are average instructors — still beats content generated by AI. But the content can be delivered more effectively by an avatar.
“I found that students achieved an average of 5.7% more points on quizzes after attending a purely human-generated and delivered course compared to students who attended a purely AI-generated and delivered course,” Garg tells Poets&Quants. “Furthermore, students who attended a hybrid human-generated and AI-delivered course gained, on average, 4.3 additional points compared to a pure human-generated and delivered course. Finally, students who attended the hybrid AI-generated and human-delivered course received, on average, 2.7 fewer points when compared to a purely AI-generated and delivered course.
“Thus, human-generated content is superior to AI-generated content for higher education, whereas AI-generated delivery — voice and avatar — can enhance students’ learning.”
It wasn’t the outcome he expected.
“For a hypothesis, I was unsure between the AI and human content as we are still exploring the quality of the generative AI, but I expected human delivery — voice and avatar — to perform better than AI voice and avatar. I was proved wrong — though I am not very surprised because my research also shows that voices could also play a role in information seeking behavior.”
QUIZ RESULTS IN RAJIV GARG’S STUDY
AI-generated content and AI delivery | 79.2%
AI-generated content and human delivery | 76.5%
Human-generated content and human delivery | 84.9%
Human-generated content and AI delivery | 89.2%
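The gaps Garg quotes can be checked against the table with a few lines of Python. This is only a sketch: the scores are the four averages from the table, the variable and function names are illustrative, and the two "average effect" figures are simple means of the cell averages, which assumes groups of equal size, something the article does not state.

```python
# Average quiz scores (percent) copied from the results table.
# Keys are (content source, delivery source).
scores = {
    ("AI", "AI"): 79.2,
    ("AI", "human"): 76.5,
    ("human", "human"): 84.9,
    ("human", "AI"): 89.2,
}


def gap(a, b):
    """Score of condition a minus condition b, in percentage points."""
    return round(scores[a] - scores[b], 1)


def mean(values):
    """Plain arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


# The three pairwise comparisons Garg describes.
human_vs_ai = gap(("human", "human"), ("AI", "AI"))     # pure human vs. pure AI
hybrid_gain = gap(("human", "AI"), ("human", "human"))  # AI delivery of human content
hybrid_loss = gap(("AI", "AI"), ("AI", "human"))        # human delivery of AI content

# Assumption: equal group sizes, so cell averages can be averaged directly.
content_effect = round(
    mean([s for (content, _), s in scores.items() if content == "human"])
    - mean([s for (content, _), s in scores.items() if content == "AI"]),
    1,
)
delivery_effect = round(
    mean([s for (_, delivery), s in scores.items() if delivery == "AI"])
    - mean([s for (_, delivery), s in scores.items() if delivery == "human"]),
    1,
)

print(f"Pure human vs. pure AI: {human_vs_ai} points")
print(f"Human content, AI vs. human delivery: {hybrid_gain} points")
print(f"AI content, AI vs. human delivery: {hybrid_loss} points")
print(f"Average edge for human content: {content_effect} points")
print(f"Average edge for AI delivery: {delivery_effect} points")
```

The first three figures reproduce the 5.7, 4.3, and 2.7 in Garg's quote, which shows that all three are differences in percentage points between table rows. The last two are a rough summary, not results reported by the study.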
Video examples from Garg’s research:
Pure-AI: https://www.youtube.com/watch?v=MpHXpjGoKVw
Pure-Human: https://www.youtube.com/watch?v=zB4FWKjIm-w
WATCHING AT 1.5X SPEED HELPED TO MODULATE THE AI-GENERATED VOICE
Why did B-school students learn best when the content of an online course was human-generated but delivered via AI voice and avatar? In part, Garg says, it has to do with a common practice among students in online programs — watching lectures at 1.5x speed — and what that does to the modulation of a teacher’s voice.
“Now, when you are watching content at 1.5x speed and your voice has modulation, it sounds worse,” Garg says. “But if your voice is flat, it’s easy to still comprehend. So I gave them the freedom that they could change the speed, but I did not think they would be doing it. So in the post-course survey, when I spoke with some students, I was like, ‘Why do you think this AI voice was better for your learning?’ And they said that they listened at 1.5x speed and it’s so easy to understand what it says. I was like, ‘I never thought about that before.’”
There are other wrinkles, he adds, among them the impact of accents on learning.
And while it’s possible — and expeditious — to simply use an avatar provided by a system like HeyGen’s, students respond better to an avatar created using footage of an existing teacher. The avatar can be made to match the expert’s look, mannerisms, and voice, including accent. This can be done in mere minutes — and the upsides are many; for one thing, faculty avatars can deliver content in any language — from Mandarin Chinese to French, from Arabic to Hebrew — to connect with more students in their native languages.
That’s just what Garg’s boss, Emory Goizueta Dean Gareth James, showed in a brief video message to faculty at the start of the spring 2024 semester. In it, James greeted professors and — demonstrating the potential of AI — was joined by his own avatar, an identical version of himself that could speak in perfectly fluent Chinese (or just about any other language).
James says AI has huge upside for faculty in terms of convenience, by helping them save time in keeping programs current and fresh with up-to-date class materials and content. It can also be invaluable in the customization of core courses — in giving students hyper-personalized learning experiences. “I think it’s going to continue to improve, but literally in a matter of months, not years,” he tells Poets&Quants. “So I could imagine in six months it’s looking even dramatically better than the stage that it’s at now — and so we as a school need to think about how we are going to use this.”
He agrees there will need to be guardrails.
“There are very concerning implications,” James says. “I was both excited and shocked at how easy it really was to just take two minutes of video of someone — we did it in our studio, but you don’t even have to have that high quality of video — and feed it into various software packages, and then you can just make the person say whatever you want. So suddenly all my faculty are going to get a 10% pay raise and everything! You can imagine that anything could happen. I know from a personal perspective, I certainly will no longer take video evidence of anything at face value. It’s clear that I don’t know what to take at face value anymore, but certainly that can’t be just assumed to be accurate.”
GARG: MORE STUDIES TO CONDUCT
There are those who believe we are approaching “peak AI” — that the tech’s lack of everyday utility will be its imminent undoing. Rajiv Garg does not agree.
Garg, who joined Emory in 2020 after more than a decade at the University of Texas at Austin’s McCombs School of Business, is a good salesman for AI: enthusiastic but also pragmatic. That fits an academic whose research includes digital marketing strategies for social and mobile commerce and the role of digital technologies in labor markets and entrepreneurship. Garg’s work has included helping various nonprofits and government organizations develop data-enabled digital strategies and policies.
So while he’s excited by the possibilities, he’s not unguarded in his praise.
“I think that the two biggest concerns (with AI) are both of a social nature,” he says. “Humans are social animals. We are not staying a social animal moving forward. If you are so immersed in these technologies and we only know to connect with people in this digital world, is it the same as my connection with people in real life? I mean, this is nonstop.
“People are becoming more lonely. Now, are we creating the kind of a setup where the moment you put on a Vision Pro, you are alive, and the moment you take it off, you are like, ‘I am alone in this space. I’m like a dead man walking’? The second part of that is our mental health. How does it impact our mental health in these situations? Are we becoming more stressed, more depressed, and maybe less creative?
“But if I am able to show that the human content is better than the AI, I think we need to start seeing, ‘Yes, we need to hire instructors to create this creative content. We need those people.’ Now, the value of the technology is a massive, large-scale delivery with personalized information for improved learning. When I learn, I could be more creative, and that should be the goal — not just that I have a checkbox on my resume that I took the SQL course and now give me a job to run SQL course.”
In the near future, Garg says, he has more studies to conduct, including a follow-up that will test a native English-speaking instructor and non-technical course materials. He says he plans to combine the results in one paper. “While I have explored differences in text across the courses,” he adds, “I haven’t explored the differences in voices yet.”
See the next page for a Q&A with Rajiv Garg about his experiments with AI in the classroom; and page 3 for a Q&A with Emory Goizueta Dean Gareth James about the potential — and pitfalls — for business schools in the use of AI.