AI in the Vidya Gaem Awards

beatstar
May 15, 2023
Generated by DALL-E 3 with human modifications (center placement of “>” symbol and text)

AI is the flavor of the moment, though the hype says little about the years of algorithms and machine learning that preceded it. Generative technologies like OpenAI’s ChatGPT, Midjourney, and ElevenLabs each have specific use cases, and some of those are applicable to our local 4chan community award show.

You might be wondering “but who cares?!” Well, about that…

The Vidya Gaem Awards, being an online award show hosted on one of the largest user-generated communities on the internet (4chan), knows a thing or two about algorithms and how they suck ass. 4chan itself does not rely on a scoring algorithm or “mixer” for recommendations.

Like anyone who grew up in the 2000s and early 2010s, our community is well familiar with text-to-speech synthesis and making robots say funny stuff. We’ve used basic text-to-speech voices in at least two of our shows (2015 and 2016) to announce categories and game titles. In this article, however, I will cover our use of contemporary generative content: the stuff that’s now considered “AI”.

Helping — not replacing — human effort

When confronted with work that nobody on our team wants to do, or that nobody has the ability to do, artificial intelligence finds itself in the in-between. We are not passing over people, cutting staff from our volunteer project in favor of AI, or training AI on our past work to create a model for how future texts (like flavor text or award speeches) are written.

Instead, the generative technology we use fills gaps in our content. Some examples include providing subtitles for past Vidya Gaem Awards presentations, recapping what won each year, and generating summaries of our award speeches.

The use of artificial intelligence is not without controversy. Used without restraint, it raises ethical issues, such as the need for disclosure and whether we should use it at all rather than asking our team (who are unpaid volunteers) to do the work instead. We ourselves cited unrestrained AI usage as the reason GTA Trilogy: The Definitive Edition won 2021’s “Biggest Technical Blunder”, and again when it won “Most Hated Game”. In both award speeches, we said something like: “An AI should never make management decisions.”

The choice to use AI is not merely to chase trends and let the robot do all of the thinking. It comes from a genuine desire to make our speeches and content easier to interpret and, when it makes sense, funnier and more satirical. Let’s take a look at five examples of how the Vidya Gaem Awards use AI in our day-to-day work.

ElevenLabs — for generative voices

ElevenLabs is both a text-to-speech service and an AI voice-cloning service. The 2022 Vidya Gaem Awards broadcast featured at least four separate uses of AI text-to-speech. The first was the Sweep Points Explainer, with an AI Barney (from Half-Life 2) explaining how our tabulation system works. Then there was the Global Acquisitions Group intro, where my own voice was cloned. There was also the “Paul Allen’s Award”, presented by an AI-generated voice of Dr. Kleiner (from Half-Life 2) for best box art of all time. Additional community-submitted skits used the service as well.

Midjourney — for “filler content”

Don’t worry, we only used it twice. These were “posters” generated by Midjourney for Scrimblo Bimblo, a fictional pretentious indie game.

One of our 2022 nominees, the Justin Roiland game High On Life, apparently used Midjourney and other tools to fill in several blanks in its assets. Given that the Vidya Gaem Awards are rooted in satire, it only made sense for someone like me to try my hand at it for one award. Specifically, I used two Midjourney-derived images for the “Pixels Are Ar10” award for Most Pretentious Indie Game: a poster for “Scrimblo Bimblo” (a fictitious indie game made in the style of Earthbound), and characters which resemble Sans, Frisk, and Flowey from Undertale.

As you can see, the output is fairly derivative and uninspired. I probably wouldn’t use it again, but the novelty of it was kinda cool I guess.

OpenAI ChatGPT — for informative descriptions

Summaries of our award speeches and some of our videos are generated by GPT-4. One example is the description of the “Redemption Arc Award” for biggest redemption of 2022, where Mick Gordon emerged as the winner.

On our supplemental video channel, Vidya Gaem Awards PLUS, we use ChatGPT (specifically, GPT-4) to summarize award speeches and describe video content and playlists, in addition to providing relevant tags for all our videos. This allows our videos to surface more easily in search and prevents overlap between descriptions and subtitles.

We decided to use ChatGPT to summarize speeches and describe video content because of the scale of our project (over 12 years of content) and the brainpower required to summarize hundreds of speeches back-to-back. While it sometimes takes multiple tries to get the right output, using AI instead of human effort (when it comes to filling out YouTube descriptions) saves countless weeks of time.

When providing a generated description, the AI is given the following information for its prompt:

  1. The award name and its descriptor (e.g., the Least Worst Award for least worst game of the year).
  2. The top 5 results of the award.
  3. The contents of the speech.
  4. Optionally, a visual description of what happens in the award, in case there is an award gag or something custom about the video.
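The inputs above can be sketched as a small prompt-builder. This is a minimal illustration, not our actual template; the function name, field labels, and wording are all hypothetical:

```python
# Illustrative sketch: assembling a description prompt from the four
# inputs listed above. The template wording here is hypothetical.

def build_prompt(award_name, descriptor, top_five, speech, visual_notes=None):
    """Build a summarization prompt for a single award video."""
    lines = [
        f"Award: {award_name} ({descriptor})",
        "Top 5 results:",
    ]
    lines += [f"{i}. {nominee}" for i, nominee in enumerate(top_five, start=1)]
    lines += ["", "Speech transcript:", speech]
    if visual_notes:  # only included when the video has a custom gag
        lines += ["", "Visual description:", visual_notes]
    lines += ["", "Write a short YouTube description summarizing this award."]
    return "\n".join(lines)
```

The assembled text would then be sent to the model as a single chat message, with the optional visual description only attached when a video warrants it.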

As of this article, our individual award videos from 2018–2022 each have a speech summary that describes the speech or video in some way. This provides more search value to our viewers in less time than simply restating the speech.

Adobe Podcast — for vocal enhancement

We have used Adobe Podcast to enhance the audio quality of the “Scrub of the Year Award”, which was a segment in the 2011 /v/GAs that featured The Best Gamers. The improvements over the original are significant. It doesn’t come without artifacting, but Adobe Enhance is a remarkable tool to salvage a recording with a “bad mic”.

Microsoft Azure Speech Studio/OpenAI Whisper — for captioning

The 2022 Vidya Gaem Awards are captioned using Azure Cognitive Services Speech-To-Text Captions.

Since YouTube did away with its old captioning system circa 2019 as part of the Material Redesign, editing captions became significantly more challenging. By that time, I had captioned four shows myself: 2011, 2017, 2018, and 2019. I made the decision to add captions to virtually all of our productions to increase the accessibility of our content and its visibility (in the sense that the spoken words are easily searchable). We use Microsoft Azure Speech Studio for offline captions.

Why not use YouTube’s auto-generated captions? For particularly long videos, such as the Vidya Gaem Awards broadcasts (which sometimes approach two hours in length), auto-generated subtitles aren’t available.

Captions are a great thing. They make the presentation readable for people who can’t hear or play sound (e.g., they’re in church, or at work). It’s important to remember, though, that the auto-generated captions made by Azure Cognitive Services have an error rate pretty similar to YouTube’s. Still, they’re at least a starting point to make improvements from.

Update 5/21/23: We have used OpenAI’s Whisper on our 2022 Award videos and noted a higher accuracy rate, so going forward we’ll probably use that model.
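As a rough sketch of how a Whisper transcription becomes an editable caption file: the openai-whisper package returns timed segments, which can be rendered as SubRip (.srt) blocks for manual cleanup. The helper below assumes Whisper’s segment shape (`{"start": seconds, "end": seconds, "text": ...}`); the function names and the rest are illustrative, not our actual pipeline:

```python
# Sketch: turning Whisper-style transcription segments into .srt caption
# blocks for manual cleanup. Segment shape follows openai-whisper's
# transcribe() output; everything else is illustrative.

def srt_timestamp(seconds):
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    ms = round(seconds * 1000)
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

def segments_to_srt(segments):
    """Render a list of timed segments as numbered SRT caption blocks."""
    blocks = []
    for i, seg in enumerate(segments, start=1):
        blocks.append(
            f"{i}\n{srt_timestamp(seg['start'])} --> "
            f"{srt_timestamp(seg['end'])}\n{seg['text'].strip()}\n"
        )
    return "\n".join(blocks)
```

The resulting .srt file is only a first draft; as noted above, the machine output still needs a human pass before upload.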

Things We Don’t Use AI On

Now that you have five in-depth examples of how we use AI, what are some things we don’t use AI for, and don’t plan to?

  • AI isn’t used to run the show or make decisions for us. Generative AI is still a nascent, ever-evolving tool that requires human prompting and judgment as to whether it’s worth incorporating into our production. Our leadership is accountable for its responsible use and inclusion in our show, and our intent is never to mislead others about when and where AI is being used.
  • We have not trained AI to mimic the style of our writing. Currently, when prompted, AIs fall back on the writing style of speeches given at conventional award shows (“I would like to thank the academy…”) rather than the perspective-driven style of our hosts reflecting on our voters’ taste.
  • We have not trained AI on nomination/voter data to predict a winner. Individual voter choices are one of the few things we consider “secret” throughout productions. They are handled in line with our privacy policy, and we do not disclose them to third parties. Aggregate voter choices are public via pairwise tables, but we haven’t trained AI on this either and don’t plan to.
  • We have not used AI to write our award speeches, or for nominee “flavor text” on voting pages. Human interaction is just too important for these, and the “knowledge cutoff” of these large language models would result in a high amount of hallucinations.

Hopefully, this article gave you better insight into how we use AI in the Vidya Gaem Awards, an award show made by the users of 4chan’s /v/ — Video Games board. We’ve been making this award show since 2011, and it’s been a blessing for me to grow up with this organization and see it evolve into what it is today. It is a community award show made first and foremost by volunteers, for the local enjoyment of people who browse and use /v/.

The text of this article was written entirely by a human.

Disclosure: I am the current Head of Outreach and the former Executive Producer of the Vidya Gaem Awards (2013–2015, 2017).
