Testing a game idea: an audio only game

checkker

Newbie
Apr 4, 2021
39
55
I have an idea for a game and I would like to see if that's something others would like too, or if it's just me wanting something that nobody wants. I would like to keep the technical feasibility out of the discussion. I am already very aware of the challenges to make such a game, and the current hardware limitations.

So basically, the idea is to make an audio-only game where you can talk to characters through your microphone as if they were real. There is a narrator with a different voice to describe the actions, and the game is like a sandbox, a bit like AI Dungeon. So, yes, it is using language models, text-to-speech and speech-to-text.

Except that the game is more discussion oriented. The narration doesn't go out of control and it's more about seduction, dirty talk, sex simulation, talking your way out of bad situations, etc.

You could create your own story by picking a personality for the girl, then picking your current relationship with her (stranger, couple, friend, etc.), then picking a situation or environment. Then the AI will do its job and generate the story.

First version would be mostly focused on one-on-one situations for simplicity, and mostly on the situations where consent to have sex is already initiated. Heavy focus on dirty talk.

Later versions would be more complex and more open-ended, where you can start as a stranger and create your story. There would be many world parameters affecting how difficult it is to get what you want, such as whether you want realistic reactions or want any girl you meet to be secretly craving sex. You could also edit personalities globally and create a world where all girls are mean, sweet, dominant or submissive.

The gameplay is extremely simple: you just need headphones, a mic, and a mouse or keyboard to push and hold one button, mainly to swap between what you say and state the actions you do.

P.S. Right now it's just an idea. Is that something that people here would want to play, or is it just me?
 
Nov 21, 2020
72
196
If you have played AI Dungeon, then you know that the AI sometimes tends to become weird. It doesn't necessarily switch the entire subject, but it's unpredictable unless given very specific instructions. This may work well with fantasy stories where most things go and unexpected situations are usually exciting and interesting, but if you instead consider the type of game you want then you'll quickly realize why it's not going to turn out very well.
If you want heavy focus on talking, you're going to meet the limitations of AI very quickly. Although our current AIs are advanced, they aren't quite good enough to create stories on their own. Not to mention you would need to actually train the AI which would take a ton of time if nothing else.
On top of that, the voice will sound robotic and it will ruin any immersion along with the inconsistencies in the story.
Hell, even AIs that you can talk to online can still give weird responses to your questions despite the massive amount of training they've undergone.

There's also the issue with speech recognition but that's not that big of a deal compared to the AI.
 
  • Like
Reactions: Hagatagar

Hagatagar

Well-Known Member
Oct 11, 2019
1,006
2,975
[...] where you can talk to characters through your microphone as if they were real. [...] Heavy focus on dirty talk.
To be honest, I would feel weird doing that. :unsure:

So, yes, it is using language models, text-to-speech and speech-to-text.
It might be nice at the beginning, but as PrinceOfShades already wrote, it might break the immersion.
It can also get dull pretty quickly.

P.S. Right now it's just an idea.
On the whole, it sounds like a huge project. Maintaining a good AI costs a lot of money, and doing it all reasonably is extremely time and resource intensive. For something like that you have to invest a lot.

Look at how complicated AI Dungeon and NovelAI are, and they only produce written AI content.
 
  • Like
Reactions: PrinceOfShades

checkker

Newbie
Apr 4, 2021
39
55
Except for @Hagatagar saying he would feel weird doing it, everything else was technical.

I am very aware of all the challenges and limitations. I may have dug into this much deeper than you think, and I have already realized that you need something better than Tacotron 2 for the voice, and that GPT is not really usable for this because it gets out of control. Please, let's keep the technical aspect out of the discussion.

What I really want to know is if this is a cool idea or a stupid, very niche idea of something nobody wants.
 
Nov 21, 2020
72
196
I personally wouldn't play it. Sound isn't something that interests me that much in adult games. It improves my experience if it comes along with visuals but a game solely based on verbal speech isn't my thing.
I'm sure there is a niche audience for it and some people would love the idea, but I'm not sure the time/effort/money investment is worth it unless it's a passion of yours.
 

anne O'nymous

I'm not grumpy, I'm just coded that way.
Modder
Donor
Respected User
Jun 10, 2017
10,302
15,172
Except for @Hagatagar saying he would feel weird doing it, everything else was technical.
No, speech recognition isn't a technical problem, it's a practical one; more than half of your players will not be native English speakers. You don't just have to deal with accents (technical), but also with poor pronunciation (practical). The worst is outright mispronunciation: when a word is shared by English and the native language, many struggle with the pronunciation and end up with something in between.

The same goes for the AI limitations, which are as much practical as technical, since they will lead to a lack of coherence and/or consistency in the story. That will in turn lead to players losing interest, because there isn't much to keep them hooked except the novelty of the concept.
 
  • Like
Reactions: PrinceOfShades

F4C430

Active Member
Dec 4, 2018
649
722
I would like to see you try it, if only because it's a step toward making adult gaming more accessible to people with disabilities. I personally have no desire to play such a game though, since I don't like talking to machines and I don't have audio privacy where I live.
 

checkker

Newbie
Apr 4, 2021
39
55
No, speech recognition isn't a technical problem, it's a practical one; more than half of your players will not be native English speakers. You don't just have to deal with accents (technical), but also with poor pronunciation (practical). The worst is outright mispronunciation: when a word is shared by English and the native language, many struggle with the pronunciation and end up with something in between.
I get that part. Alternatively, choices can be pre-computed and the player could either just say the number linked to the choice, or click on the choice (which would make the game audio-capable rather than audio-exclusive). Either way, that's a way to make it work with a single language. So, let's say it can be both text and audio, since the speech has to be converted to text anyway.
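
A minimal sketch of the "say the number linked to the choice" idea, assuming the speech-to-text step has already produced a transcript string. The names `NUMBER_WORDS` and `resolve_choice` are hypothetical, made up for illustration, and the word list only covers one to five:

```python
import re

# Hypothetical lookup for spoken number words; digits are handled separately.
NUMBER_WORDS = {"one": 1, "two": 2, "three": 3, "four": 4, "five": 5}

def resolve_choice(transcript, choices):
    """Return the choice named by the first usable number in the transcript, else None."""
    for token in re.findall(r"[a-z]+|\d+", transcript.lower()):
        number = int(token) if token.isdigit() else NUMBER_WORDS.get(token)
        if number is not None and 1 <= number <= len(choices):
            return choices[number - 1]
    return None  # nothing recognised: the game can ask again or wait for a click
```

Restricting recognition to a handful of number words is one way to sidestep accent and pronunciation problems, at the cost of the free-form feel.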

The same goes for the AI limitations, which are as much practical as technical, since they will lead to a lack of coherence and/or consistency in the story. That will in turn lead to players losing interest, because there isn't much to keep them hooked except the novelty of the concept.
There's a lot of paradigms here without information about the architecture, which models I plan to use, how stories are generated and contained, etc. I never said that it would be an AI dungeon clone, I just mentioned AI dungeon to explain that it would use AI and you would have a feeling of freedom, like you can just say anything you think about and the character will respond to that accordingly. How that could be achieved is a different matter.
 

checkker

Newbie
Apr 4, 2021
39
55
Just to address the robotic voice concern:

This is what you can expect today, or even better because the data used in this video is small and I have heard better synthetic voices.
 

checkker

Newbie
Apr 4, 2021
39
55
Here's another sample using a different model. It's not mine, but someone extracted the narrator voice from Darkest Dungeon and provided a sample of his text-to-speech:
 

anne O'nymous

I'm not grumpy, I'm just coded that way.
Modder
Donor
Respected User
Jun 10, 2017
10,302
15,172
There's a lot of paradigms here without information about the architecture, which models I plan to use, how stories are generated and contained, etc.
Are you sure that "paradigm" is the word you wanted to use?
I mean, part of my comment can be seen as an assumption, while the only part that can possibly be seen as a paradigm is that stories need to be coherent and have some consistency.


I never said that it would be an AI dungeon clone, [...]
Nor was it said by others.
PrinceOfShades's argument wasn't that AI Dungeon is bad, but that current AI technology is too limited for what you want to achieve. I assume that he used it as an example because you named it, and so you would have a better understanding of the problem he was addressing. As things stand, using an AI to generate the content, whether the whole story or just part of the dialog, would lead to the practical issue I named. And you explicitly said that both would be generated by an AI.
 
  • Like
Reactions: PrinceOfShades

checkker

Newbie
Apr 4, 2021
39
55
Are you sure that "paradigm" is the word you wanted to use?
The paradigm is that when AI is mentioned, people now think that it has to be "machine learning centric", as if AI automatically means that it's going to be a showcase of the latest BERT or GPT model.

That paradigm is fed by products like AI Dungeon and NovelAI, where the language model is the driver and code is built around it to fix the issues. If you want to generate text or stories, that's OK, but if the goal is to make a game AI or a game centered on AI, it's a huge mistake.

That's why I asked in my original post to please focus the discussion on the game concept and not its feasibility. All the technical arguments so far assume that I'm going to center the project on current ML models and their limitations, which is not the case.
 

anne O'nymous

I'm not grumpy, I'm just coded that way.
Modder
Donor
Respected User
Jun 10, 2017
10,302
15,172
The paradigm is that when AI is mentioned, people now think that it has to be "machine learning centric", as if AI automatically means that it's going to be a showcase of the latest BERT or GPT model.
So I was right. Using a neural-network AI is following a paradigm; basing an answer on the fact that you'll use a neural-network AI is, at best, an assumption.
By the way, I don't see how answering "modern AI still can't produce coherent stories with consistency" with "I won't use a modern AI" makes the argument any less significant. At best, you have also confused "Artificial Intelligence" with "decision tree".


That's why I asked in my original post to please focus the discussion on the game concept and not its feasibility.
And once again, that's what was done.
As I implied above, if you based your concept around a decision tree, then the issue would only be a technical one: how well, or not, you design said decision tree. But here you based your concept around an AI, which makes the concept itself faulty.
 
  • Like
Reactions: PrinceOfShades

checkker

Newbie
Apr 4, 2021
39
55
I would feel like a jackass dirty talking into the microphone to an AI lol.
Thanks for the honest reply.

As I implied above, if you based your concept around a decision tree, then the issue would only be a technical one: how well, or not, you design said decision tree. But here you based your concept around an AI, which makes the concept itself faulty.
It's not a decision tree. It's not built directly around an ML model. Why is it so hard to just let go? You want me to lay out in detail the architecture I have in mind? Not gonna happen.

Many times in my life I have built prototypes of solutions that all my colleagues, and even R&D at my job, said were impossible. Yet I did it. And some turned into an 8-figure business. Seriously, you just need to understand that we don't have the same information, and I am not going to share the technicalities at that level of detail. So, you can keep thinking you are right, with the little information you have.

Sorry, but you just sound like those people who kill others' ideas before real thought and strategy are even put to work.
 

F4C430

Active Member
Dec 4, 2018
649
722
I wonder if you'd get more interest in this if you went mobile. It's far more natural for people to talk to mobile devices, plus they could take it somewhere that might have more privacy. Even if a third party did overhear the conversation, it's possible to be mistaken for a conversation with a real person which may put some users more at ease.
 
  • Thinking Face
Reactions: Hagatagar

checkker

Newbie
Apr 4, 2021
39
55
I wonder if you'd get more interest in this if you went mobile. It's far more natural for people to talk to mobile devices, plus they could take it somewhere that might have more privacy. Even if a third party did overhear the conversation, it's possible to be mistaken for a conversation with a real person which may put some users more at ease.
Not a bad idea, but it would require a server because phones are a bit too slow for state-of-the-art TTS and STT. The question is: how comfortable would you be being connected to a server while doing such a thing? My intuition tells me that it wouldn't work for that reason.

Actually, the audio game is more a milestone to make something else which would require even more time: a character AI framework. I am actually trying to figure out if the character AI, limited to a specific scope, could become a standalone game. Slice the project into smaller chunks.

The main reason for the audio is immersion. You get to say what you want (not restricted by choices) and it's faster than writing. The game pace can be much faster because you don't have to read multiple options and you can react quickly. Once you can do that, it's possible to add time and pauses as a factor during the game: "Why aren't you answering me?" for example. Pausing or refusing to answer actually means something. The character can be given the ability to interrupt you when you monologue, if they don't like what you said. Instead of analysing what you said only after you stop talking, the game can do it with streaming speech-to-text, so the character can react quickly instead of leaving that creepy 3-5 second wait after you say something.
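
A rough sketch of how silence and interruption could be treated as input, assuming a streaming speech-to-text layer that exposes a partial transcript while the player is still talking. `TurnState`, `next_reaction`, the `dislikes` word set and the 6-second threshold are all hypothetical placeholders, not values from any real engine:

```python
from dataclasses import dataclass

@dataclass
class TurnState:
    seconds_since_prompt: float  # time since the character stopped talking
    partial_transcript: str      # what streaming speech-to-text has heard so far

# Assumed threshold; a real value would need playtesting.
SILENCE_LIMIT = 6.0

def next_reaction(state, dislikes):
    """Decide what the character does right now, before the player finishes."""
    words = state.partial_transcript.lower().split()
    if not words:
        # Silence is meaningful: past the limit, the character calls it out.
        if state.seconds_since_prompt >= SILENCE_LIMIT:
            return "prompt_player"
        return "wait"
    if any(word in dislikes for word in words):
        return "interrupt"  # cut in mid-sentence instead of waiting for the end
    return "listen"
```

Calling this on every partial transcript update, rather than once per finished utterance, is what would let the character react during the player's turn.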

There is a way to control what the character says to send the story in a specific direction. There is the intent of the character, which can be overridden by the intent of the game master. This could be set by a pace metric: how long can you chit-chat before the game AI starts working toward the goal? The game master is the AI controlling the story. It has multiple goals or milestones that must be reached, and it will manipulate the conversation to ensure that these goals are reached. A character can become angry or sad, or try to trick you, to reach the goal. It's like making a linear story that you can play multiple times and that feels different each time, yet you are still directed somewhere. The game master already knows the end, so it's trying to solve the current situation to get you there.
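
As a toy illustration of the override, assuming the pace metric is simply a count of small-talk turns. `choose_intent` and its parameters are hypothetical names, and the intents are plain strings here:

```python
def choose_intent(character_intent, master_intent, small_talk_turns, pace_limit):
    """Let the game master take over once small talk has run past the pace limit."""
    if small_talk_turns >= pace_limit:
        return master_intent   # steer the conversation toward the next milestone
    return character_intent    # otherwise the character follows its own intent
```

For example, `choose_intent("tease", "mention the amulet", 5, 4)` would hand control to the game master, while the same call with 2 turns would leave the character's own intent in place.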

The goals can be soft or hard. Soft goals are allowed to fail if the player doesn't really want to go with the suggested flow. Hard goals are the ones that can't fail, for example if you absolutely need something to happen before reaching the next chapter: someone must die, or someone must take the amulet. For those, you have a backup plan to force it to happen (she stole it when you didn't look).
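
The soft/hard split could be sketched like this, assuming each hard goal carries a scripted fallback event. `Goal` and `close_chapter` are hypothetical names for illustration only:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Goal:
    name: str
    hard: bool = False              # hard goals must happen before the next chapter
    fallback: Optional[str] = None  # scripted event used to force a hard goal
    reached: bool = False

def close_chapter(goals):
    """Return the scripted events that must play before the next chapter starts."""
    forced = []
    for goal in goals:
        if goal.reached:
            continue
        if goal.hard:
            if goal.fallback is None:
                raise ValueError(f"hard goal {goal.name!r} has no fallback")
            forced.append(goal.fallback)
            goal.reached = True
        # Unreached soft goals are simply dropped: the player chose another path.
    return forced
```

With a hard goal "amulet taken" whose fallback is "she stole it when you didn't look", closing the chapter without the player having handed it over would return that one scripted event.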

That character AI framework could fit in a scripted game like a modular block or a plugin, or you could build an entire game around it.
 

Hagatagar

Well-Known Member
Oct 11, 2019
1,006
2,975
I wonder if you'd get more interest in this if you went mobile. It's far more natural for people to talk to mobile devices, plus they could take it somewhere that might have more privacy. Even if a third party did overhear the conversation, it's possible to be mistaken for a conversation with a real person which may put some users more at ease.
This means iPhone users can finally get dirty with (fake)Siri. :sneaky:
 

F4C430

Active Member
Dec 4, 2018
649
722
Not a bad idea, but it would require a server because phones are a bit too slow for state-of-the-art TTS and STT. The question is: how comfortable would you be being connected to a server while doing such a thing? My intuition tells me that it wouldn't work for that reason.
I agree that needing to be connected online is a big deal breaker. I wasn't aware of the tech requirements though.