Saw someone resurrected the /vsg/ thread, so I figured I’d dump some of the information here as well in case anyone is interested.
TTS:
- Tortoise TTS: https://github.com/neonbjb/tortoise-tts
- Voice cloning TTS for Bark: https://github.com/C0untFloyd/bark-gui
Voice Changer (based on RVC):
- Voice2Voice changer webui: https://github.com/Mangio621/Mangio-RVC-Fork
- Based on RVC webui: https://github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI
Resources:
- RVCv2 AI cover guide: https://docs.google.com/document/d/13_l1bd1Osgz7qlAZn-zhklCbHpVRk6bYOuAuB78qmsE/edit?usp=sharing
- Real-time voice changing using RVC models trained with the above (needs more documentation): https://github.com/w-okada/voice-changer
- How to create a new RVCv2 model on Google Colab, plus dataset creation: https://docs.google.com/document/d/13ebnzmeEBc6uzYCMt-QVFQk-whVrK4zw8k7_Lw3Bv_A/edit?usp=sharing
- Installing the RVCv2 webui fork locally: https://docs.google.com/document/d/1KKKE7hoyGXMw-Lg0JWx16R8xz3OfxADjwEYJTqzDO1k/edit?usp=sharing
- Complete list of publicly shared RVC voice models on Hugging Face: https://huggingface.co/juuxn/RVCModels/tree/main
Personal thoughts:
RVC is apparently the big thing right now, judging by the number of available models: https://docs.google.com/spreadsheets/d/1tAUaQrEHYgRsm1Lvrnj14HFHDwJWl0Bd9x0QePewNco/edit#gid=1227575351 Unfortunately, RVC is strictly a voice changer, so you either need to provide the source audio yourself or generate it with TTS if you want to hear your favorite anime girl whisper into your ears.
I’m a bit surprised by the lack of interest in TTS, but oh well, guess people nowadays are more into virtual idols singing songs and shit.
Here is a modified webUI for RVC that incorporates Microsoft’s Edge TTS: https://huggingface.co/spaces/ArkanDash/rvc-models-new Check the GitHub link for the repo.
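If you would rather generate the source audio yourself and then feed it into RVC by hand, something like this rough sketch should work. It assumes the third-party `edge-tts` Python package (`pip install edge-tts`) and network access. The `VOICES` table, `pick_voice`, and `synthesize` are names made up for this example, and the two voice names are assumptions, so check `edge-tts --list-voices` for what is actually available. This is not how the Space above does it internally, just one way to get a TTS clip to convert.

```python
import asyncio
from pathlib import Path

# Assumed voice names; verify with `edge-tts --list-voices`.
VOICES = {
    "ja": "ja-JP-NanamiNeural",
    "en": "en-US-AriaNeural",
}


def pick_voice(lang: str) -> str:
    """Return the Edge TTS voice configured for a language code."""
    if lang not in VOICES:
        raise ValueError(f"no voice configured for {lang!r}")
    return VOICES[lang]


async def synthesize(text: str, lang: str, out_path: str) -> Path:
    """Write a TTS clip to out_path; the result is the source audio for RVC."""
    import edge_tts  # third-party, imported lazily so the helpers work without it

    communicate = edge_tts.Communicate(text, pick_voice(lang))
    await communicate.save(out_path)  # Edge TTS output is MP3 audio
    return Path(out_path)


# Example call (needs network access):
# asyncio.run(synthesize("こんにちは", "ja", "source_ja.mp3"))
```

The saved file is plain TTS in a stock voice. Load it into the RVC webui as the input audio and pick your model there to get the converted voice.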
Here is my test run using the Shiroko model found in the Google Sheet:
Japanese https://shota.nu/amzdqmb6.wav
Japanese (-3 transpose) https://shota.nu/zlegqo1r.wav
English https://shota.nu/o8obck3a.wav
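For anyone wondering what the -3 transpose actually does to the pitch, here is a quick back-of-the-envelope sketch. It assumes the transpose setting is counted in semitones on the usual 12-tone equal temperament scale, so each step multiplies the frequency by 2^(1/12). The function names are made up for this example.

```python
def transpose_ratio(semitones: float) -> float:
    """Frequency multiplier for a pitch shift of the given number of semitones."""
    return 2.0 ** (semitones / 12.0)


def shifted_hz(f0_hz: float, semitones: float) -> float:
    """Pitch in Hz after shifting f0_hz by the given number of semitones."""
    return f0_hz * transpose_ratio(semitones)


# Print the multiplier for a few common transpose settings.
for st in (-12, -3, 0, 12):
    print(f"{st:+d} semitones -> x{transpose_ratio(st):.3f}")
```

Under that assumption, -3 lowers the pitch to roughly 84% of the original, and ±12 is a full octave down or up.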