No.51
>What is DiffSinger
DiffSinger is an AI vocal synthesizer built for both speech and singing. In a nutshell, it's like a Synthesizer V AI voicebank, but user-generated and loadable in OpenUtau.
Examples of VBs which have received the DiffSinger treatment include Ruby, Namine Ritsu and Yokune Ruko.
>Why this thread?
There are very few places online to discuss DiffSinger, and not many people know how to use it, nuff said. I'm not an expert on DiffSinger either, but since this is a place to discuss all things Vocaloid, it doesn't hurt to make a support / troubleshooting thread here.
Useful Resources:
https://github.com/MoonInTheRiver/DiffSinger
https://github.com/xunmengshe/OpenUtau/releases
https://github.com/MLo7Ghinsan/DiffSinger_colab_notebook_MLo7
https://sites.google.com/view/haru0l/diffsinger (Voicebanks)
Tutorials:
https://www.youtube.com/watch?v=Sxt11TAflV0
https://www.youtube.com/watch?v=TKZvOLhBQWM (Spanish)
No.355
anons i found a training guide but im stuck.
https://docs.google.com/document/d/1uMsepxbdUW65PfIWL1pt2OM6ZKa5ybTTJOpZ733Ht6s
i was able to create the model but the only characters that work are AEIOU and they sound like a demon being summoned.
No.2018
Recently found out there's a wiki for DiffSinger and it links a bunch of useful resources, go check it out, it's pretty cool
https://diffsinger.miraheze.org/
No.2038
hard to get a diffsinger thread going when only 10% of the community knows it exists. hopefully menhera will give it more recognition if she ever makes it past the concept stage.
No.2052
Does anyone have Ashera Lyre's voicebank? I tried to get it from OP's voicebank link but the website for Ashera's VB is down.
No.2053
>>2052
She's discontinued, good luck finding an archived version because no one bothered to archive the page on the Wayback Machine
No.2119
Are there any English video tutorials on making DS banks? The first one is too vague
No.6016
Bumping just to ask if anybody knows a decent colab for training
No.6400
Bump. I need a good tutorial on making a voicebank, preferably with detailed steps.
No.6659
Does no one here know how to use diffsinger?
No.6671
>>6659
I know I don’t
I’ll learn when TORIKO DS releases

No.7773
Once again, we might need you guys
No.7774
>>7773
Anon... no one here knows how to make DS banks. Why do you think this thread is so dead compared to the others?
No.10028
>>7774
The thing with DiffSinger is that a lot of tutorials and information are gatekept on dicksword servers. People here either don't use dicksword or don't like joining public servers.
No.10029
>>10028
There's a guide linked ITT but it's kind of a pain in the ass to follow
No.14210
So now that we have 1 (one) person who knows how to use the damn thing can we revive the thread?
No.14237
Are we ever getting another LEEKA on DS?
No.14238
>>14237
Maybe Sena next year. A while back I made an RVC of her but it lacked data so it sounded off.
No.15252
Does anybody have an easy to understand tutorial on how this thing works and all the tools I need? The linked ones are all pretty vague + likely outdated and I'm not joining a dicksword server just to learn how to train a voicebank that I won't even release in the first place
No.15253
>>15252
Also if we can get a Rentry that'd be nice
No.15269
>>15252
there isnt. in order to make toriko i had to have someone walk me through the process step by step over text.
No.15284
>>15270
it is. if anyone needs help, i can try, but only if youre using difftrainer for local training. even then, my knowledge is limited. if you dont have a beefy computer, im sorry.
No.22579
How much would you have to record to have a DiffSinger that sounds close in quality to some of the biggest AI synths? I heard that ACE Studio recommends at least 30-100 minutes of audio and while DiffSinger isn't ACE I wonder if it still applies.
No.22582
>>22579
One of the guides I read said past 2 hours of data doesn't really do much anymore and can even worsen the quality, so I guess that's the cap
No.22583
>>22582
60m is already 1 hour and 100m is 1 hour and 40 minutes, so anything beyond that is probably too much.
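For anyone who wants to see where their own recordings land against these numbers, here is a rough Python sketch that totals the length of the WAV files in a folder. It only uses the standard library. The function names, the folder name in the usage note and the 120-minute cap (taken from the "2 hours" figure quoted above) are assumptions for illustration, not anything defined by DiffSinger.

```python
import wave
from pathlib import Path

# Assumed cap: the "past 2 hours doesn't help" figure quoted in the thread.
CAP_MINUTES = 120.0


def wav_seconds(path):
    """Return the duration of one PCM WAV file in seconds."""
    with wave.open(str(path), "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def dataset_minutes(folder):
    """Sum the duration of every .wav file under `folder`, in minutes."""
    total = sum(wav_seconds(p) for p in Path(folder).rglob("*.wav"))
    return total / 60.0


def describe(minutes, cap=CAP_MINUTES):
    """Format a minute count as hours and minutes and flag the assumed cap."""
    hours, rest = divmod(int(round(minutes)), 60)
    note = "over the assumed cap" if minutes > cap else "within the assumed cap"
    return f"{hours}h {rest:02d}m ({note})"
```

Something like `print(describe(dataset_minutes("my_recordings")))` would report the total for a hypothetical `my_recordings` folder. Only uncompressed PCM WAV files are counted, since that is what the `wave` module reads.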
No.22584
>>15284
How many GB of VRAM do you need for it?
No.22589
>>22579
Synth V supposedly uses 50m
No.22597
>>2154
Look how far we've come from the days of begging the DS support thread for help, lol
No.22598
>>22584
I would also like to know because my other option if I don't have enough is to pay Google and train via Colab
No.22603
>>22597
I think VIP is still the only person here who knows how to make a DiffSinger
No.22648
>>22584
>>22598
ive passed the question about vram on to someone who knows. if you are unable to train your diffsinger for whatever reason, i can help you out. i will try to write and/or record a guide for using difftrainer at some point in the future. i dont know when, it might take a while. i am a slow worker.
No.22659
>>22648
It's okay! Take as long as you need! As long as we eventually get one it would be super cool!

Even if it took like. a year or more lmao

No.22676
you need at least 6 gb of vram, and to set the batch size lower to train. the biggest bottleneck for a lot of people is ram apparently. i myself have 16 gigs and have had no problem training toriko, save for user error. the real thing you need to worry about is nvidia cuda compatibility. if you dont have cuda cores, youre fucked and have to train it via colab or ask someone else to train it locally themselves for you.
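A quick way to check a machine against that rule of thumb, as a rough Python sketch. The 6 GB threshold is just the figure from this post, not an official requirement, and the function names are made up for illustration. `detect_gpu` relies on PyTorch being installed (assumed here because local training needs a CUDA-enabled setup anyway); without it, the check simply reports no CUDA.

```python
MIN_VRAM_GB = 6.0  # figure quoted in the post above, not an official requirement


def training_advice(has_cuda, vram_gb):
    """Turn the rule of thumb from the post above into a recommendation."""
    if not has_cuda:
        return "no CUDA: use Colab or ask someone to train for you"
    if vram_gb < MIN_VRAM_GB:
        return "under 6 GB VRAM: use Colab or ask someone to train for you"
    return "local training looks possible: start with a low batch size"


def detect_gpu():
    """Return (has_cuda, vram_gb) using PyTorch if it is installed."""
    try:
        import torch
    except ImportError:
        return False, 0.0  # cannot tell without PyTorch; treated as no CUDA
    if not torch.cuda.is_available():
        return False, 0.0
    total_bytes = torch.cuda.get_device_properties(0).total_memory
    # Rounded to one decimal because cards often report slightly under
    # their advertised size.
    return True, round(total_bytes / 1024**3, 1)
```

Running `print(training_advice(*detect_gpu()))` gives the verdict for the first GPU. System RAM, which the post calls the bigger bottleneck for many people, is not checked here.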
No.24518
>>22676
I have 4gb vram so fuck me I guess. Colab is my only hope.
No.24606
I saw some pages recommending that sample songs don’t exceed 140bpm, probably to make sure there’s sufficient data in each phoneme. Does this hurt the model’s ability to sing fast songs though?
If I remember correctly TORIKO’s 1.1 update improved her ability to sing fast, and I wonder if that was due to increased high-BPM data or improved labeling.
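To put the tempo question in numbers, here is a small Python sketch of how short notes get as BPM goes up. It assumes one beat is a quarter note. Whether this is the actual reason behind the 140 BPM recommendation is only a guess, as said above.

```python
def note_seconds(bpm, beats=1.0):
    """Length in seconds of a note lasting `beats` quarter-note beats."""
    return 60.0 / bpm * beats


# A sixteenth note is a quarter of a beat.
for bpm in (100, 140, 180):
    sixteenth_ms = note_seconds(bpm, beats=0.25) * 1000
    print(f"{bpm} BPM: sixteenth note = {sixteenth_ms:.0f} ms")
```

At 140 BPM a sixteenth note lasts about 107 ms, and at 180 BPM about 83 ms, so every phoneme in a fast syllable gets only a small slice of audio to be labeled and learned from.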
No.25287
How long does it generally take you guys to render shit in DS?
No.25289
>>24606
Don't know about 1.1 but 1.2 can sing fast songs for the most part even if they exceed 140BPM.
No.25584
Hello, I am stupid, once you download a vocoder how do you set it in the diffsinger's dsconfig.yaml? do I just put the file name (and extension)?
please ELI5 I have 2 brain cells
No.25585
>>25584
Open the config and it should say something like nsf_hifigan. Just replace that with the name of the oudep you downloaded
No.25598
>>25584
open it in a text editor and just change the vocoder listed there
No.25599
>>25598
without the extension btw
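The same edit as a rough Python sketch, for anyone who would rather script it than open a text editor. It assumes the setting is a top-level line of the form `vocoder: nsf_hifigan` in dsconfig.yaml; the key name `vocoder`, the function name and the example package name `my_vocoder.oudep` are assumptions for illustration, so check them against your own config first.

```python
import re
from pathlib import Path


def set_vocoder(config_text, package_name):
    """Return config text with the top-level `vocoder:` value replaced."""
    # Drop the .oudep extension if the downloaded file name was passed in.
    if package_name.endswith(".oudep"):
        name = Path(package_name).stem
    else:
        name = package_name
    pattern = re.compile(r"^vocoder:[^\r\n]*", flags=re.MULTILINE)
    if not pattern.search(config_text):
        raise ValueError("no top-level 'vocoder:' line found")
    # Only the first matching line is rewritten; everything else is untouched.
    return pattern.sub(lambda match: f"vocoder: {name}", config_text, count=1)


sample = "phonemes: phonemes.txt\nvocoder: nsf_hifigan\n"
print(set_vocoder(sample, "my_vocoder.oudep"))
```

To apply it to a real file, read the text with `Path("dsconfig.yaml").read_text(encoding="utf-8")`, pass it through `set_vocoder`, and write the result back with `write_text`. Keep a backup copy of the original config before overwriting it.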
No.25950
>>25949
They make a label checker and name it Labubu, gg
No.25952
>>25949
The checker looks helpful. Maybe auto label generation is helpful to start, but I’d heavily verify the output. I know moresampler’s CV auto-oto is wonky and it should be using the same techniques.
No.25954
>>25949
wish there were instructions on how to use it, this and difftrainer