
/leek/ - Vocaloid Lounge

Allegro, agitate




 No.51

>What is DiffSinger
DiffSinger is an AI vocal synthesizer built for both speech and singing. In a nutshell, it's like a Synthesizer V AI voicebank, except it's user-made and can be loaded in OpenUtau.

Examples of VBs which have received the DiffSinger treatment include Ruby, Namine Ritsu and Yokune Ruko.

>Why this thread?

There are very few places online to discuss DiffSinger, and not many people know how to use it, nuff said. I'm not an expert on DiffSinger either, but since this is a place to discuss all things Vocaloid, it doesn't hurt to make a support / troubleshooting thread here.

Useful Resources:
https://github.com/MoonInTheRiver/DiffSinger
https://github.com/xunmengshe/OpenUtau/releases
https://github.com/MLo7Ghinsan/DiffSinger_colab_notebook_MLo7
https://sites.google.com/view/haru0l/diffsinger (Voicebanks)

Tutorials:
https://www.youtube.com/watch?v=Sxt11TAflV0
https://www.youtube.com/watch?v=TKZvOLhBQWM (Spanish)

 No.355

anons i found a training guide but im stuck.
https://docs.google.com/document/d/1uMsepxbdUW65PfIWL1pt2OM6ZKa5ybTTJOpZ733Ht6s
i was able to create the model but the only characters that work are AEIOU and they sound like a demon being summoned.

 No.2018

Recently found out there's a wiki for DiffSinger and it links a bunch of useful resources, go check it out, it's pretty cool
https://diffsinger.miraheze.org/

 No.2038

hard to get a diffsinger thread going when only 10% of the community knows it exists. hopefully menhera will give it more recognition if she ever makes it past the concept stage.

 No.2052

Does anyone have Ashera Lyre's voicebank? I tried to get it from OP's voicebank link but the website for Ashera's VB is down.

 No.2053

>>2052
She's discontinued, good luck finding an archived version because no one bothered to archive the page on the wayback machine

 No.2054

>>2053
Forgive me, I just found out she's part of the Lunai Project and that site IS archived, so here you go anon.

https://mega.nz/folder/LSZG0JYQ#ppsXEDgRCdAqXprW2NmMqA

 No.2119

Are there any English video tutorials on making DS banks? The first one is too vague

 No.2154

File: 1741208627698.gif(127.51 KB, 210x226, neeteto.gif)

You guys are our only hope right now, we at the "fake a SV" thread need you >>1270

 No.6016

Bumping just to ask if anybody knows a decent colab for training

 No.6400

Bump. I need a good tutorial on making a voicebank, preferably with detailed steps.

 No.6659

Does no one here know how to use diffsinger?

 No.6671

>>6659
I know I don’t
I’ll learn when TORIKO DS releases

 No.7773

Once again, we might need you guys

 No.7774

>>7773
Anon... no one here knows how to make DS banks. Why do you think this thread is so dead compared to the others?

 No.10028

>>7774
The thing with DiffSinger is that a lot of tutorials and information are gatekept on dicksword servers. People here either don't use dicksword or don't like joining public servers.

 No.10029

>>10028
There's a guide linked ITT but it's kind of a pain in the ass to follow

 No.14210

So now that we have 1 (one) person who knows how to use the damn thing can we revive the thread?

 No.14237

Are we ever getting another LEEKA on DS?

 No.14238

>>14237
Maybe Sena next year. A while back I made a RVC of her but it lacked data so it sounded off.

 No.15252

Does anybody have an easy to understand tutorial on how this thing works and all the tools I need? The linked ones are all pretty vague + likely outdated and I'm not joining a dicksword server just to learn how to train a voicebank that I won't even release in the first place

 No.15253

>>15252
Also if we can get a Rentry that'd be nice

 No.15255

>>15252
Seconding

 No.15269

>>15252
there isnt. in order to make toriko i had to have someone walk me through the process step by step over text.

 No.15270

>>15269
sounds tedious

 No.15284

>>15270
it is. if anyone needs help, i can try, but only if youre using difftrainer for local training. even then, my knowledge is limited. if you dont have a beefy computer, im sorry.

 No.22579

How much would you have to record to have a DiffSinger that sounds close in quality to some of the biggest AI synths? I heard that ACE Studio recommends at least 30-100 minutes of audio and while DiffSinger isn't ACE I wonder if it still applies.

 No.22582

>>22579
One of the guides I read said past 2 hours of data doesn't really do much anymore and can even worsen the quality, so I guess that's the cap

 No.22583

>>22582
60m is already 1 hour and 100m is 1 hour with 40 minutes so anything beyond that is probably too much.
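
If you want to know where your own recordings sit against the numbers quoted above (30-100 minutes, with a rough cap around 2 hours), here is a minimal sketch that adds up the length of every WAV in a folder. It only uses Python's standard library, so it only reads plain PCM WAVs. The function names and the folder name are made up for illustration and are not part of any DiffSinger tool.

```python
import wave
from pathlib import Path


def wav_seconds(path) -> float:
    """Length of one PCM WAV file in seconds (frames / sample rate)."""
    with wave.open(str(path), "rb") as w:
        return w.getnframes() / w.getframerate()


def total_minutes(folder) -> float:
    """Sum the length of every .wav under `folder`, in minutes."""
    files = sorted(Path(folder).rglob("*.wav"))
    return sum(wav_seconds(p) for p in files) / 60


# Example (hypothetical folder name):
# print(f"{total_minutes('my_recordings'):.1f} minutes of audio")
```

This counts raw file length, silence included, so the usable amount after trimming and labeling will be somewhat lower.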

 No.22584

>>15284
What amount/GB of vram do you need for it?

 No.22589

>>22579
Synth V supposedly uses 50m

 No.22597

>>2154
Look how far we've come from the days of begging the DS support thread for help, lol

 No.22598

>>22584
I would also like to know because my other option if I don't have enough is to pay Google and train via Colab

 No.22603

>>22597
I think VIP is still the only person here who knows how to make a DiffSinger

 No.22648

>>22584
>>22598
ive passed the question about vram on to someone who knows. if you are unable to train your diffsinger for whatever reason, i can help you out. i will try to write and/or record a guide for using difftrainer at some point in the future. i dont know when, it might take a while. i am a slow worker.

 No.22659

>>22648
It's okay! Take as long as you need! As long as we eventually get one it would be super cool! Even if it took like. a year or more lmao

 No.22676

you need at least 6 gb of vram, and you have to set the batch size lower to train. the biggest bottleneck for a lot of people is ram apparently. i myself have 16 gigs and have had no problem training toriko, save for user error. the real thing you need to worry about is nvidia cuda compatibility. if you dont have cuda cores, youre fucked and have to train it via colab or ask someone else to train it locally themselves for you.
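
A quick way to check both points (NVIDIA card present, and roughly 6 GB of VRAM) before bothering with a local setup is sketched below. It assumes `nvidia-smi` is on your PATH, which is normally the case once the NVIDIA driver is installed. The 6 GB threshold is just the figure from this post, and the function names and the 5% slack are my own choices, not anything from DiffSinger or difftrainer.

```python
import shutil
import subprocess

MIN_VRAM_GB = 6  # figure quoted in this post, not an official requirement


def parse_vram_mib(smi_output: str) -> list[int]:
    """Parse one 'memory.total' value in MiB per line, skipping odd lines."""
    return [int(s) for s in (l.strip() for l in smi_output.splitlines()) if s.isdigit()]


def enough_vram(total_mib: int, min_gb: float = MIN_VRAM_GB) -> bool:
    """True if the card has about min_gb of VRAM.

    Reported totals can be slightly under the nominal size,
    so 5% slack is allowed (an arbitrary choice).
    """
    return total_mib >= min_gb * 1024 * 0.95


def query_gpus() -> list[int]:
    """Return VRAM in MiB for each NVIDIA GPU, or [] if none can be queried."""
    if shutil.which("nvidia-smi") is None:
        return []  # no NVIDIA driver tools found, so probably no CUDA
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.total", "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=False,
    )
    return parse_vram_mib(result.stdout) if result.returncode == 0 else []


if __name__ == "__main__":
    gpus = query_gpus()
    if not gpus:
        print("No NVIDIA GPU detected: Colab or a friend's machine it is.")
    for mib in gpus:
        print(f"{mib} MiB:", "probably ok" if enough_vram(mib) else "too small")
```

This says nothing about system RAM or which CUDA version your training setup wants, so treat a pass as "worth trying", not a guarantee.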

 No.24518

>>22676
I have 4gb vram so fuck me I guess. Colab is my only hope.

 No.24537

>>24518
are you ena's vp?

 No.24538


 No.24606

I saw some pages recommending that sample songs don’t exceed 140bpm, probably to make sure there’s sufficient data in each phoneme. Does this hurt the model’s ability to sing fast songs though?

If I remember correctly TORIKO’s 1.1 update improved her ability to sing fast, and I wonder if that was due to increased high-BPM data or improved labeling.

 No.25287

How long does it generally take you guys to render shit in DS?

 No.25289

>>24606
Don't know about 1.1 but 1.2 can sing fast songs for the most part even if they exceed 140BPM.

 No.25584

Hello, I am stupid, once you download a vocoder how do you set it in the diffsinger's dsconfig.yaml? do I just put the file name (and extension)?

please ELI5 I have 2 brain cells

 No.25585

>>25584
Open the config; it should say something like nsf_hifigan, and you just replace that with the name of the oudep you downloaded

 No.25598

>>25584
open it in a text editor and just change the vocoder listed there

 No.25599

>>25598
without the extension btw
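
If you'd rather not hand-edit it, the same change can be sketched in a few lines of Python. This assumes the line in dsconfig.yaml looks like `vocoder: nsf_hifigan` and that the name to put there is the downloaded file's name minus `.oudep`, as described in the replies above. `set_vocoder` and the example file name are made up for illustration.

```python
import re

# Matches a single "vocoder: <value>" line without touching the line ending.
VOCODER_LINE = re.compile(r"^([ \t]*vocoder[ \t]*:[ \t]*)[^\r\n]*", re.MULTILINE)


def set_vocoder(config_text: str, vocoder_file: str) -> str:
    """Return config_text with the vocoder entry replaced.

    The .oudep extension is stripped, since the config wants the bare name.
    """
    name = vocoder_file
    if name.lower().endswith(".oudep"):
        name = name[: -len(".oudep")]
    if not VOCODER_LINE.search(config_text):
        raise ValueError("no 'vocoder:' line found in the config")
    # Keep the original key and spacing, swap only the value.
    return VOCODER_LINE.sub(lambda m: m.group(1) + name, config_text, count=1)


# Example (hypothetical file name):
# from pathlib import Path
# cfg = Path("dsconfig.yaml")
# cfg.write_text(set_vocoder(cfg.read_text(encoding="utf-8"), "my_vocoder.oudep"),
#                encoding="utf-8")
```

It edits the text directly instead of parsing the YAML, so the rest of the file, comments included, stays exactly as it was.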

 No.25949

I found what seems to be an auto labeler and a label checker.
https://github.com/spicytigermeat/labelmakr
https://github.com/spicytigermeat/labbu
Anyone tried these?

 No.25950

>>25949
They make a label checker and name it Labubu, gg

 No.25952

>>25949
The checker looks helpful. Maybe auto label generation is helpful to start, but I’d heavily verify the output. I know moresampler’s CV auto-oto is wonky and it should be using the same techniques.

 No.25954

>>25949
wish there were instructions on how to use it, this and difftrainer

 No.25999

>>25950
Labbu*


