Xzillox
Scratcher
1000+ posts

Male and Female 'singer' voices in the speech section

Za-Chary wrote:

sing [skibidi] at note (62 v) for (0.5) beats
I'm assuming the “beats” input changes the speed at which it is sung? Although that isn't exactly the most musically pleasing prospect, it seems like a pretty good way to implement melody. It doesn't need to be complex because low floor and all that, plus the other music blocks aren't complex.
I misunderstood as king of the page… shame, shame, shame

Last edited by Xzillox (March 22, 2024 04:45:08)

yadayadayadagoodbye
Scratcher
1000+ posts

Xzillox wrote:

Za-Chary wrote:

sing [skibidi] at note (62 v) for (0.5) beats
I'm assuming the “beats” input changes the speed at which it is sung? Although that isn't exactly the most musically pleasing prospect, it seems like a pretty good way to implement melody. It doesn't need to be complex because low floor and all that, plus the other music blocks aren't complex.
“Beats” likely refers to beats as used in sheet music (in most cases a beat is a quarter note, but it can be another note value, such as an eighth note, depending on the time signature).

If this were added, a “tempo” block for the TTS would likely be required, and so would a “time signature” block (though it's also viable to simply keep it in 4/4 time).
roofogato
Scratcher
1000+ posts

scratch vocaloid when

Ehh… I think this would be very buggy to implement. Unless there's a way to tune the voices, it will probably sound very buggy and weird, especially on long held notes. If a tool were added to allow tuning, it would probably be too complicated for Scratch; just use Vocaloid.

And where would Scratch get the voices?
Xzillox
Scratcher
1000+ posts

yadayadayadagoodbye wrote:

Xzillox wrote:

Za-Chary wrote:

sing [skibidi] at note (62 v) for (0.5) beats
I'm assuming the “beats” input changes the speed at which it is sung? Although that isn't exactly the most musically pleasing prospect, it seems like a pretty good way to implement melody. It doesn't need to be complex because low floor and all that, plus the other music blocks aren't complex.
“Beats” likely refers to beats as used in sheet music (in most cases a beat is a quarter note, but it can be another note value, such as an eighth note, depending on the time signature).

If this were added, a “tempo” block for the TTS would likely be required, and so would a “time signature” block (though it's also viable to simply keep it in 4/4 time).
yeah I know about beats and stuff lol I'm a musician.

I guess to clarify, what I meant was whether it would play slower or faster depending on the number of beats you input.
yadayadayadagoodbye
Scratcher
1000+ posts

Xzillox wrote:

yadayadayadagoodbye wrote:

snip-
yeah I know about beats and stuff lol I'm a musician.

I guess to clarify, what I meant was whether it would play slower or faster depending on the number of beats you input.
I would assume it'd work just as straightforwardly as it seems: 1 beat would play for the duration of 1 beat at the current tempo, 0.5 beats would play for half that time, etc.
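To make that concrete, here is a minimal Python sketch of the conversion, assuming the sing block would follow a tempo given in beats per minute the way the music blocks do. The function name and the 60 bpm default are my own choices for illustration, not anything defined for this suggestion.

```python
def beats_to_seconds(beats: float, tempo_bpm: float = 60.0) -> float:
    """Convert a length in beats to seconds at a tempo in beats per minute."""
    if tempo_bpm <= 0:
        raise ValueError("tempo must be positive")
    # One beat lasts 60 / tempo seconds, so scale that by the beat count.
    return beats * 60.0 / tempo_bpm


print(beats_to_seconds(1, 120))    # 0.5
print(beats_to_seconds(0.5, 120))  # 0.25
```

So at 120 bpm, "for (0.5) beats" would mean a quarter of a second, and doubling the tempo halves every duration without touching the blocks themselves.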
Xzillox
Scratcher
1000+ posts

yadayadayadagoodbye wrote:

Xzillox wrote:

yeah I know about beats and stuff lol I'm a musician.

I guess to clarify, what I meant was whether it would play slower or faster depending on the number of beats you input.
I would assume it'd work just as straightforwardly as it seems: 1 beat would play for the duration of 1 beat at the current tempo, 0.5 beats would play for half that time, etc.
idk if we're on the same page here lol
basically I was just wondering how a different number of beats would affect how the singing sounds, since that would be kinda tricky with an actual voice. unless I'm misremembering how the music extension works.

edit: yeah I forgot how it works lmao. I thought it extended the note for the length of the beats. I'll edit my original post.

Last edited by Xzillox (March 22, 2024 04:44:09)

LP372
Scratcher
1000+ posts

Za-Chary wrote:

yadayadayadagoodbye wrote:

Because you'd be singing, and you'd need to know what notes you're singing

undeterministic wrote:

Suppose I sing “Hi!” on an E flat and then on an E sharp: they sound different. How would it know what the melody is without some sheet music, or some other solution?
So why does there need to be sheet music as opposed to a block that just says the following?

sing [skibidi] at note (62 v) for (0.5) beats

Where does the sheet music come in?
Yeah, that might be useful
XCartooonX
Scratcher
500+ posts

gdfsgdfsgdfg wrote:

ok which engine will it use to create a “singer” voice
I assume it would be similar to Talk It! by Microsoft.
That, or the singing Macintosh voices included in its TTS program.

Last edited by XCartooonX (March 22, 2024 12:56:55)

cookieclickerer33
Scratcher
1000+ posts

AI isn’t good enough to do stuff like this yet.
What you are seeing when AI “sings” a song is just it modifying someone else’s voice, like a voice changer.

Not sure if Amazon uses AI or phonetic samples but AFAIK using phonetic samples for this wouldn’t work

Last edited by cookieclickerer33 (March 22, 2024 13:29:08)

MythosLore
Scratcher
1000+ posts

yadayadayadagoodbye wrote:

Xzillox wrote:

Za-Chary wrote:

sing [skibidi] at note (62 v) for (0.5) beats
I'm assuming the “beats” input changes the speed at which it is sung? Although that isn't exactly the most musically pleasing prospect, it seems like a pretty good way to implement melody. It doesn't need to be complex because low floor and all that, plus the other music blocks aren't complex.
“Beats” likely refers to beats as used in sheet music (in most cases a beat is a quarter note, but it can be another note value, such as an eighth note, depending on the time signature).

If this were added, a “tempo” block for the TTS would likely be required, and so would a “time signature” block (though it's also viable to simply keep it in 4/4 time).
Why would a “time signature” block be needed? One beat in 4/4 is the same as one beat in 3/4.
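A quick Python sketch of that point, assuming the beat is a quarter note in both signatures and the tempo counts quarter notes per minute (both function names are made up for illustration):

```python
def beat_seconds(tempo_bpm: float) -> float:
    """Length of one beat in seconds; the time signature plays no part."""
    return 60.0 / tempo_bpm


def measure_seconds(tempo_bpm: float, beats_per_measure: int) -> float:
    """Length of one full measure, which does depend on the signature."""
    return beats_per_measure * beat_seconds(tempo_bpm)


tempo = 120
print(beat_seconds(tempo))        # 0.5 in both 4/4 and 3/4
print(measure_seconds(tempo, 4))  # 2.0 for a measure of 4/4
print(measure_seconds(tempo, 3))  # 1.5 for a measure of 3/4
```

Only the measure length changes, and a block that takes a number of beats never needs to know where the bar lines are.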
WallydogChoppychop
Scratcher
500+ posts

You mean you want this?
play scream (scream) for (0.25) beats
j.k
But I don't think this is a good idea; you should probably just record some singing using an AI voice generator.
Xzillox
Scratcher
1000+ posts

hang on a second-
Why not just record yourself singing? Even if you're a bad singer, it probably won't be much worse, and might even be better, than whatever the text-to-singing comes up with.
MythosLore
Scratcher
1000+ posts

Xzillox wrote:

hang on a second-
Why not just record yourself singing? Even if you're a bad singer, it probably won't be much worse, and might even be better, than whatever the text-to-singing comes up with.
What if you don’t want to share your voice on Scratch, or what if your voice is a baritone and you want a soprano voice to sing your song?
roofogato
Scratcher
1000+ posts

MythosLore wrote:

Xzillox wrote:

hang on a second-
Why not just record yourself singing? Even if you're a bad singer, it probably won't be much worse, and might even be better, than whatever the text-to-singing comes up with.
What if you don’t want to share your voice on Scratch, or what if your voice is a baritone and you want a soprano voice to sing your song?
https://scratch-mit-edu.ezproxyberklee.flo.org/discuss/11/
Za-Chary
Scratcher
1000+ posts

Xzillox wrote:

hang on a second-
Why not just record yourself singing? Even if you're a bad singer, it probably won't be much worse, and might even be better, than whatever the text-to-singing comes up with.

roofogato wrote:

https://scratch-mit-edu.ezproxyberklee.flo.org/discuss/11/
If we assume that this suggestion were implemented and worked perfectly, it would be much easier to use than either of these solutions.
LP372
Scratcher
1000+ posts

Xzillox wrote:

hang on a second-
Why not just record yourself singing? Even if you're a bad singer, it probably won't be much worse, and might even be better, than whatever the text-to-singing comes up with.
i don't want to reveal my voice
Xzillox
Scratcher
1000+ posts

Za-Chary wrote:

Xzillox wrote:

hang on a second-
Why not just record yourself singing? Even if you're a bad singer, it probably won't be much worse, and might even be better, than whatever the text-to-singing comes up with.

roofogato wrote:

https://scratch-mit-edu.ezproxyberklee.flo.org/discuss/11/
If we assume that this suggestion were implemented and worked perfectly, it would be much easier to use than either of these solutions.
True, but that's a pretty big assumption.
LP372
Scratcher
1000+ posts

Xzillox wrote:

Za-Chary wrote:

Xzillox wrote:

hang on a second-
Why not just record yourself singing? Even if you're a bad singer, it probably won't be much worse, and might even be better, than whatever the text-to-singing comes up with.

roofogato wrote:

https://scratch-mit-edu.ezproxyberklee.flo.org/discuss/11/
If we assume that this suggestion were implemented and worked perfectly, it would be much easier to use than either of these solutions.
True, but that's a pretty big assumption.
yeah, but the ST hasn't tried it yet
gilbert_given_189
Scratcher
1000+ posts

I'm not sure if the speech API on AWS allows tonal input, but even if it does, I see problems with this suggestion:

  1. When you run a speech block with a new configuration for the first time, it takes a while before the speech is played. That's because the block fetches the speech audio on the fly, then caches it for later. This could pose a problem for something where timing is absolutely necessary.
    We could mitigate this by fetching the speech before the project runs, but I can't imagine the number of unused speech requests this would cause…
  2. How would we handle multi-syllable words that have different pitches on different syllables? Do we:
    a. try to sound them:
    sing [day] at note (78 v) for (1) beats :: extension
    sing [zee] at note (75 v) for (1) beats :: extension // not *see, since the <s> in <Daisy> is voiced (spoken like Z)
    sing [day] at note (71 v) for (1) beats :: extension
    sing [zee] at note (66 v) for (1) beats :: extension // ditto
    sing [give] at note (68 v) for (0.333) beats :: extension
    sing [me] at note (70 v) for (0.333) beats :: extension
    sing [your] at note (71 v) for (0.333) beats :: extension
    sing [en] at note (68 v) for (0.666) beats :: extension // en [ˈɛn] is close enough to the first syllable of answer [ˈænsɚ]
    sing [sir] at note (71 v) for (0.333) beats :: extension // so is [sˈɜː]
    sing [do] at note (66 v) for (2) beats :: extension
    b. use hyphens to notate that the word is split into different syllables (this also means there's a “lookahead” between blocks)
    sing [dai-] at note (78 v) for (1) beats :: extension
    sing [-sy] at note (75 v) for (1) beats :: extension
    sing [dai-] at note (71 v) for (1) beats :: extension
    sing [-sy] at note (66 v) for (1) beats :: extension
    sing [give] at note (68 v) for (0.333) beats :: extension
    sing [me] at note (70 v) for (0.333) beats :: extension
    sing [your] at note (71 v) for (0.333) beats :: extension
    sing [an-] at note (68 v) for (0.666) beats :: extension
    sing [-swer] at note (71 v) for (0.333) beats :: extension
    sing [do] at note (66 v) for (2) beats :: extension
    c. Defenestrate this English spelling nonsense and use something like X-SAMPA or the IPA for the lyrics
    sing [d'eI] at note (78 v) for (1) beats :: extension
    sing [zi] at note (75 v) for (1) beats :: extension
    sing [d'eI] at note (71 v) for (1) beats :: extension
    sing [zi] at note (66 v) for (1) beats :: extension
    sing [g'Iv] at note (68 v) for (0.333) beats :: extension
    sing [m'i:] at note (70 v) for (0.333) beats :: extension
    sing [j'U@] at note (71 v) for (0.333) beats :: extension
    sing [\{n] at note (68 v) for (0.666) beats :: extension
    sing [s3] at note (71 v) for (0.333) beats :: extension
    sing [d'u:] at note (66 v) for (2) beats :: extension
  3. How would we handle edge cases like “singing” a sentence or two over a short period of time?
    sing [A nutshell is the outer shell of a nut.] at note (127 v) for (0.001) beats :: extension
    Or singing for a very long time?
    sing [aaaaaa] at note (0 v) for (99999) beats :: extension
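For option 2b, the “lookahead” could be sketched roughly like this in Python. This is only an illustration under my own assumptions: each block is modelled as a (lyric, note, beats) tuple, a trailing hyphen means "the word continues in the next block", and `group_syllables` is a made-up name, not part of any Scratch or AWS API.

```python
from typing import List, Tuple

Block = Tuple[str, int, float]   # (lyric fragment, note number, beats)
Word = Tuple[str, List[Block]]   # (joined word, its syllables with notes)


def group_syllables(blocks: List[Block]) -> List[Word]:
    """Join hyphen-marked fragments such as "dai-" + "-sy" into whole words."""
    words: List[Word] = []
    current: List[Block] = []
    for text, note, beats in blocks:
        # Drop the hyphen markers but keep the note and length per syllable.
        current.append((text.strip("-"), note, beats))
        if not text.endswith("-"):
            # No trailing hyphen: the word ends with this block.
            words.append(("".join(s for s, _, _ in current), current))
            current = []
    if current:
        # A dangling "dai-" with nothing after it is treated as a full word.
        words.append(("".join(s for s, _, _ in current), current))
    return words


melody = [
    ("dai-", 78, 1.0), ("-sy", 75, 1.0),
    ("give", 68, 0.333),
    ("an-", 68, 0.666), ("-swer", 71, 0.333),
]
for word, syllables in group_syllables(melody):
    print(word, syllables)
```

The idea would be to request the whole word ("daisy", "answer") so it is pronounced correctly, then map one note onto each syllable. How the audio would actually be split per syllable is the hard part, and this sketch does not attempt it.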

Last edited by gilbert_given_189 (May 2, 2024 03:25:56)

gilbert_given_189
Scratcher
1000+ posts

bump
