Great release, but I need some clarifications!

#3
by DreamFilmVFX - opened

First of all, great team, and thanks for releasing these XL models. I ran an initial test on both the turbo-xl and sft-xl models, and I don't know if I'm doing something wrong, but the generations from both seem worse than those from the 2B models (even though the audio quality is actually better). Let me explain: I took a prompt for a rock ballad that I had saved from a previous generation and applied it 1:1 to the new models. The vocal melody, chords, and harmonies sound like random puzzle pieces; they don't follow a coherent melodic logic. I'd like to understand whether I'm doing something wrong or whether there's a problem with the models or with Gradio. Thanks again!

I think I'm seeing a degradation with the current turbo xl version too. I'm using ComfyUI and, for comparison, I generated a song that previously came out great with the "old" (non-xl) turbo safetensors. Using the same workflow (same prompt, same seeds, same text encoder models, same everything), the song I get when I replace ace turbo with ace turbo xl is about 95% similar to the old one, but some parts are out of tune, which is very annoying. It's as if the model can't keep some instruments in the same scale as the rest of the melody. It sounds disharmonic (and "original" in a bad way).
