Intro to Project Meilin and the LINNE platform
Posted on 16-Apr-2017
Virtual singer & LINNE
CC-BY-NC
Slide author
Chou Shouichi / MGdesigner
Paul Liu and I organize
Wikimedia.tw: member of the board of directors (and directs tech development)
A programmer
A musician (Jazz ukulele, DTM)
Shoichi.chou@gmail.com
Everyone knows her
Powered by
Yamaha Vocaloid2 engine
So
Why a FOSS 'v'ocaloid?
If you buy an instrument
You can play any song, do anything.
Play by teeth
Break!
Burn
In any Vocaloid product EULA
You don't get full rights
No anti-society works
(So, what works count as anti-society?)
Trademark protection (images, keywords)
(ex: 'Vocaloid' ,'','''s image)
Not using Miku images = not popular
Musicians are controlled
No freedom
Be ruled
Using a Gibson guitar, you are its master.
Using Vocaloid products, you are their slave.
INDIE DIE
UTAU
A freeware Vocaloid-like tool
DIY a vocaloid
Programs: editor (frontend) + resampler + wavtool
Data: vocal DB = oto.ini + wav samples
The vocal DB is an open spec, so many people make their own
Workflow of vocaloid-style programs
Editor: compose the melody (many notes)
Resampler: modulate a sample to the specified pitch and other parameters (velocity...)
Wavtool: combine these modulated wavs
Finally, we get a vocal wav file for the song and mix it into the song
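The flow above can be sketched in a few lines of Python. This is only a toy illustration: the function names (`resample_note`, `wavtool_concat`, `render`) and the in-memory "voicebank" are hypothetical, and the pitch change uses naive linear-interpolation resampling, which is not what any real UTAU resampler does.

```python
import math

def make_sample(freq_hz, duration_s=0.1, rate=8000):
    """Stand-in for a recorded wav: a sine tone as a list of floats."""
    n = int(duration_s * rate)
    return [math.sin(2 * math.pi * freq_hz * i / rate) for i in range(n)]

def resample_note(sample, semitones, length):
    """'Resampler' step: shift pitch by reading the sample faster or slower
    (naive linear interpolation), looping until `length` samples exist."""
    ratio = 2 ** (semitones / 12)
    out = []
    pos = 0.0
    while len(out) < length:
        i = int(pos) % len(sample)
        j = (i + 1) % len(sample)
        frac = pos - int(pos)
        out.append(sample[i] * (1 - frac) + sample[j] * frac)
        pos += ratio
    return out

def wavtool_concat(parts, fade=80):
    """'Wavtool' step: join rendered notes with a short linear crossfade."""
    song = list(parts[0])
    for part in parts[1:]:
        k = min(fade, len(song), len(part))
        start = len(song) - k
        for n in range(k):
            w = (n + 1) / (k + 1)
            song[start + n] = song[start + n] * (1 - w) + part[n] * w
        song.extend(part[k:])
    return song

def render(notes, voicebank):
    """'Editor' output (alias, semitones, length) -> one vocal track."""
    parts = [resample_note(voicebank[alias], semis, length)
             for alias, semis, length in notes]
    return wavtool_concat(parts)

# Hypothetical voicebank: two "recordings" at the same base pitch.
voicebank = {"a": make_sample(220.0), "i": make_sample(220.0)}
notes = [("a", 0, 1600), ("i", 4, 1600), ("a", 7, 2400)]
vocal = render(notes, voicebank)
```

The resulting `vocal` list plays the role of the vocal wav that is later mixed into the song.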
But
Free of charge, not free as in freedom
The default resampler works badly
The DB has poor international support (Shift-JIS)
oto.ini does not support ini comments (;)
UTAU always auto-sorts oto.ini (makes collaboration hard)
The UI is hard to control
Not open source
Its development is very private
And you know... Yamaha owns many super patents
A nice free vocaloid is very difficult
During 2011-2012
One day, Paul Liu talked to me
New algorithm, 'World', better than Vocaloid2
Author: Doctor
Patent free
EFB-GW (synthesizer) for UTAU
Open source (old version GPL, newer is BSD)
https://github.com/mmorise/World
In December, the doctor will do another great upgrade
How good is the World algorithm?
A very awesome 'autotune'
(the original official test is a realtime karaoke autotune for s.)
Modulate a sample to any pitch without distortion (keeps F0 well)
(Vocaloid2 can't, so Miku needs 3 different range versions of each sample)
Very fast, no need to pre-prepare frequency tables
(just do it in real time)
On x86, it works well even on older machines (maybe on ARM)
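The pitch claim can be made concrete with a small sketch. In an analysis/synthesis vocoder such as World, a voice is split into an F0 contour plus spectral information, so changing the pitch amounts to rescaling F0 before resynthesis. The helper below (`shift_f0`, a hypothetical name) shows only that arithmetic, assuming the convention that unvoiced frames carry F0 = 0. It does not call the World library itself, and the F0 values are made up.

```python
def shift_f0(f0_contour, semitones):
    """Scale every voiced frame by 2**(semitones/12); keep unvoiced (0) frames."""
    ratio = 2.0 ** (semitones / 12.0)
    return [f * ratio if f > 0 else 0.0 for f in f0_contour]

# Hypothetical F0 track in Hz, one value per analysis frame.
f0 = [0.0, 220.0, 221.5, 219.0, 0.0]
up_one_octave = shift_f0(f0, 12)   # same sample, sung an octave higher
down_a_fifth = shift_f0(f0, -7)    # same sample, sung a fifth lower
```

Because only the F0 track changes, one recorded version of a sample can in principle serve low, middle, and high notes.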
OK, let's do it!
Finally
we made her...
Listen...
Hear MAMA cover
: ancient Chinese / Japanese pentatonic scale notes (Do Re Mi Sol La)
Also means we 'recruit' a voice actor (also a jazz singer) from the Internet
Merlin (super wizard) Linux
http://projectmeilin.github.io/
Project Meilin Features
CC-BY
UTAU compatible
Professional recording (in studio)
Source: 24-bit, 48000 Hz wavs
VCV, VC
(V = vowel, C = consonant)
Recorded: Japanese, Mandarin (Taiwan style)
How good? A test
Commercial Miku VS. open content Meilin
V2 Miku: each sample recorded in high, middle, and low versions
VS.
Meilin: each sample recorded in just 1 version.
Listen to the comparison video (song: , start from 0:44)
Especially check whether the super low and super high pitches are distorted
Fact
Miku DB: 1 GB+
Only Japanese
Meilin DB: 627 MB
Japanese + Mandarin
Mandarin DB is 3 times the JP DB
Thanks to Dr.
Without his effort and kindness, a good FOSS virtual singer would be impossible
2 more special features
1: 14 special effects
Defined in oto.ini
3 breaths: br1, br2, br3 (ex: Miku only has these breaths.)
Spanish 'R' rolling: trill
Cough: cough
Cry, dry tears: drytears
Blow nose: blownose
Sucking: suck
Sigh: sgn1, sgn2, sgn3, sgn4
Whistle: whsl
Clean throat: clnt
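As a quick check, the aliases listed above do add up to 14. The grouping below is just the slide's list written out as a Python dictionary. The dictionary name and the group labels are illustrative and are not part of the voicebank.

```python
# Effect aliases as listed on the slide, grouped by kind of sound.
SPECIAL_EFFECTS = {
    "breath": ["br1", "br2", "br3"],
    "rolled R": ["trill"],
    "cough": ["cough"],
    "cry": ["drytears"],
    "blow nose": ["blownose"],
    "sucking": ["suck"],
    "sigh": ["sgn1", "sgn2", "sgn3", "sgn4"],
    "whistle": ["whsl"],
    "clean throat": ["clnt"],
}

# Flatten the groups into one list of oto.ini aliases.
all_aliases = [alias for group in SPECIAL_EFFECTS.values() for alias in group]
effect_count = len(all_aliases)
```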
2: possible
EX:'' in (video)
In Mandarin, there is the same 'u'
Just borrow what we recorded.
Other Mandarin samples can also be borrowed to synthesize some foreign languages.
(ex: 1 or 2 foreign lyrics in a Japanese song)
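A sketch of how such borrowing could work in a lookup: try the voicebank of the song's language first, then fall back to the other one. Everything here is hypothetical (the `find_sample` helper, the bank layout, and the file names). It only illustrates the idea, not how UTAU or LINNE actually resolve aliases.

```python
def find_sample(phoneme, banks, preferred):
    """Return (language, wav_name) for `phoneme`, trying `preferred` first."""
    order = [preferred] + [lang for lang in banks if lang != preferred]
    for lang in order:
        wav = banks[lang].get(phoneme)
        if wav is not None:
            return lang, wav
    raise KeyError(f"no sample recorded for {phoneme!r}")

# Hypothetical banks keyed by a shared phoneme label.
banks = {
    "ja": {"a": "ja_a.wav", "i": "ja_i.wav"},
    "zh": {"a": "zh_a.wav", "u": "zh_u.wav", "y": "zh_y.wav"},
}

hit_native = find_sample("a", banks, preferred="ja")    # found in the Japanese set
hit_borrowed = find_sample("u", banks, preferred="ja")  # borrowed from the Mandarin set
```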
A 'v'ocaloid can also do speech synthesis
Better than traditional speech synthesis
Accent (= pitch, velocity, rhythm, speed) is controllable
Can express many emotions (melody lines): crying, angry...
TTS, storytelling, emotional '' possible
Some tests I have done with Miku: 1, 2, 3, 4, based on my scale algorithm. 'Auto render' is possible, but.
To do this with Vocaloid, you need to beg YAMAHA to open its API. But our software stack is open source. She could do more than singing.
How was she made?
Recorded in a pro studio
Thanks to our sponsor Aguai, my master
(A famous pop song producer in TW.)
About the vocalist
Her name is Lo Chu.
We chose her voice from 20 girls on the Internet.
She is a singer in a jazz / anime cover song band.
Also trained in voice acting.
Her Japanese accent is not bad.
Japanese friend ATsushi
But very hard work
Japanese recording needs 3~4 hours.
But complete Mandarin (every mathematically possible combination, minus the samples that phonology shows to be repeats)
Mandarin recording needs days.
The final day
LINNE platform
We defined the FOSS 'v'ocaloid stack
Open source, of course
Compatible with UTAU DB (but UTF-8)
resampler + wavtool + editor (interface) + DB-making tools
May include 'hardware'
Hardware ex: doll robot
Our oto.ini DB spec
You can use ';' for comments
Editor programs shouldn't re-sort the file
UTF-8
IPA based (International Phonetic Alphabet)
With IPA, different languages can share common pronunciation samples
(no more re-recording, smaller DB size, better storage efficiency)
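A minimal sketch of a reader that follows these rules: UTF-8 text, ';' comment lines skipped, and entries kept in file order instead of being re-sorted. It assumes the usual UTAU entry layout `file.wav=alias,offset,consonant,cutoff,preutterance,overlap`. The sample entries, their numbers, and the `parse_oto` name are made up for illustration, and only whole-line comments are handled.

```python
from typing import List, NamedTuple

class OtoEntry(NamedTuple):
    wav: str
    alias: str
    offset: float
    consonant: float
    cutoff: float
    preutterance: float
    overlap: float

def parse_oto(text: str) -> List[OtoEntry]:
    """Parse oto.ini text, skipping blank lines and ';' comment lines.
    Entries are returned in file order (never re-sorted)."""
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(";"):
            continue
        wav, _, rest = line.partition("=")
        fields = rest.split(",")
        alias = fields[0]
        # Missing or empty numeric fields default to 0.
        nums = [float(x) if x else 0.0 for x in fields[1:6]]
        nums += [0.0] * (5 - len(nums))
        entries.append(OtoEntry(wav, alias, *nums))
    return entries

# Hypothetical file content; a real file would be read with
# open(path, encoding="utf-8").read()
SAMPLE = """\
; vowels shared by the Japanese and Mandarin sets
u.wav=u,10,50,-200,30,10
a.wav=a,12,60,-180,35,12
; special effects
br1.wav=br1,0,0,0,0,0
"""
entries = parse_oto(SAMPLE)
```

Keeping file order and comments intact is what makes the file friendly to diffs and to collaboration through version control.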
Engine (now xvsqExec, may need to be changed)
Jcadencii
Linne-editor (in dev) (song editor, front end)
Wavtool-pl (GPL wavtool)
tn_fnds_yc (GPL) (resampler, EFB-GW variant)
World lib
Other programs in the future
ex: linne-TTS
The chart may need to evolve.
Problem now: the editor (frontend)
Cadencii is written in .NET and binds to too many Windows native calls
Jcadencii (the Java port of Cadencii) is very slow
Upstream development stopped. We gave it up too.
Another open UTAU frontend: http://fluidvocalsynth.weebly.com/ (also .NET)
linne-editor (frontend)
https://github.com/marty1885/linne-editor
In very early development
Fact
We don't have enough manpower for interface coding
Normal users still need Wine + UTAU to edit
Similar to early Linux development on Minix >_