ailia_voice package
Classes
- class ailia_voice.AiliaVoiceModel
Bases: object
- class ailia_voice.G2P(env_id=-1, num_thread=0, memory_mode=11, flags=0)
Bases: AiliaVoiceModel
Constructor of ailia Voice model instance.
- Parameters:
env_id (int, optional, default: ENVIRONMENT_AUTO(-1)) – Environment ID for ailia execution. To retrieve an env_id value, use the ailia.get_environment_count() / ailia.get_environment() pair or ailia.get_gpu_environment_id().
num_thread (int, optional, default: MULTITHREAD_AUTO(0)) – Number of threads. Valid values: MULTITHREAD_AUTO=0 (the system's logical processor count), or 1 to 32.
memory_mode (int, optional, default: 11 (reuse interstage)) – Memory management mode for ailia execution. To retrieve a memory_mode value, use ailia.get_memory_mode().
flags (int, optional, default: AILIA_VOICE_FLAG_NONE) – Reserved.
- __init__(env_id=-1, num_thread=0, memory_mode=11, flags=0)
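The constructor arguments can be checked before an instance is created. The following is a minimal sketch that only mirrors the defaults and the num_thread range listed in the parameters; `g2p_kwargs` and `create_g2p` are hypothetical helper names and are not part of ailia_voice.

```python
def g2p_kwargs(env_id=-1, num_thread=0, memory_mode=11, flags=0):
    """Hypothetical helper: collect constructor arguments with the documented defaults."""
    # num_thread: 0 means MULTITHREAD_AUTO, otherwise 1 to 32 per the parameter list.
    if not 0 <= num_thread <= 32:
        raise ValueError("num_thread must be 0 (auto) or between 1 and 32")
    return {
        "env_id": env_id,
        "num_thread": num_thread,
        "memory_mode": memory_mode,
        "flags": flags,
    }


def create_g2p(**overrides):
    """Hypothetical helper: build an ailia_voice.G2P from validated arguments."""
    # Imported lazily so g2p_kwargs() can be used without the package installed.
    import ailia_voice

    return ailia_voice.G2P(**g2p_kwargs(**overrides))
```

Calling `create_g2p(num_thread=4)` would then construct the instance with four threads and the remaining defaults, assuming the package is installed.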
- g2p(text, g2p_type)
Generates phonemes from text.
- Parameters:
text (string) – Input text
g2p_type (int) – Format of G2P. Specify with AILIA_VOICE_G2P_TYPE_GPT_SOVITS_*.
- initialize_model(model_path='./', user_dict_path=None)
Initialize and download the model.
- Parameters:
model_path (string, optional, default: "./") – Destination for saving the model file
user_dict_path (string, optional, default: None) – Specify the path of the user dictionary. The user dictionary is in mecab format.
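The two G2P methods are typically called in sequence: initialize the model once, then convert text. A minimal sketch under that assumption follows; `text_to_phonemes` is a hypothetical wrapper, and `g2p_type` is passed in by the caller as one of the AILIA_VOICE_G2P_TYPE_GPT_SOVITS_* values rather than chosen here. The return value of g2p() is passed through unchanged, since its exact type is not stated above.

```python
def text_to_phonemes(g2p, text, g2p_type, model_path="./", user_dict_path=None):
    """Hypothetical wrapper: initialize a G2P instance, then convert text."""
    # model_path is where the model file is saved; user_dict_path may stay None.
    g2p.initialize_model(model_path=model_path, user_dict_path=user_dict_path)
    # Whatever g2p() returns is handed back as-is.
    return g2p.g2p(text, g2p_type)


# Usage (requires the ailia_voice package):
#   import ailia_voice
#   phonemes = text_to_phonemes(ailia_voice.G2P(), "Hello.", g2p_type)
```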
- class ailia_voice.GPTSoVITS(env_id=-1, num_thread=0, memory_mode=11, flags=0)
Bases: G2P
- initialize_model(model_path='./', user_dict_path=None)
Initialize and download the model.
- Parameters:
model_path (string, optional, default: "./") – Destination for saving the model file.
user_dict_path (string, optional, default: None) – Specify the path of the user dictionary. The user dictionary is in mecab format.
- set_reference_audio(ref_text, g2p_type, audio_waveform, sampling_rate)
Specify the voice that will serve as the timbre for speech synthesis.
- Parameters:
ref_text (string) – Text of the speech content in the reference audio PCM.
g2p_type (int) – Format of G2P. Specify with AILIA_VOICE_G2P_TYPE_GPT_SOVITS_*.
audio_waveform (np.ndarray) – PCM data, formatted as either (num_samples) or (channels, num_samples).
sampling_rate (int) – Sampling rate (Hz).
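Because audio_waveform may arrive in either of the two documented layouts, it can help to check the shape before passing it on. The sketch below is an illustration only: `pcm_layout` and `set_reference` are hypothetical helpers, and treating a one-dimensional array as a single channel is an assumption, not something the reference states.

```python
def pcm_layout(shape):
    """Hypothetical helper: return (channels, num_samples) for a PCM array shape."""
    if len(shape) == 1:
        # (num_samples): assumed here to mean a single channel.
        return 1, shape[0]
    if len(shape) == 2:
        # (channels, num_samples)
        return shape[0], shape[1]
    raise ValueError("expected shape (num_samples) or (channels, num_samples)")


def set_reference(voice, ref_text, g2p_type, audio_waveform, sampling_rate):
    """Hypothetical wrapper: validate the layout, then set the reference audio."""
    channels, num_samples = pcm_layout(audio_waveform.shape)
    if num_samples == 0:
        raise ValueError("reference audio is empty")
    voice.set_reference_audio(ref_text, g2p_type, audio_waveform, sampling_rate)
    return channels, num_samples
```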
- synthesize_voice(text, g2p_type)
Synthesizes voice from input text.
- Parameters:
text (string) – Input text.
g2p_type (int) – Format of G2P. Specify with AILIA_VOICE_G2P_TYPE_GPT_SOVITS_*.
- Returns:
buf (np.ndarray) – PCM data, formatted as (num_samples).
sampling_rate (int) – Sampling rate (Hz).
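The returned pair can be written to disk with the standard library alone. The following is a minimal sketch, assuming buf holds floating-point samples in the range [-1.0, 1.0]; the sample type is not stated above, so that range is an assumption, and `save_pcm_as_wav` is a hypothetical helper.

```python
import struct
import wave


def save_pcm_as_wav(path, buf, sampling_rate):
    """Hypothetical helper: write 1-D PCM samples to a 16-bit mono WAV file.

    Assumes float samples in [-1.0, 1.0]; values outside that range are clipped.
    Returns the number of frames written.
    """
    frames = bytearray()
    for sample in buf:
        clipped = max(-1.0, min(1.0, float(sample)))
        # Scale to signed 16-bit little-endian PCM.
        frames += struct.pack("<h", int(clipped * 32767))
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(int(sampling_rate))
        wav.writeframes(bytes(frames))
    return len(frames) // 2


# Usage (requires the ailia_voice package and a prepared GPTSoVITS instance):
#   buf, sampling_rate = voice.synthesize_voice(text, g2p_type)
#   save_pcm_as_wav("output.wav", buf, sampling_rate)
```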