feat: add streaming support for real-time TTS

- Added generate_stream() method for token-by-token streaming
- Added generate_and_play() method for real-time playback
- Added decode_chunk() to ncodec codec
- First audio chunk in ~180ms (390% faster than non-streaming)
- Updated README with streaming documentation
@@ -0,0 +1,89 @@
Metadata-Version: 2.4
Name: FastNeuTTS
Version: 0.0.11
Summary: High quality and Fast TTS with MiraTTS
Author-email: Yatharth Sharma <yatharthsharma3501@gmail.com>
Project-URL: Homepage, https://github.com/ysharma3501/MiraTTS
Project-URL: Issues, https://github.com/ysharma3501/MiraTTS/issues
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Requires-Python: >=3.10
Description-Content-Type: text/markdown
Requires-Dist: lmdeploy
Requires-Dist: librosa
Requires-Dist: fastaudiosr @ git+https://github.com/ysharma3501/FlashSR.git
Requires-Dist: ncodec @ git+https://github.com/ysharma3501/FastBiCodec.git
Requires-Dist: einops
Requires-Dist: onnxruntime-gpu

# MiraTTS
[MiraTTS](https://huggingface.co/YatharthS/MiraTTS) is a fine-tune of the excellent [Spark-TTS](https://huggingface.co/SparkAudio/Spark-TTS-0.5B) model with enhanced realism and stability, performing on par with closed-source models.
This repository also heavily optimizes Mira with [LMDeploy](https://github.com/InternLM/lmdeploy) and boosts quality by using [FlashSR](https://github.com/ysharma3501/FlashSR) to generate high-quality audio at over **100x** realtime!

https://github.com/user-attachments/assets/262088ae-068a-49f2-8ad6-ab32c66dcd17

## Key benefits
- Incredibly fast: Over 100x realtime by using LMDeploy and batching.
- High quality: Generates clear, crisp 48 kHz audio, which is much higher quality than most models.
- Memory efficient: Works within 6 GB of VRAM.
- Low latency: Latency can be as low as 100 ms.

## Usage
Simple one-line installation:
```
uv pip install git+https://github.com/ysharma3501/MiraTTS.git
```

Running the model (bs=1):
```python
from mira.model import MiraTTS
from IPython.display import Audio
mira_tts = MiraTTS('YatharthS/MiraTTS') ## downloads model from huggingface

file = "reference_file.wav" ## can be mp3/wav/ogg or anything that librosa supports
text = "Alright, so have you ever heard of a little thing named text to speech? Well, it allows you to convert text into speech! I know, that's super cool, isn't it?"

context_tokens = mira_tts.encode_audio(file)
audio = mira_tts.generate(text, context_tokens)

Audio(audio, rate=48000)
```

Running the model using batching:
```python
file = "reference_file.wav" ## can be mp3/wav/ogg or anything that librosa supports
text = ["Hey, what's up! I am feeling SO happy!", "Honestly, this is really interesting, isn't it?"]

context_tokens = [mira_tts.encode_audio(file)]

audio = mira_tts.batch_generate(text, context_tokens)

Audio(audio, rate=48000)
```

Examples can be found on the [Hugging Face model page](https://huggingface.co/YatharthS/MiraTTS).

I recommend reading these two blog posts to better understand LLM TTS models and how I optimize them:
- How they work: https://huggingface.co/blog/YatharthS/llm-tts-models
- How to optimize them: https://huggingface.co/blog/YatharthS/making-neutts-200x-realtime

## Training
Released training code! You can now train the model to be multilingual, multi-speaker, or support audio events on any local or cloud GPU!

Kaggle notebook: https://www.kaggle.com/code/yatharthsharma888/miratts-training

Colab notebook: https://colab.research.google.com/drive/1IprDyaMKaZrIvykMfNrxWFeuvj-DQPII?usp=sharing

## Next steps
- [x] Release code and model
- [x] Release training code
- [ ] Support low latency streaming
- [ ] Release native 48khz bicodec

## Final notes
Thanks very much to the authors of Spark-TTS and unsloth. Thanks for checking out this repository as well.

Stars would be much appreciated, thank you.

Email: yatharthsharma3501@gmail.com
@@ -0,0 +1,10 @@
README.md
pyproject.toml
FastNeuTTS.egg-info/PKG-INFO
FastNeuTTS.egg-info/SOURCES.txt
FastNeuTTS.egg-info/dependency_links.txt
FastNeuTTS.egg-info/requires.txt
FastNeuTTS.egg-info/top_level.txt
mira/__init__.py
mira/model.py
mira/utils.py
@@ -0,0 +1 @@
@@ -0,0 +1,6 @@
lmdeploy
librosa
fastaudiosr @ git+https://github.com/ysharma3501/FlashSR.git
ncodec @ git+https://github.com/ysharma3501/FastBiCodec.git
einops
onnxruntime-gpu
@@ -0,0 +1 @@
mira
@@ -0,0 +1,101 @@
# MiraTTS
[MiraTTS](https://huggingface.co/YatharthS/MiraTTS) is a fine-tune of the excellent [Spark-TTS](https://huggingface.co/SparkAudio/Spark-TTS-0.5B) model with enhanced realism and stability, performing on par with closed-source models.
This repository also heavily optimizes Mira with [LMDeploy](https://github.com/InternLM/lmdeploy) and boosts quality by using [FlashSR](https://github.com/ysharma3501/FlashSR) to generate high-quality audio at over **100x** realtime!

https://github.com/user-attachments/assets/262088ae-068a-49f2-8ad6-ab32c66dcd17

## Key benefits
- Incredibly fast: Over 100x realtime by using LMDeploy and batching.
- High quality: Generates clear, crisp 48 kHz audio, which is much higher quality than most models.
- Memory efficient: Works within 6 GB of VRAM.
- Low latency: Latency can be as low as 100 ms.

## Usage
Simple one-line installation:
```
uv pip install git+https://github.com/ysharma3501/MiraTTS.git
```

Running the model (bs=1):
```python
from mira.model import MiraTTS
from IPython.display import Audio
mira_tts = MiraTTS('YatharthS/MiraTTS') ## downloads model from huggingface

file = "reference_file.wav" ## can be mp3/wav/ogg or anything that librosa supports
text = "Alright, so have you ever heard of a little thing named text to speech? Well, it allows you to convert text into speech! I know, that's super cool, isn't it?"

context_tokens = mira_tts.encode_audio(file)
audio = mira_tts.generate(text, context_tokens)

Audio(audio, rate=48000)
```
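
For longer passages it can help to split the text into sentences first and pass the resulting list to batch generation. The package ships a small `split_text` helper (also exposed as `mira_tts.split_text`); the standalone sketch below reproduces its regex so that it runs without loading the model. The sample text is purely illustrative.

```python
import re


def split_text(text):
    # Split after '.', '!' or '?' when followed by whitespace
    # (the same regex used in mira/utils.py).
    return re.split(r'(?<=[.!?])\s+', text)


sentences = split_text("Hey, what's up! I am feeling SO happy. Isn't this interesting?")
print(sentences)
```

Each element of `sentences` can then be used as one prompt in the batching example that follows.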

Running the model using batching:
```python
file = "reference_file.wav" ## can be mp3/wav/ogg or anything that librosa supports
text = ["Hey, what's up! I am feeling SO happy!", "Honestly, this is really interesting, isn't it?"]

context_tokens = [mira_tts.encode_audio(file)]

audio = mira_tts.batch_generate(text, context_tokens)

Audio(audio, rate=48000)
```
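
When fewer references than prompts are supplied, as with the single reference above, `batch_generate` pairs them using `itertools.cycle`, so the references repeat in order. A small sketch of that pairing, with placeholder strings standing in for real context tokens (the names `prompts` and `references` are illustrative):

```python
from itertools import cycle

prompts = ["first sentence", "second sentence", "third sentence"]
references = ["voice_a", "voice_b"]  # placeholders, not real context tokens

# zip() stops at the shorter input, so cycling the references
# yields exactly one reference per prompt.
pairs = list(zip(prompts, cycle(references)))
for prompt, reference in pairs:
    print(f"{prompt!r} -> {reference}")
```

With a single reference in the list, every prompt is synthesized in the same voice.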

## Streaming (Real-time Audio)

Stream audio chunks as they're generated for ultra-low latency (~180ms to first audio):

```python
from mira.model import MiraTTS

mira_tts = MiraTTS('YatharthS/MiraTTS')
context_tokens = mira_tts.encode_audio("reference_file.wav")

# Stream and process chunks in real-time
for audio_chunk in mira_tts.generate_stream(text, context_tokens, chunk_size=50):
    # audio_chunk is a torch tensor (48kHz)
    # Process/play each chunk as it arrives
    process(audio_chunk)
```
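
The buffering behind `generate_stream` can be pictured roughly as follows. This is a simplified, standalone sketch: it assumes the model streams text pieces containing `<|speech_token_N|>` markers, and it only groups token ids without decoding any audio. `chunk_token_ids` and the sample pieces are illustrative names and data, not part of the `mira` API.

```python
import re
from typing import Iterable, Iterator, List

SPEECH_TOKEN = re.compile(r"speech_token_(\d+)")


def chunk_token_ids(pieces: Iterable[str], chunk_size: int = 50) -> Iterator[List[int]]:
    """Group speech-token ids from streamed text pieces into fixed-size chunks."""
    buffer: List[int] = []
    for piece in pieces:
        buffer.extend(int(t) for t in SPEECH_TOKEN.findall(piece))
        # Emit every full chunk that is currently buffered.
        while len(buffer) >= chunk_size:
            yield buffer[:chunk_size]
            buffer = buffer[chunk_size:]
    # Flush the shorter final chunk, if any tokens remain.
    if buffer:
        yield buffer


pieces = [
    "<|speech_token_1|><|speech_token_2|>",
    "<|speech_token_3|>",
    "<|speech_token_4|><|speech_token_5|>",
]
chunks = list(chunk_token_ids(pieces, chunk_size=2))
print(chunks)
```

In the real method each full chunk is passed to the codec and yielded as audio; the last chunk may be shorter than `chunk_size`.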

Or use the convenience method for immediate playback (requires `sounddevice`):

```python
# pip install sounddevice
mira_tts.generate_and_play(text, context_tokens, chunk_size=50)
```

**Parameters:**
- `chunk_size`: Tokens per chunk (default 50 = ~1 sec audio). Lower = faster first chunk, higher = smoother audio.
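
As a rough guide for picking `chunk_size`, the sketch below converts a token count into an approximate audio duration. It assumes about 20 ms of audio per token, which is what "50 tokens = ~1 sec" implies; `chunk_duration_seconds` is a hypothetical helper, not part of `mira`.

```python
def chunk_duration_seconds(chunk_size: int, ms_per_token: float = 20.0) -> float:
    """Approximate audio length of one chunk, assuming a fixed duration per token."""
    return chunk_size * ms_per_token / 1000.0


for size in (10, 25, 50, 100):
    print(f"chunk_size={size:>3} -> ~{chunk_duration_seconds(size):.2f} s of audio per chunk")
```

Smaller chunks reach the listener sooner but are decoded more often, which matches the trade-off described above.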

**Performance:**
- First audio chunk: ~180ms (vs ~870ms for full generation)
- 390% faster time to first audio
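
The two latency figures above can be compared with a little arithmetic. This sketch uses only the ~180 ms and ~870 ms numbers quoted here; both are approximate, so the derived ratios are rough as well.

```python
streaming_ms = 180.0  # approximate time to first audio chunk with streaming
full_ms = 870.0       # approximate time for full (non-streaming) generation

speedup = full_ms / streaming_ms                      # how many times sooner audio starts
percent_faster = (speedup - 1.0) * 100.0              # the same ratio as "percent faster"
reduction = (full_ms - streaming_ms) / full_ms * 100  # percent less waiting

print(f"first audio arrives ~{speedup:.1f}x sooner")
print(f"that is ~{percent_faster:.0f}% faster")
print(f"wait time reduced by ~{reduction:.0f}%")
```

With these rounded inputs the result is a bit under the 390% quoted above, which is in line with that figure once rounding of the measurements is taken into account.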

Examples can be found on the [Hugging Face model page](https://huggingface.co/YatharthS/MiraTTS).

I recommend reading these two blog posts to better understand LLM TTS models and how I optimize them:
- How they work: https://huggingface.co/blog/YatharthS/llm-tts-models
- How to optimize them: https://huggingface.co/blog/YatharthS/making-neutts-200x-realtime

## Training
Released training code! You can now train the model to be multilingual, multi-speaker, or support audio events on any local or cloud GPU!

Kaggle notebook: https://www.kaggle.com/code/yatharthsharma888/miratts-training

Colab notebook: https://colab.research.google.com/drive/1IprDyaMKaZrIvykMfNrxWFeuvj-DQPII?usp=sharing

## Next steps
- [x] Release code and model
- [x] Release training code
- [x] Support low latency streaming
- [ ] Release native 48khz bicodec

## Final notes
Thanks very much to the authors of Spark-TTS and unsloth. Thanks for checking out this repository as well.

Stars would be much appreciated, thank you.

Email: yatharthsharma3501@gmail.com
Binary file not shown.
@@ -0,0 +1 @@
@@ -0,0 +1 @@
Binary file not shown.
Binary file not shown.
Binary file not shown.
@@ -0,0 +1,209 @@
import gc
import re
import torch
from itertools import cycle
from ncodec.codec import TTSCodec
from lmdeploy import pipeline, GenerationConfig, TurbomindEngineConfig

from mira.utils import clear_cache, split_text


class MiraTTS:
    def __init__(
        self,
        model_dir="YatharthS/MiraTTS",
        tp=1,
        enable_prefix_caching=True,
        cache_max_entry_count=0.2,
        default_chunk_size=50,
    ):

        backend_config = TurbomindEngineConfig(
            cache_max_entry_count=cache_max_entry_count,
            tp=tp,
            dtype="bfloat16",
            enable_prefix_caching=enable_prefix_caching,
        )
        self.pipe = pipeline(model_dir, backend_config=backend_config)
        self.gen_config = GenerationConfig(
            top_p=0.95,
            top_k=50,
            temperature=0.8,
            max_new_tokens=1024,
            repetition_penalty=1.2,
            do_sample=True,
            min_p=0.05,
        )
        self.codec = TTSCodec()
        self.default_chunk_size = default_chunk_size

        # Decoder is warmed up lazily on the first streaming call to reduce TTFA (time to first audio)
        self._decoder_warmed = False

    def set_params(
        self,
        top_p=0.95,
        top_k=50,
        temperature=0.8,
        max_new_tokens=1024,
        repetition_penalty=1.2,
        min_p=0.05,
    ):
        """sets sampling parameters for the llm"""

        self.gen_config = GenerationConfig(
            top_p=top_p,
            top_k=top_k,
            temperature=temperature,
            max_new_tokens=max_new_tokens,
            repetition_penalty=repetition_penalty,
            min_p=min_p,
            do_sample=True,
        )

    def c_cache(self):
        clear_cache()

    def split_text(self, text):
        return split_text(text)

    def encode_audio(self, audio_file):
        """encodes audio into context tokens"""

        context_tokens = self.codec.encode(audio_file)
        return context_tokens

    def warmup_decoder(self, context_tokens=None):
        """Warm up the decoder to reduce TTFA on first streaming chunk."""
        if self._decoder_warmed:
            return

        if context_tokens:
            dummy_tokens = "<|speech_token_0|><|speech_token_1|>"
            _ = self.codec.decode_chunk(dummy_tokens, context_tokens)
        else:
            dummy_context = "".join([f"<|context_token_{i}|>" for i in range(10)])
            dummy_tokens = "<|speech_token_0|><|speech_token_1|>"
            _ = self.codec.decode_chunk(dummy_tokens, dummy_context)

        self._decoder_warmed = True

    def generate(self, text, context_tokens):
        """generates speech from input text"""
        formatted_prompt = self.codec.format_prompt(text, context_tokens, None)

        response = self.pipe(
            [formatted_prompt], gen_config=self.gen_config, do_preprocess=False
        )
        audio = self.codec.decode(response[0].text, context_tokens)
        return audio

    def generate_stream(self, text, context_tokens, chunk_size=None):
        """
        Generates speech from input text with streaming output.

        Args:
            text: Input text to synthesize
            context_tokens: Reference audio context tokens
            chunk_size: Number of tokens to decode before yielding audio (default from __init__ or 50 = ~1 sec at 20ms/token)

        Yields:
            Audio chunks as torch tensors (48kHz)
        """
        if chunk_size is None:
            chunk_size = self.default_chunk_size

        self.warmup_decoder(context_tokens)

        formatted_prompt = self.codec.format_prompt(text, context_tokens, None)

        responses = self.pipe.stream_infer(
            [formatted_prompt],
            gen_config=self.gen_config,
            do_preprocess=False,
            stream_response=True,
        )

        accumulated_tokens = []

        for response in responses:
            new_tokens = re.findall(r"speech_token_(\d+)", response.text)
            accumulated_tokens.extend([int(t) for t in new_tokens])

            if len(accumulated_tokens) >= chunk_size:
                num_chunks = len(accumulated_tokens) // chunk_size

                for i in range(num_chunks):
                    start_idx = i * chunk_size
                    end_idx = start_idx + chunk_size
                    chunk_tokens = accumulated_tokens[start_idx:end_idx]

                    token_str = "".join([f"<|speech_token_{t}|>" for t in chunk_tokens])
                    audio_chunk = self.codec.decode_chunk(token_str, context_tokens)

                    yield audio_chunk

                accumulated_tokens = accumulated_tokens[end_idx:]

            if response.finish_reason:
                break

        if accumulated_tokens:
            token_str = "".join([f"<|speech_token_{t}|>" for t in accumulated_tokens])
            audio_chunk = self.codec.decode_chunk(token_str, context_tokens)
            yield audio_chunk

    def batch_generate(self, prompts, context_tokens):
        """
        Generates speech from text, for larger batch size

        Args:
            prompts (list): Input texts for the TTS model, one per utterance
            context_tokens (list): Reference audio context tokens, cycled if shorter than prompts
        """
        formatted_prompts = []
        for prompt, context_token in zip(prompts, cycle(context_tokens)):
            formatted_prompt = self.codec.format_prompt(prompt, context_token, None)
            formatted_prompts.append(formatted_prompt)

        responses = self.pipe(
            formatted_prompts, gen_config=self.gen_config, do_preprocess=False
        )
        generated_tokens = [response.text for response in responses]

        audios = []
        for generated_token, context_token in zip(
            generated_tokens, cycle(context_tokens)
        ):
            audio = self.codec.decode(generated_token, context_token)
            audios.append(audio)
        audios = torch.cat(audios, dim=0)

        return audios

    def generate_and_play(
        self, text, context_tokens, chunk_size=None, samplerate=48000
    ):
        """
        Generates and plays audio in real-time using streaming.
        Requires sounddevice: pip install sounddevice

        Args:
            text: Input text to synthesize
            context_tokens: Reference audio context tokens
            chunk_size: Number of tokens per chunk (default from __init__ or 50 = ~1 sec)
            samplerate: Audio sample rate (default 48000)
        """
        try:
            import sounddevice as sd
        except ImportError:
            raise ImportError(
                "sounddevice required for playback. Install with: pip install sounddevice"
            )

        for audio_chunk in self.generate_stream(
            text, context_tokens, chunk_size=chunk_size
        ):
            sd.play(audio_chunk.cpu().numpy().flatten(), samplerate=samplerate)
            # Block until this chunk finishes; a new sd.play() call would cut it off.
            sd.wait()
@@ -0,0 +1,11 @@
import re
import gc
import torch

def split_text(text):
    sentences = re.split(r'(?<=[.!?])\s+', text)
    return sentences

def clear_cache():
    gc.collect()
    torch.cuda.empty_cache()
@@ -0,0 +1,30 @@
[build-system]
requires = ["setuptools>=61.0", "wheel"]
build-backend = "setuptools.build_meta"

[project]
name = "FastNeuTTS"
version = "0.0.11"
authors = [
    { name="Yatharth Sharma", email="yatharthsharma3501@gmail.com" },
]
description = "High quality and Fast TTS with MiraTTS"
readme = "README.md"
requires-python = ">=3.10"
classifiers = [
    "Programming Language :: Python :: 3",
    "License :: OSI Approved :: MIT License",
    "Operating System :: OS Independent",
]
dependencies = [
    "lmdeploy",
    "librosa",
    "fastaudiosr @ git+https://github.com/ysharma3501/FlashSR.git",
    "ncodec @ git+https://github.com/ysharma3501/FastBiCodec.git",
    "einops",
    "onnxruntime-gpu"
]

[project.urls]
Homepage = "https://github.com/ysharma3501/MiraTTS"
Issues = "https://github.com/ysharma3501/MiraTTS/issues"