Microsoft MAI-Voice-1

Microsoft MAI-Voice-1 is a high-performance speech synthesis model that generates up to 60 seconds of natural-sounding audio in under one second on a single GPU.
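
To put the throughput figure in perspective, the short sketch below works out the implied speedup over real time. It uses only the two numbers stated above (60 seconds of audio, a generation time of at most one second). The function name `speedup_over_real_time` is a hypothetical helper for this illustration and is not part of any Microsoft API.

```python
def speedup_over_real_time(audio_seconds: float, generation_seconds: float) -> float:
    """Return seconds of audio produced per second of generation time."""
    if generation_seconds <= 0:
        raise ValueError("generation_seconds must be positive")
    return audio_seconds / generation_seconds


# Figures from the description: 60 s of audio, generated in under 1 s.
# Using the 1 s upper bound gives a lower bound on the speedup.
lower_bound = speedup_over_real_time(60.0, 1.0)
print(f"At least {lower_bound:.0f}x faster than real time")
```

Because the stated generation time is an upper bound, the result is a floor: any faster generation raises the ratio.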