SPEECH RESEARCH SCIENTIST (TEXT-TO-SPEECH)
ObEN’s mission is to enable everyone in the world to create their own Personal AI (PAI), intelligent 3D avatars that look, sound, and behave like the individual user. Secured and authenticated on the Project PAI blockchain, ObEN’s technology creates more productive, more personalized digital interactions. ObEN is a K11, Tencent, Softbank Ventures Korea and HTC Vive X portfolio company, and we work with our strategic investors to expand PAI technology across multiple verticals including hospitality, retail, healthcare, and entertainment.
Working at ObEN means taking on extraordinary transformations every day, in an environment that celebrates and encourages innovation. You’ll be working in small, agile teams that include world-class researchers in speech, computer vision, machine learning, NLP, and blockchain. We are blazing new trails in AI and blockchain technology, and we encourage and support publications in top conferences and journals. Learn more about working at ObEN in our blog post.
As a Speech Research Scientist specializing in Text-to-Speech, you will develop cutting-edge deep learning algorithms for voice personalization. This includes developing structured acoustic models for synthesis that allow control of factors such as voice timbre, voice quality, language, accent, expressiveness, and speaking style, as well as adaptation or conversion towards a target voice using a reduced amount of data.
- Develop and extend ObEN’s proprietary TTS system, with a view to improving the quality and naturalness of the synthesized voice and its similarity to the target voice, and to reducing the amount of data needed for speaker adaptation;
- Develop deep generative models of raw speech waveforms;
- Develop cross-lingual approaches (e.g. phonetic posteriorgrams).
- PhD with strong research experience in adaptation of DNN-based TTS systems, demonstrated by publications in top speech journals and conferences (ICASSP, Interspeech, etc.);
- Strong machine learning background and familiarity with standard statistical modeling techniques applied to speech;
- Research experience in deep generative models of raw audio (e.g. WaveNet) and generative adversarial networks (e.g. WGAN);
- Fluency in Python and C++, and expert knowledge of deep learning packages (TensorFlow, Theano, Keras, etc.);
- Familiarity with linguistic phonetics;
- Knowledge of basic digital signal processing techniques for audio.
- Please send the following to firstname.lastname@example.org:
- Detailed resume and/or LinkedIn profile
- Links to any research / papers you have been an instrumental part of and are proud of
- Name of instructor / adviser, if any, along with a link to their profile
- Cover letter identifying your five favorite apps on your phone
- Introduction to ObEN: https://goo.gl/gxpxwT
STAGE 1: Phone Interview
STAGE 2: In-person Interview at Idealab (we cover travel expenses for the day)
STAGE 3: We require a sample project submission and a candidate proposal submission. (To learn more about what an ObEN candidate proposal is, click here.)
STAGE 4: Spend a day at our office and participate in all team activities.
STAGE 5: Offer Letter