Machine Learning Engineer (Speech-to-Text & NLP)
Remote
Full-time
Mid-Senior Level
About UnityFlow AI
UnityFlow AI is advancing speech-to-text and natural language processing with Thunderscribe, our flagship B2B and B2C SaaS platform for professionals and enterprises. We’re a growing, mission-driven startup committed to building language-inclusive AI that changes how transcription and language understanding are used across industries.
​
We’re looking for a Machine Learning Engineer skilled in speech recognition, large language models (LLMs), and retrieval-augmented generation (RAG). You’ll work closely with a team of engineers and researchers to design, optimize, and deploy machine learning models that drive product innovation and expand Thunderscribe’s capabilities.
​
Responsibilities
​
- Develop and deploy state-of-the-art ML models, focusing on STT, NLP, and LLM applications.
- Design and implement efficient inference pipelines for low-latency, high-throughput performance in production environments.
- Fine-tune large language models and STT models for optimal accuracy, supporting multiple languages and accents, with a focus on underrepresented languages.
- Build and optimize retrieval-augmented generation (RAG) pipelines to enhance context-aware and dynamic response generation.
- Conduct experiments and A/B tests to evaluate model updates, leveraging data-driven insights for continuous improvement.
- Work with distributed computing frameworks and tools (e.g., Spark, Ray) to handle large datasets and complex training processes.
- Integrate multi-modal models, leveraging both audio and text inputs to refine transcription accuracy and contextual understanding.
- Collaborate with cross-functional teams to optimize model integration within the SaaS platform, ensuring seamless customer experience and scalability.
- Implement robust monitoring and evaluation pipelines to assess model performance post-deployment, addressing drift, bias, and degradation proactively.
​​
Qualifications
​
- 3+ years of experience in machine learning with a focus on NLP, LLMs, and STT applications.
- Strong proficiency in ML frameworks such as PyTorch, TensorFlow, and Hugging Face Transformers, and in speech toolkits and models such as SpeechBrain and Wav2Vec.
- Expertise in large language models, including fine-tuning and deployment of models like GPT, BERT, and T5.
- Hands-on experience with retrieval-augmented generation (RAG) frameworks, optimizing for real-time and high-performance inference.
- Familiarity with tools for data and model versioning, including MLflow and DVC.
- Proficiency in GPU/TPU optimization, with experience leveraging NVIDIA GPUs and frameworks like TensorRT for low-latency model inference.
- Deep understanding of model evaluation metrics (e.g., WER, BLEU, ROUGE) and best practices in A/B testing for ML model improvement.
- Experience with cloud ML infrastructure (Azure, AWS, GCP) and deploying ML models via Docker, Kubernetes, or similar orchestration tools.
- Proficiency in data engineering and processing pipelines, particularly for audio data (e.g., Librosa, Kaldi) and large-scale text corpora.
- Advanced programming skills in Python, with experience in data handling (Pandas, NumPy) and proficiency in SQL and basic ETL workflows.
- Strong understanding of neural network architectures, including Transformers, CNNs, RNNs, and attention mechanisms.
- Familiarity with distributed training frameworks (Horovod, PyTorch Distributed) for large-scale model training and optimization.
​
Nice to have
​
- Experience in multilingual NLP, particularly for South Asian, African, and emerging market languages.
- Familiarity with speaker diarization tools, sentiment analysis, and multi-turn dialogue systems.
- Understanding of model compression techniques, including pruning, quantization, and knowledge distillation.
- Prior experience in scaling ML applications within a SaaS environment and working with RAG frameworks to enhance contextual language generation.
​
How to apply
If you’re passionate about machine learning and eager to shape the future of speech-to-text and language processing, we’d love to hear from you! Please fill out the Google Form at the link below:
​