ML Engineer | UnityFlow AI

Machine Learning Engineer (Speech-to-Text & NLP)

Remote

Full-time

Mid-Senior Level

About UnityFlow AI

UnityFlow AI is advancing speech-to-text and natural language processing with Thunderscribe, our flagship B2B and B2C SaaS platform built to serve professionals and enterprises. We’re a growing, mission-driven startup with a commitment to developing innovative, language-inclusive AI solutions that transform how transcription and language understanding are leveraged across industries.

We’re looking for a Machine Learning Engineer skilled in speech recognition, large language models (LLMs), and retrieval-augmented generation (RAG). You’ll work closely with a team of engineers and researchers to design, optimize, and deploy cutting-edge machine learning models, driving product innovation and expanding Thunderscribe’s capabilities.

Responsibilities

Develop and deploy state-of-the-art ML models, focusing on STT, NLP, and LLM applications.
Design and implement efficient inference pipelines for low-latency, high-throughput performance in production environments.
Fine-tune large language models and STT models for optimal accuracy, supporting multiple languages and accents, with a focus on underrepresented languages.
Build and optimize retrieval-augmented generation (RAG) pipelines to enhance context-aware and dynamic response generation.
Conduct experiments and A/B tests to evaluate model updates, leveraging data-driven insights for continuous improvement.
Work with distributed computing frameworks and tools (e.g., Spark, Ray) to handle large datasets and complex training processes.
Integrate multi-modal models, leveraging both audio and text inputs to refine transcription accuracy and contextual understanding.
Collaborate with cross-functional teams to optimize model integration within the SaaS platform, ensuring seamless customer experience and scalability.
Implement robust monitoring and evaluation pipelines to assess model performance post-deployment, addressing drift, bias, and degradation proactively.

Qualifications

3+ years of experience in machine learning with a focus on NLP, LLMs, and STT applications.
Strong proficiency in ML frameworks such as PyTorch, TensorFlow, and Hugging Face Transformers, and in specialized STT toolkits and models such as SpeechBrain and Wav2Vec.
Expertise in large language models, including fine-tuning and deployment of models like GPT, BERT, and T5.
Hands-on experience with retrieval-augmented generation (RAG) frameworks, optimizing for real-time and high-performance inference.
Familiarity with tools for data and model versioning, including MLflow and DVC.
Proficiency in GPU/TPU optimization, with experience using NVIDIA GPUs and TensorRT for low-latency model inference.
Deep understanding of model evaluation metrics (e.g., WER, BLEU, ROUGE) and best practices in A/B testing for ML model improvement.
Experience with cloud ML infrastructure (Azure, AWS, GCP) and deploying ML models via Docker, Kubernetes, or similar orchestration tools.
Proficiency in data engineering and processing pipelines, particularly for audio data (using tools such as Librosa and Kaldi) and large-scale text corpora.
Advanced programming skills in Python, with experience in data handling (Pandas, NumPy) and proficiency in SQL and basic ETL workflows.
Strong understanding of neural network architectures, including Transformers, CNNs, RNNs, and attention mechanisms.
Familiarity with distributed training frameworks (Horovod, PyTorch Distributed) for large-scale model training and optimization.

Nice to have

Experience in multilingual NLP, particularly for South Asian, African, and emerging market languages.
Familiarity with speaker diarization tools, sentiment analysis, and multi-turn dialogue systems.
Understanding of model compression techniques, including pruning, quantization, and knowledge distillation.
Prior experience in scaling ML applications within a SaaS environment and working with RAG frameworks to enhance contextual language generation.

How to apply

If you’re passionate about machine learning and eager to shape the future of speech-to-text and language processing, we’d love to hear from you! Please fill out the Google Form at the link below:

Google Form (Click here)