
Machine Learning Engineer (Speech-to-Text & NLP)

Remote

Full-time 

Mid-Senior Level

About UnityFlow AI


UnityFlow AI is advancing speech-to-text and natural language processing with Thunderscribe, our flagship B2B and B2C SaaS platform built for professionals and enterprises. We’re a growing, mission-driven startup committed to developing language-inclusive AI that transforms how transcription and language understanding are used across industries.


We’re looking for a Machine Learning Engineer skilled in speech recognition, large language models (LLMs), and retrieval-augmented generation (RAG). You’ll work closely with a team of engineers and researchers to design, optimize, and deploy cutting-edge machine learning models, driving product innovation and expanding Thunderscribe’s capabilities.


Responsibilities


  • Develop and deploy state-of-the-art ML models, focusing on speech-to-text (STT), NLP, and LLM applications.

  • Design and implement efficient inference pipelines for low-latency, high-throughput performance in production environments.

  • Fine-tune large language models and STT models for optimal accuracy, supporting multiple languages and accents, with a focus on underrepresented languages.

  • Build and optimize retrieval-augmented generation (RAG) pipelines to enhance context-aware and dynamic response generation.

  • Conduct experiments and A/B tests to evaluate model updates, leveraging data-driven insights for continuous improvement.

  • Work with distributed computing frameworks and tools (e.g., Spark, Ray) to handle large datasets and complex training processes.

  • Integrate multi-modal models, leveraging both audio and text inputs to refine transcription accuracy and contextual understanding.

  • Collaborate with cross-functional teams to optimize model integration within the SaaS platform, ensuring seamless customer experience and scalability.

  • Implement robust monitoring and evaluation pipelines to assess model performance post-deployment, addressing drift, bias, and degradation proactively.


Qualifications


  • 3+ years of experience in machine learning with a focus on NLP, LLMs, and STT applications.

  • Strong proficiency in ML frameworks such as PyTorch, TensorFlow, and Hugging Face Transformers, plus specialized STT toolkits and models such as SpeechBrain and wav2vec 2.0.

  • Expertise in large language models, including fine-tuning and deployment of models like GPT, BERT, and T5.

  • Hands-on experience with retrieval-augmented generation (RAG) frameworks, optimizing for real-time and high-performance inference.

  • Familiarity with tools for data and model versioning, including MLflow and DVC.

  • Proficient in GPU/TPU optimization, with experience leveraging NVIDIA GPUs and frameworks like TensorRT for low-latency model inference.

  • Deep understanding of model evaluation metrics (e.g., WER, BLEU, ROUGE) and best practices in A/B testing for ML model improvement.

  • Experience with cloud ML infrastructure (Azure, AWS, GCP) and deploying ML models via Docker, Kubernetes, or similar orchestration tools.

  • Proficiency in data engineering and processing pipelines, particularly for audio data (e.g., Librosa, Kaldi) and large-scale text corpora.

  • Advanced programming skills in Python, with experience in data handling (Pandas, NumPy) and proficiency in SQL and basic ETL workflows.

  • Strong understanding of neural network architectures, including Transformers, CNNs, RNNs, and attention mechanisms.

  • Familiarity with distributed training frameworks (Horovod, PyTorch Distributed) for large-scale model training and optimization.


Nice to have


  • Experience in multi-lingual NLP, particularly for South Asian, African, and emerging market languages.

  • Familiarity with speaker diarization tools, sentiment analysis, and multi-turn dialogue systems.

  • Understanding of model compression techniques, including pruning, quantization, and knowledge distillation.

  • Prior experience in scaling ML applications within a SaaS environment and working with RAG frameworks to enhance contextual language generation.


How to apply


If you’re passionate about machine learning and eager to shape the future of speech-to-text and language processing, we’d love to hear from you! Please fill out the Google Form at the link below:


Google Form (Click here)
