Key Expertise
Experience
7+ years
Timezone
CET (GMT +1)
Skills
AI / ML
Languages
Databases
Infrastructure
Frameworks
Integrations & Protocols
Overview
Knowledge base system for medical device documentation with semantic search capabilities. Pipeline: Web scraping of manufacturer manuals for specified medical devices → chunking → indexing with metadata → storage in vector database. Core Functionality: On query, retrieves relevant documentation and specifications for a given medical device.
Achievements
• Built a comprehensive knowledge base covering manufacturer manuals and specifications for medical devices. • Achieved high retrieval accuracy through optimized chunking strategies and metadata enrichment. • Reduced query response time to sub-second levels through embedding model selection and vector store optimization (as mixed search via cosine similarity and metadata usage).
Responsibilities
- Designed and implemented web scraping pipelines to collect manufacturer manuals and device documentation.
- Developed chunking and indexing strategies with metadata tagging for accurate retrieval.
- Configured ChromaDB as vector store with metadata filtering for device-specific queries.
- Integrated all-MiniLM-L6-v2 embedding model for semantic search capabilities.
- Built RAG pipeline using LangChain with Llama 2 as the generation model.
- Set up distributed processing with Ray.io for scalable document ingestion.
Technologies Used
Key Expertise
Experience
7+ years
Timezone
CET (GMT +1)
Skills
AI / ML
Languages
Databases
Infrastructure
Frameworks
Integrations & Protocols
This project was delivered by
Mykhailo Z.
More Projects by Mykhailo Z.
Interrogation Transcription System for Law Enforcement
Voice AI Engineer
Automated real-time transcription of interviews to generate official protocols in a secure environment. On-premise (air-gapped) deployment ensuring maximum security and data privacy. Core Model: Python, OpenAI Whisper, Pyannote, Docker, on-premise deployment Orchestration: Custom system for real-time processing (voice detection + chunking + transcription). Supports up to 10 simultaneous sessions. Fine-tuning Pipeline: Created a pipeline for periodic model updates using client-provided datasets (edited transcripts). Focused on adapting to (local dialect) and low-quality audio. Metrics: Used WER (Word Error Rate) and CER (Character Error Rate) to validate model performance. Deployment: On-premise (Air-gapped). All components are deployed locally to ensure maximum security and data privacy.
Financial Voice Agent for Call Center
Voice AI Engineer
Voice agent integration for a financial services company with a focus on mobile stability. Focus: Integrated AI agents with telephony infrastructure. Solved architectural challenges regarding vendor integrations. Performance: Focused on maintaining high communication quality over mobile networks.
Ready to Build Your AI Team?
Get matched with the right AI experts for your project. Book a free discovery call to discuss your requirements.
We respond within 24 hours.