Overview
A knowledge base for medical device documentation with semantic search. Pipeline: manufacturer manuals for the specified medical devices are scraped from the web, split into chunks, indexed with metadata, and stored in a vector database. Core Functionality: at query time, the system retrieves the relevant documentation and specifications for a given medical device.
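To make the chunking and metadata step more concrete, here is a minimal sketch assuming fixed-size character windows with overlap. The `Chunk` class, the `chunk_manual` function, the window sizes, and the metadata keys are hypothetical illustrations, not the project's actual implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Chunk:
    """A piece of manual text plus the metadata used later for filtering."""
    text: str
    metadata: dict = field(default_factory=dict)


def chunk_manual(text, device, manufacturer, size=200, overlap=40):
    """Split text into overlapping character windows and tag each one.

    Assumption: character-based windows; the original project may have
    chunked by tokens, sentences, or document sections instead.
    """
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    step = size - overlap
    chunks = []
    for index, start in enumerate(range(0, len(text), step)):
        chunks.append(
            Chunk(
                text=text[start:start + size],
                metadata={
                    "device": device,              # hypothetical key
                    "manufacturer": manufacturer,  # hypothetical key
                    "chunk_index": index,
                    "start": start,
                },
            )
        )
        # Stop once the current window already reaches the end of the text.
        if start + size >= len(text):
            break
    return chunks
```

The overlap keeps sentences that straddle a window boundary retrievable from at least one chunk, at the cost of some duplicated storage.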
Achievements
- Built a comprehensive knowledge base covering manufacturer manuals and specifications for medical devices.
- Achieved high retrieval accuracy through optimized chunking strategies and metadata enrichment.
- Reduced query response time to sub-second levels through embedding model selection and vector store optimization (hybrid search that combines cosine similarity with metadata filtering).
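The combination of metadata usage and cosine similarity mentioned above can be sketched in plain Python. This is a toy illustration under stated assumptions: the record layout, the `search` function, and the `where` parameter are hypothetical, and a real vector store would apply the filter and the similarity ranking internally rather than in a Python loop.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query_vec, records, where, top_k=3):
    """Filter records by exact metadata match, then rank by cosine similarity.

    Each record is assumed to be a dict with "embedding" and "metadata" keys.
    """
    candidates = [
        r for r in records
        if all(r["metadata"].get(key) == value for key, value in where.items())
    ]
    scored = [(cosine_similarity(query_vec, r["embedding"]), r) for r in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Toy two-dimensional embeddings, for illustration only.
records = [
    {"id": "a", "embedding": [1.0, 0.0], "metadata": {"device": "pump"}},
    {"id": "b", "embedding": [0.0, 1.0], "metadata": {"device": "pump"}},
    {"id": "c", "embedding": [1.0, 0.0], "metadata": {"device": "monitor"}},
]
hits = search([1.0, 0.1], records, where={"device": "pump"}, top_k=2)
```

Filtering first shrinks the candidate set, so the similarity computation only runs over chunks that belong to the requested device.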
Responsibilities
- Designed and implemented web scraping pipelines to collect manufacturer manuals and device documentation.
- Developed chunking and indexing strategies with metadata tagging for accurate retrieval.
- Configured ChromaDB as vector store with metadata filtering for device-specific queries.
- Integrated all-MiniLM-L6-v2 embedding model for semantic search capabilities.
- Built RAG pipeline using LangChain with Llama 2 as the generation model.
- Set up distributed processing with Ray for scalable document ingestion.
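As an illustration of the generation side of a RAG pipeline like the one listed above, the sketch below assembles retrieved chunks into a prompt for the language model. It is framework-free on purpose: the `build_prompt` function, the prompt wording, the `source` metadata key, and the character budget are all assumptions, not the project's LangChain configuration.

```python
def build_prompt(question, chunks, max_context_chars=1500):
    """Assemble a grounded prompt from retrieved chunks.

    Each chunk is assumed to be a dict with "text" and "metadata" keys,
    ordered from most to least relevant. Chunks are added until the
    (hypothetical) character budget would be exceeded.
    """
    parts = []
    used = 0
    for number, chunk in enumerate(chunks, start=1):
        source = chunk["metadata"].get("source", "unknown")
        entry = f"[{number}] (source: {source})\n{chunk['text']}"
        if used + len(entry) > max_context_chars:
            break
        parts.append(entry)
        used += len(entry)
    context = "\n\n".join(parts) if parts else "(no context retrieved)"
    return (
        "Answer the question using only the context below. "
        "If the context is insufficient, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )
```

Numbering the context entries and carrying the source along lets the generated answer point back to the manual it came from.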
This project was delivered by
Mykhailo Z.
More Projects by Mykhailo Z.
Interrogation Transcription System for Law Enforcement
Voice AI Engineer
Automated real-time transcription of interviews to generate official protocols in a secure environment.
Core Model: Python, OpenAI Whisper, Pyannote, Docker.
Orchestration: Custom system for real-time processing (voice detection + chunking + transcription). Supports up to 10 simultaneous sessions.
Fine-tuning Pipeline: Created a pipeline for periodic model updates using client-provided datasets (edited transcripts). Focused on adapting to (local dialect) and low-quality audio.
Metrics: Used WER (Word Error Rate) and CER (Character Error Rate) to validate model performance.
Deployment: On-premise (air-gapped). All components are deployed locally to ensure maximum security and data privacy.
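For readers unfamiliar with the WER and CER metrics named above, both are an edit distance between a reference transcript and the model output, divided by the reference length. The sketch below uses the standard Levenshtein distance. The function names are illustrative, and the project's own evaluation may have applied text normalization (casing, punctuation) that is not shown here.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (insert, delete, substitute)."""
    prev = list(range(len(hyp) + 1))
    for i, ref_item in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_item in enumerate(hyp, start=1):
            cost = 0 if ref_item == hyp_item else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word Error Rate: word-level edit distance over reference word count."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character Error Rate: character-level edit distance over reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)
```

Because insertions count as errors, both rates can exceed 1.0 when the hypothesis is much longer than the reference.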
Financial Voice Agent for Call Center
Voice AI Engineer
Voice agent integration for a financial services company with a focus on mobile stability.
Focus: Integrated AI agents with telephony infrastructure. Solved architectural challenges regarding vendor integrations.
Performance: Focused on maintaining high communication quality over mobile networks.