Yurii K.
LLM Engineer / GenAI Engineer
Enterprise-grade LLM solutions, RAG pipelines, multi-agent orchestration, and AI-driven automation. Production experience across fintech, healthcare, DevOps, and creative industries.
Key Expertise
Certifications
Google Cloud ML Engineer
2023
AWS ML Specialty
2022
Experience
7+ years
Timezone
CET (GMT +1)
Skills
AI / ML
Languages
Databases
Infrastructure
Frameworks
Integrations & Protocols
1. AI-Powered First-Line Support Agent (RAG)
Project overview:
The project involved developing an intelligent automation system to handle Tier-1 customer inquiries for a microsite building platform. By leveraging advanced Retrieval-Augmented Generation (RAG), the system analyzes a knowledge base of over 280 articles to provide instant and accurate responses. The solution was designed to solve scalability challenges by automating repetitive, low-complexity questions that previously overwhelmed the human support team.
Responsibilities:
- Designed a sophisticated state-based graph architecture to route queries, grade document relevance, and manage multi-step support workflows.
- Implemented a hybrid semantic search layer featuring query expansion and cross-encoder reranking to ensure precise information retrieval.
- Developed an automated "watchdog" infrastructure for incremental indexing, allowing the system to self-update whenever documentation changes.
- Integrated built-in validation mechanisms and source attribution to ensure all AI responses are grounded in official documentation.
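The incremental indexing described in the list above can be illustrated with a content-hash comparison. This is a minimal sketch and not the project's actual implementation: the function names `content_hash` and `plan_reindex`, and the idea of storing one SHA-256 hash per article, are assumptions made for illustration.

```python
import hashlib


def content_hash(text: str) -> str:
    """Stable fingerprint of an article body."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def plan_reindex(
    indexed: dict[str, str], articles: dict[str, str]
) -> tuple[list[str], list[str]]:
    """Compare stored hashes with the current article text.

    indexed  -- article id -> hash recorded at the last indexing run (hypothetical store)
    articles -- article id -> current article text
    Returns (ids to embed again, ids to delete from the index).
    """
    # New or changed articles have no stored hash or a different one.
    to_upsert = sorted(
        doc_id
        for doc_id, text in articles.items()
        if indexed.get(doc_id) != content_hash(text)
    )
    # Articles that disappeared from the knowledge base must leave the index too.
    to_delete = sorted(doc_id for doc_id in indexed if doc_id not in articles)
    return to_upsert, to_delete
```

Only the ids returned in the first list would need to be embedded again, so an edit to one article does not trigger a rebuild of the whole index.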
Achievements:
The system reduced routine ticket volume by 60–80% and delivered a positive ROI within 90 days of deployment. It enabled 24/7 support availability with sub-second response times, improving customer satisfaction while keeping answers accurate through automated hallucination detection.
Technology stack:
2. Multi-Agent Personal Assistant & Time Organizer
Project overview:
The project involved building a sophisticated personal productivity automation system based on a Multi-Agent Architecture to streamline tasks and professional collaboration. The system integrates with Google Workspace, Jira, Zoom, and Confluence to provide unified task management, automated scheduling, and intelligent email triage. By using a Supervisor Agent to orchestrate specialized sub-agents, the platform autonomously handles complex workflows like research and meeting coordination while maintaining human-in-the-loop security.
Responsibilities:
- Architected a multi-agent graph system using LangGraph, featuring a central Supervisor Agent that intelligently delegates requests to specialized agents for tasks, calendar management, and research.
- Developed a bidirectional synchronization engine between Google Tasks and Jira to provide a unified priority view and eliminate manual task duplication across personal and professional platforms.
- Implemented an advanced "LongMemory" system utilizing recursive summarization and context pruning to maintain high-fidelity context across extended sessions without exceeding token limits.
- Designed a secure "Human-in-the-Loop" validation layer for critical email communications, ensuring that high-priority responses require explicit approval while routine drafts are handled autonomously.
- Integrated comprehensive analytics and ROI tracking to monitor real-time message volume, per-agent token consumption, and "estimated time saved" metrics.
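The recursive summarization and context pruning mentioned in the list above can be sketched as follows. This is an illustration under stated assumptions, not the project's "LongMemory" code: `count_tokens` approximates tokens by whitespace-separated words, and `summarize` is a hypothetical callable that would wrap an LLM call in practice.

```python
from typing import Callable


def count_tokens(text: str) -> int:
    # Rough stand-in for a real tokenizer: one token per whitespace-separated word.
    return len(text.split())


def prune_context(
    summary: str,
    turns: list[str],
    budget: int,
    summarize: Callable[[str, list[str]], str],
) -> tuple[str, list[str]]:
    """Keep the newest turns that fit in `budget` tokens; fold the rest into the summary."""
    kept: list[str] = []
    used = 0
    # Walk from newest to oldest so the most recent turns survive verbatim.
    for turn in reversed(turns):
        cost = count_tokens(turn)
        if used + cost > budget:
            break
        kept.append(turn)
        used += cost
    kept.reverse()
    evicted = turns[: len(turns) - len(kept)]
    if evicted:
        # Recursive step: the new summary is built from the old summary plus evicted turns.
        summary = summarize(summary, evicted)
    return summary, kept
```

Because each call feeds the previous summary back into `summarize`, older context is compressed repeatedly instead of being dropped outright.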
Achievements:
A custom multi-layer caching infrastructure reduced operational costs and LLM token consumption by 40–60%. The system reclaimed an estimated 10–15 hours per week for users by automating routine coordination and research tasks. It also achieved sub-second response times for cached queries, improving the user experience and decision-making speed.
Technology stack:
3. AI-Powered Creative Automation System
Project overview:
Creative teams lose days on manual marketing asset production, with inconsistent brand use, multi-platform resizing, and weak product-identity control. The system automates the pipeline from brief to delivery: it uses Computer Vision for design analysis, automated prompt engineering, fine-tuned generative models, smart one-to-many resizing (Saliency Maps), and ControlNet for product fidelity, with automated brand and compliance checks—cutting production from days to minutes while keeping full brand consistency.
Responsibilities:
- Developed a Computer Vision layer for style recognition, pattern extraction, and compositional analysis of existing brand assets.
- Designed a proprietary fine-tuning pipeline (full-parameter tuning) for brand-specific generative models and implemented structural conditioning (ControlNet/IP-Adapters).
- Built a smart resizing module using Saliency Maps to preserve focal points and ensure platform-specific safe zone compliance.
- Created an automated multi-level validation system for prompt optimization and brand safety (negative prompting, logo integrity, and legal compliance).
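The saliency-based resizing mentioned in the list above can be illustrated with a one-dimensional crop search. This is a simplified sketch, assuming the saliency map is already available as a grid of non-negative scores and that only a horizontal crop is needed; the function name `best_crop_x` is hypothetical, and safe-zone handling is left out.

```python
def best_crop_x(saliency: list[list[float]], crop_width: int) -> int:
    """Return the left edge of the full-height window of `crop_width` columns
    that contains the most total saliency."""
    width = len(saliency[0])
    if not 0 < crop_width <= width:
        raise ValueError("crop_width must be between 1 and the image width")
    # Collapse the map to per-column saliency totals.
    column_sums = [sum(row[x] for row in saliency) for x in range(width)]
    # Slide the window across the image, updating the running total in O(1) per step.
    window = sum(column_sums[:crop_width])
    best_total, best_x = window, 0
    for x in range(1, width - crop_width + 1):
        window += column_sums[x + crop_width - 1] - column_sums[x - 1]
        if window > best_total:
            best_total, best_x = window, x
    return best_x
```

The same search can be run over rows to pick a vertical crop, which is how one source image could be fitted to several aspect ratios while keeping the focal point in frame.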
Achievements:
The system successfully automated multi-platform asset generation (Instagram, YouTube, Display Ads) while maintaining 100% brand consistency. It implemented automated Delta E color accuracy checks and structural conditioning to ensure pixel-perfect product fidelity. The solution scaled asset production significantly, allowing for high-volume batch processing without increasing manual labor costs.
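A Delta E check of the kind mentioned above can be sketched with the CIE76 formula, which is the Euclidean distance between two colors in CIELAB space. The profile does not say which Delta E variant was used, so CIE76 is chosen here only because it is the simplest; the tolerance of 2.0 and the function names are illustrative assumptions, and inputs are assumed to be already converted to (L*, a*, b*).

```python
import math

Lab = tuple[float, float, float]


def delta_e_cie76(lab1: Lab, lab2: Lab) -> float:
    """CIE76 color difference: Euclidean distance between two (L*, a*, b*) colors."""
    return math.dist(lab1, lab2)


def color_within_tolerance(measured: Lab, reference: Lab, tolerance: float = 2.0) -> bool:
    """True when the measured color is within `tolerance` of the brand reference.

    The default tolerance is an assumption for illustration, not a project setting.
    """
    return delta_e_cie76(measured, reference) <= tolerance
```

For example, `delta_e_cie76((50, 0, 0), (50, 3, 4))` evaluates to 5.0, which would fail the default tolerance.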
Technology stack:
4. AI-Powered Natural Language Interface for DevOps Monitoring
Project overview:
Management needs monitoring data to make decisions but has no SQL skills, so every request goes through DevOps: managers wait 1–2 hours, and engineers spend 20–30% of their time on routine data pulls. An AI assistant in Slack lets users ask questions in plain English, turns them into read-only database queries with full audit trails, and escalates when unsure. This cuts DevOps time on data requests by 60–80% and gives near-instant, self-service access to system performance, costs, and errors.
Responsibilities:
- Designed a multi-agent orchestration logic (Router, SQL, CSV, and Security agents) to handle complex user intents.
- Developed a security validation engine to prevent unauthorized database operations and mask sensitive information.
- Implemented context-aware conversation management, allowing users to ask follow-up questions naturally.
- Built a smart caching and retrieval system to reuse query results, optimizing both performance and API costs.
- Integrated the solution with Slack API and enterprise production databases, ensuring seamless adoption into existing workflows.
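The read-only enforcement handled by the security validation engine in the list above can be illustrated with a conservative statement check. This is a sketch under assumptions, not the project's engine: the keyword list and the function name `is_read_only` are hypothetical, and a check like this would normally sit in front of a database role that is itself read-only.

```python
import re

# Keywords that indicate a write or administrative operation (illustrative list).
FORBIDDEN = re.compile(
    r"\b(insert|update|delete|drop|alter|truncate|create|grant|revoke|merge|"
    r"call|exec|execute|copy)\b",
    re.IGNORECASE,
)


def is_read_only(sql: str) -> bool:
    """Accept a single SELECT/WITH statement that contains no write keywords."""
    statement = sql.strip().rstrip(";").strip()
    if not statement or ";" in statement:
        return False  # empty input or more than one statement
    if not re.match(r"(?i)(select|with)\b", statement):
        return False  # must start as a query
    # Reject write keywords anywhere, including inside CTEs.
    return FORBIDDEN.search(statement) is None
```

The check is deliberately strict: it can reject a harmless query whose string literal contains a word such as "update", which is an acceptable trade-off when the fallback is escalation to an engineer.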
Achievements:
The system was successfully deployed into production, reducing the DevOps team's workload on routine data requests by 60–80% and cutting response times from hours to under 30 seconds. By achieving an 80%+ self-service rate for management, the solution ensured 24/7 data access while maintaining 100% security compliance through automated audit trails and read-only enforcement. Additionally, the implementation of intelligent caching and multi-agent routing reduced AI operational costs by 40–60%.
Technology stack:
5. AI-Powered Credit Card Competitive Intelligence System
Project overview:
Financial institutions were manually monitoring 500+ pages across 15+ comparison sites every day, and traditional scrapers kept breaking whenever site layouts changed. The solution is a self-healing AI platform that uses semantic HTML analysis (81.82% accuracy), adaptive parsing (98% success rate), and automated analytics to collect and analyze competitive data, adapt to site changes in hours, and turn raw data into trend analysis and strategic recommendations.
Responsibilities:
- Designed a self-healing AI architecture with five-layer error prevention (proxy rotation, schema drift detection, and LLM-based fallback).
- Integrated OpenAI API for semantic HTML understanding and multi-dimensional trend analysis (seasonality, anomaly detection).
- Developed a modular data pipeline for extracting 15+ financial attributes with intelligent normalization and deduplication.
- Built an automated insight generation system that transforms raw market data into natural language strategic recommendations.
- Implemented CI/CD for scraping configurations with AI-generated suggestions for rapid adaptation to site changes.
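The schema drift detection and LLM-based fallback named in the list above can be sketched as a coverage check on each extracted record. This is a minimal illustration under assumptions: `primary` and `fallback` are hypothetical callables standing in for the rule-based parser and the LLM-based extractor, and the 0.8 coverage threshold is an example value, not a project setting.

```python
from typing import Callable

Extractor = Callable[[str], dict]


def missing_fields(record: dict, expected: set[str]) -> set[str]:
    """Expected attributes that are absent or empty in the extracted record."""
    return {name for name in expected if record.get(name) in (None, "")}


def extract_with_fallback(
    html: str,
    primary: Extractor,
    fallback: Extractor,
    expected: set[str],
    min_coverage: float = 0.8,
) -> tuple[dict, str]:
    """Run the primary parser; switch to the fallback when too many fields are missing."""
    record = primary(html)
    present = len(expected) - len(missing_fields(record, expected))
    if present / len(expected) >= min_coverage:
        return record, "primary"
    # Low coverage suggests the page layout changed (schema drift).
    return fallback(html), "fallback"
```

Returning the name of the path taken alongside the record makes it straightforward to log how often the fallback fires, which is one signal that a scraping configuration needs updating.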
Achievements:
The system achieved 81.82% accuracy in semantic HTML analysis (compared to 63.58% using traditional methods) and maintained a 98% extraction success rate across 500+ daily pages. By implementing an AI-driven self-healing engine, the Mean Time to Repair (MTTR) was slashed from days to minutes, while 40–60 hours of manual data collection were automated weekly per analyst, resulting in a 60–70% reduction in labor costs. Furthermore, the platform reached over 95% schema coverage and integrated automated quality scoring for every extracted financial attribute, ensuring high-fidelity data for strategic decision-making.
Technology stack: