Portfolio

SIGMOID AI

innodisAX

Design and development of a RAG-based AI search system combining generative LLMs with semantic search

Django, LangChain, LangGraph, RAG, Elasticsearch, OpenSearch, Docker, MySQL, Python

Overview
Integrates domain documents, logs, and structured/unstructured data to improve search quality, with accuracy as the primary goal

Responsibilities

General-purpose file indexing system
- Support for PDFs containing text, formulas, tables, and images
- VLM-based document parsing, with output structured as Markdown
Semantic search pipeline design
- Dense-embedding-based retrieval architecture
- Domain-tailored chunking strategy and document preprocessing pipeline
- Top-K candidate retrieval from a vector DB and search latency optimization
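The Top-K retrieval step can be illustrated with a minimal sketch. It assumes toy 3-dimensional embeddings and a brute-force cosine-similarity scan; the `top_k` helper and the chunk ids are hypothetical, and a production vector DB would typically use an approximate nearest-neighbor index rather than scanning every vector.

```python
import heapq
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, index, k=3):
    """Return the k (chunk_id, score) pairs most similar to query_vec.

    `index` maps a hypothetical chunk id to its precomputed embedding.
    """
    scored = ((cid, cosine(query_vec, vec)) for cid, vec in index.items())
    return heapq.nlargest(k, scored, key=lambda pair: pair[1])


# Toy embeddings; real ones would come from an embedding model.
index = {
    "chunk-a": [1.0, 0.0, 0.0],
    "chunk-b": [0.7, 0.7, 0.0],
    "chunk-c": [0.0, 0.0, 1.0],
}
results = top_k([1.0, 0.1, 0.0], index, k=2)
```

Here `results` holds the two best-matching chunks in descending score order, which is the candidate set a later reranking stage would consume.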
Agent-based RAG system
- Agent design with LangChain and LangGraph
- DB search → reranker → response generation pipeline implemented via function calling
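The search → rerank → generate flow can be sketched in framework-agnostic Python. This is an illustration under assumptions, not the project's implementation: it does not use the LangChain or LangGraph APIs, and `answer`, `toy_search`, `toy_rerank`, and `toy_generate` are hypothetical stand-ins for the real search tool, reranker model, and LLM call.

```python
from typing import Callable, List


def answer(
    question: str,
    search: Callable[[str, int], List[str]],
    rerank: Callable[[str, List[str]], List[str]],
    generate: Callable[[str, List[str]], str],
    fetch_k: int = 20,
    keep_k: int = 3,
) -> str:
    """Run search -> rerank -> generate, passing only the best passages on."""
    candidates = search(question, fetch_k)       # broad recall
    ranked = rerank(question, candidates)        # precision-oriented reorder
    return generate(question, ranked[:keep_k])   # response grounded in context


# Hypothetical stand-ins so the sketch runs without any external service.
DOCS = ["reset the router", "router firmware update steps", "office lunch menu"]


def toy_search(q, k):
    return DOCS[:k]


def toy_rerank(q, passages):
    words = set(q.lower().split())
    return sorted(passages, key=lambda p: -len(words & set(p.split())))


def toy_generate(q, passages):
    return f"Q: {q} | context: {'; '.join(passages)}"


reply = answer("router firmware update", toy_search, toy_rerank, toy_generate, keep_k=2)
```

Fetching more candidates than are finally kept (`fetch_k` larger than `keep_k`) lets a cheap first-stage search favor recall while the reranker decides which passages reach the generator.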
Multi-agent and persona-based consistency
- Task separation via a multi-agent architecture with role-specific personas
- Persona-based prompting to keep response tone and role consistent
Hallucination reduction strategy
- Added a reasoning (thinking) step to response generation
- Improved response reliability through context verification logic
Large-scale data processing
- Dataset built from document and query logs
- Batch embedding pipeline and automated re-indexing
Post-deployment monitoring and drift mitigation
- Continuous monitoring of search quality and response performance after deployment
- Detection of performance drift over time, mitigated via re-indexing and tuning
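One simple way to detect drift of this kind is to compare a recent window of a quality metric against an earlier baseline window. The sketch below assumes a chronological list of scores where higher is better; the `detect_drift` function, window sizes, and `max_drop` threshold are illustrative assumptions, not the monitoring rules used in the project.

```python
from statistics import mean


def detect_drift(scores, baseline_n=5, recent_n=5, max_drop=0.05):
    """Flag drift when the recent mean falls more than max_drop below baseline.

    `scores` is a chronological list of a quality metric (higher is better).
    """
    if len(scores) < baseline_n + recent_n:
        return False  # not enough history to compare
    baseline = mean(scores[:baseline_n])
    recent = mean(scores[-recent_n:])
    return (baseline - recent) > max_drop


# Made-up metric histories for illustration only.
stable = [0.82, 0.81, 0.83, 0.82, 0.80, 0.81, 0.82, 0.80, 0.83, 0.81]
degrading = [0.82, 0.81, 0.83, 0.82, 0.80, 0.76, 0.74, 0.73, 0.71, 0.70]
```

A positive result from `detect_drift(degrading)` would be the trigger for a re-indexing or tuning job, while `detect_drift(stable)` stays negative.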
Search engine migration (Elasticsearch → OpenSearch)
- Migrated from Elasticsearch 7.10.2, the last Apache-2.0-licensed release, to the latest OpenSearch 3.x
- Index/mapping migration and query/client compatibility work
Impact
- General-purpose file indexing without connecting to customer DBs; multilingual input and output
- Average of 3.8k users over the last 3 months

SIGMOID LLM Manager

innodisAX

Domain- and status-based LLM API routing and serving optimization system

FastAPI, Python, Prometheus, Grafana

Overview
A system that dynamically selects the optimal LLM based on request domain, traffic state, and model performance

Responsibilities

LLM routing logic design
- Per-domain model policies (e.g., search-tuned / summarization-tuned / cost-optimized)
- Failover and priority routing based on model status (response latency, error rate)
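Priority routing with failover can be sketched as walking a per-domain priority list and returning the first model whose recent latency and error rate are within limits. The `ModelStatus` fields, the thresholds, and the model names below are hypothetical; the actual policies and health signals of the system are not described in this portfolio.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class ModelStatus:
    name: str
    latency_ms: float   # recent average response latency
    error_rate: float   # fraction of failed calls, 0.0 to 1.0


def pick_model(
    domain: str,
    policy: Dict[str, List[str]],
    status: Dict[str, ModelStatus],
    max_latency_ms: float = 2000.0,
    max_error_rate: float = 0.05,
) -> Optional[str]:
    """Return the first healthy model in the domain's priority list."""
    for name in policy.get(domain, []):
        s = status.get(name)
        if s and s.latency_ms <= max_latency_ms and s.error_rate <= max_error_rate:
            return name
    return None  # every candidate is unhealthy; the caller decides what to do


policy = {"search": ["model-a", "model-b"], "summary": ["model-c", "model-a"]}
status = {
    "model-a": ModelStatus("model-a", latency_ms=3500.0, error_rate=0.01),
    "model-b": ModelStatus("model-b", latency_ms=400.0, error_rate=0.02),
    "model-c": ModelStatus("model-c", latency_ms=300.0, error_rate=0.20),
}
```

With these sample values, a `search` request skips the slow `model-a` and fails over to `model-b`, while a `summary` request finds no healthy candidate and returns `None`.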
Serving stability hardening
- Model health checks and status monitoring
- Automatic fallback model switching on failure
- API statistics and visualization with Prometheus and Grafana
Impact
- 100 concurrent requests handled within 0.1s
- ~30k API calls per month on average

Code2Chart

Suresofttech

An LLM that generates Mermaid charts from C code

Python 3.11, PyTorch 2.4.0, CUDA 12.1, Transformers, FlashAttention, FlashInfer, DeepSpeed, FSDP, NVIDIA H100, Docker, WandB, MLflow, DVC, vLLM, SGLang, LLaMA-Factory, Linux

Overview
A domain-specific sLLM that analyzes the function structure and logic of C code and converts it into Mermaid-syntax flowcharts

Responsibilities

LLM fine-tuning pipeline
- Based on the Llama 3.1 8B model
- Staged Continual Pretraining (CPT), Instruction Tuning, and Supervised Fine-tuning (SFT)
- Compared SFT methods (full fine-tuning (FFT), LoRA, QLoRA) and used Chain-of-Thought (CoT) data
- Quantization-based lightweight fine-tuning to cut GPU memory use and make training and inference more efficient
- Training environment built on CUDA, PyTorch, and Transformers
Distributed training setup
- Training and tuning on a single node with 8× H100 80GB
- Parameter/optimizer sharding via FSDP and DeepSpeed ZeRO to get past single-GPU memory limits
- Mixed precision, gradient checkpointing, and FlashAttention / FlashInfer kernel backends applied within a LLaMA-Factory training setup to balance throughput and memory use
Data and experiment management
- Data, model, and experiment tracking with WandB, MLflow, and DVC
- Reproducible training pipeline
Quantitative performance improvement
- Measured on an in-house benchmark
- Score improved from 33 (base model) to 67 (fine-tuned model)
Model serving and deployment
- LLM serving environment built on SGLang
- vLLM-based prefill/decode disaggregated (P/D disaggregation) serving
- Prefill and decode split into separate instances so that long-prompt prefill does not delay decode
- KV-cache transfer path configured; parallelism and batching policies tuned separately per instance
- Validated in real use via internal deployment and QA

sLLM Benchmark

Suresofttech

An LLM-as-a-Judge benchmark for quantitatively evaluating domain-specific sLLMs

Python 3.11, Transformers, PyTorch, CUDA, Linux

Overview
An LLM-as-a-Judge benchmark for quantitatively evaluating domain-specific sLLMs

Responsibilities

Evaluation framework design
- Chain-of-Thought-based evaluation templates
- 10 detailed criteria for assessing Mermaid flowchart quality
Automated evaluation system
- Large-scale evaluation automated via LLM-as-a-Judge
- Evaluation pipeline suited to iterative model improvement
Reliability validation
- Built 100 ground-truth samples
- Achieved 89.5% evaluation reliability

Distributed Human Activity Recognition System for Scalable Wi-Fi Sensing

INC Lab.

A distributed human activity recognition system for scalable Wi-Fi sensing

Python 3.11, CUDA, Linux, ViT, Transformers, PyTorch, Raspberry Pi 4B, Nexmon CSI

Overview
Addresses the weaknesses of existing Wi-Fi-based Human Activity Recognition (HAR) research

Responsibilities

- Designed a ViT (Vision Transformer)-based activity classification model that takes CSI spectrograms as input
- Reduced data usage to 1/N of prior approaches
- Applied distributed sensing with random shuffling of sensing data to degrade the usefulness of eavesdropped data
- Achieved 90% activity recognition accuracy using only 1/N of the data
- Held eavesdropper model accuracy below 30% on average, mitigating eavesdropping

Research Paper PDF

GuardianWatch

Graduation Project

A daycare safety monitoring system using multi-object tracking

Python 3.11, Flask, WatchDog, MySQL, CUDA, Linux, Java, Android, Bluetooth

Overview
A vision-AI system for daycare CCTV analysis and safety monitoring. It analyzes daycare CCTV footage to extract children's location, behavior, and activity levels, and delivers alerts and analytics to a parent app

Responsibilities

Multi-object detection and tracking system
- Person detection with YOLOX and multi-object tracking with ByteTrack
- Stable ID retention even in crowded CCTV scenes
Action recognition and activity analysis
- Action classification and activity-state analysis with a SlowR50 model
- Movement paths, dwell heatmaps, and calorie estimation derived from object positions
Re-ID mitigation algorithm
- Analysis of object ID switching in CCTV footage
- Multimodal correction algorithm that incorporates Bluetooth RSSI-based location data
- Improved ID-retention accuracy and long-term tracking stability
Privacy protection and service integration
- De-identification of footage via Bird's-Eye-View (BEV) transformation

🏆 [Silver Prize] K-Digital Challenge: NET Challenge Camp Season 10 (hosted by the Ministry of Science and ICT)

GitHub, YouTube

Injae Ryou
