TY - GEN
T1 - TelcoGPT
T2 - 2025 IEEE Conference on Computer Communications Workshops, INFOCOM WKSHPS 2025
AU - Khan, Muhammad Zakir
AU - Ge, Yao
AU - Ullah, Ubaid
AU - Ansari, Shuja
AU - Imran, Muhammad
AU - Abbasi, Qammer H.
N1 - Publisher Copyright: © 2025 IEEE.
PY - 2025
Y1 - 2025
N2 - This paper presents TelcoGPT, a specialised question-answering (Q&A) and code retrieval system for telecommunications that combines retrieval-augmented generation (RAG) with domain-specific optimizations. TelcoGPT introduces three key enhancements: (1) a HybridEmbedding methodology integrating text-embedding models with telecom-specific filtering mechanisms; (2) an advanced document processing pipeline with adaptive chunking and technical term density scoring; and (3) a dual-path query engine optimized for both question-answering and code retrieval tasks. Evaluation on the RedPajama-Data-1T arXiv subset demonstrates that the hybrid embedding approach achieves a mean reciprocal rank (MRR) of 0.89 and a hit rate (HR) of 0.94 with the optimal configuration (threshold=0.8, chunk size=12K, k=15), outperforming single-embedding approaches by 5-7%. The hybrid RAG implementation increases MRR by 8.5% (0.82 to 0.89) and HR by 10.6% (0.85 to 0.94). TelcoGPT achieves 95% accuracy in domain-specific Q&A tasks versus 87% for base models, while maintaining higher technical term density scores (0.90 vs 0.81). For code retrieval, our system demonstrates a 93% execution success rate with comprehensive error handling, surpassing baseline approaches by 6-8%. Comparative analysis with GPT-3.5, GPT-4, and LLAMA-2/3 shows significant improvements in context relevance (0.92 vs 0.84), information accuracy (0.95 vs 0.89), faithfulness (0.84 to 0.92), and relevancy (0.83 to 0.93), demonstrating the effectiveness of our architecture for telecommunications applications.
AB - This paper presents TelcoGPT, a specialised question-answering (Q&A) and code retrieval system for telecommunications that combines retrieval-augmented generation (RAG) with domain-specific optimizations. TelcoGPT introduces three key enhancements: (1) a HybridEmbedding methodology integrating text-embedding models with telecom-specific filtering mechanisms; (2) an advanced document processing pipeline with adaptive chunking and technical term density scoring; and (3) a dual-path query engine optimized for both question-answering and code retrieval tasks. Evaluation on the RedPajama-Data-1T arXiv subset demonstrates that the hybrid embedding approach achieves a mean reciprocal rank (MRR) of 0.89 and a hit rate (HR) of 0.94 with the optimal configuration (threshold=0.8, chunk size=12K, k=15), outperforming single-embedding approaches by 5-7%. The hybrid RAG implementation increases MRR by 8.5% (0.82 to 0.89) and HR by 10.6% (0.85 to 0.94). TelcoGPT achieves 95% accuracy in domain-specific Q&A tasks versus 87% for base models, while maintaining higher technical term density scores (0.90 vs 0.81). For code retrieval, our system demonstrates a 93% execution success rate with comprehensive error handling, surpassing baseline approaches by 6-8%. Comparative analysis with GPT-3.5, GPT-4, and LLAMA-2/3 shows significant improvements in context relevance (0.92 vs 0.84), information accuracy (0.95 vs 0.89), faithfulness (0.84 to 0.92), and relevancy (0.83 to 0.93), demonstrating the effectiveness of our architecture for telecommunications applications.
KW - Retrieval-augmented generation
KW - code generation
KW - domain adaptation
KW - hybrid embeddings
KW - telecommunications
UR - https://www.scopus.com/pages/publications/105017953949
U2 - 10.1109/INFOCOMWKSHPS65812.2025.11152931
DO - 10.1109/INFOCOMWKSHPS65812.2025.11152931
M3 - Conference contribution
AN - SCOPUS:105017953949
T3 - IEEE Conference on Computer Communications Workshops, INFOCOM WKSHPS 2025
BT - IEEE Conference on Computer Communications Workshops, INFOCOM WKSHPS 2025
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 19 May 2025
ER -