Publications

Selected Publications
  1. TourMLLM: A Retrieval-Augmented Multimodal Large Language Model for Multitask Learning in the Tourism Domain, H. Yamanishi, L. Xiao* (corresponding author), and T. Yamasaki, ICMR, pp. 1654–1663, 2025, Best paper award!
    Fig. 1. Overview of TourMLLM: retrieval-augmented pipeline for tourism tasks.
  2. Multi-level Knowledge Distillation for Fine-grained Fashion Image Retrieval, L. Xiao and T. Yamasaki, Knowledge-Based Systems, vol. 310, p. 112955, 2025.
    Fig. 1. Details of the proposed MKD.
International Conferences (Peer-reviewed)
  1. Incorporating Semantic Visual Content into Click-Through Rate Prediction for Video Advertisements, Y. Tanabe, S. Masuda, G. Ryu, N. Tanji, H. Seshime, L. Xiao, and T. Yamasaki, The 17th Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC 2025), accepted, 2025.
  2. Combining Non-Numerical Text and Numerical Sequences in LLM-based Survival Prediction, Z. Zhou, G. Qian, X. Jiang, G. Wang, R. Lu, L. Xiao, and S. Tang, The 22nd Pacific Rim International Conference Series on Artificial Intelligence (PRICAI 2025), accepted, 2025.
  3. ActRecognition-GPT: Utilizing Multimodal Large Language Models for Spatiotemporal Action Recognition in Nursery Videos, K. Watanabe, S. Masuda, L. Xiao, and T. Yamasaki, FM&LLM&GM 2025 (FG 2025 Workshop), pp. 1–10, 2025.
  4. TourMLLM: A Retrieval-Augmented Multimodal Large Language Model for Multitask Learning in the Tourism Domain, H. Yamanishi, L. Xiao* (corresponding author), and T. Yamasaki, ICMR, pp. 1654–1663, 2025, Best paper award!
  5. Explainable AI for Image Aesthetic Evaluation Using Vision-Language Models, S. Viriyavisuthisakul, S. Yoshida, K. Shiohara, L. Xiao, and T. Yamasaki, AIxMM, pp. 62–65, 2025.
  6. LITA: LMM-guided Image-Text Alignment for Art Assessment, T. Sunada, K. Shiohara, L. Xiao, and T. Yamasaki, MMM 2025, pp. 268–281, 2025.
  7. Language-Guided Self-Supervised Video Summarization Using Text Semantic Matching Considering the Diversity of the Video, T. Sugihara, S. Masuda, L. Xiao*, and T. Yamasaki, ACM Multimedia Asia 2024, pp. 1–1, 2024. [Code]
  8. LLaVA-Tour: A Large-Scale Multimodal Model Specializing in Japanese Tourist Spot Prediction and Review Generation, H. Yamanishi, L. Xiao*, and T. Yamasaki, VCIP 2024, pp. 1–5, 2024. [Best Paper Candidate] [Code]
  9. A Multimodal Dataset and Benchmark for Tourism Review Generation, H. Yamanishi, L. Xiao*, and T. Yamasaki, ACM RecSys Workshop on Recommenders in Tourism (RecTour 2024), 2024.
  10. SCOMatch: Alleviating Overtrusting in Open-set Semi-supervised Learning, Z. R. Wang, L. Y. Xiang, L. Huang, J. F. Mao, L. Xiao, and T. Yamasaki, ECCV 2024, pp. 217–233, 2024.
  11. Adversarially Robust Continual Learning with Anti-forgetting Loss, K. Mukai, S. Kumano, N. Michel, L. Xiao, and T. Yamasaki, ICIP 2024, pp. 1085–1091, 2024.
  12. E-ReaRev: Adaptive Reasoning for Question Answering over Incomplete Knowledge Graphs by Edge and Meaning Extensions, X.T. Ye, L. Xiao, C. Zhang, and T. Yamasaki, NLDB 2024, pp. 85–95, 2024.
  13. Rethinking Momentum Knowledge Distillation in Online Continual Learning, N. Michel, M. Wang, L. Xiao, and T. Yamasaki, ICML 2024, pp. 35607–35622, 2024. [Code]
  14. Boosting Fine-grained Fashion Retrieval with Relational Knowledge Distillation, L. Xiao and T. Yamasaki, CVPR 2024 Workshop (CVFAD), pp. 8229–8234, 2024. [Code]
  15. Improving Plasticity in Online Continual Learning via Collaborative Learning, M. Wang, N. Michel, L. Xiao, and T. Yamasaki, CVPR 2024, pp. 23460–23469, 2024. [Code]
  16. HetSpot: Analyzing Tourist Spot Popularity with Heterogeneous Graph Neural Network, H. Yamanishi, L. Xiao*, and T. Yamasaki, IVSP 2024, pp. 111–120, 2024.
  17. Toward a More Robust Fine-grained Fashion Retrieval, L. Xiao, X. F. Zhang, and T. Yamasaki, MIPR 2023, pp. 1–4, 2023. [Code]
  18. Learning Fashion Compatibility with Color Distortion Prediction, L. Xiao, X. F. Zhang, and T. Yamasaki, MIPR 2023, pp. 81–84, 2023.
  19. Bridging the Capacity Gap for Online Knowledge Distillation, M. Wang, H. Yu, L. Xiao, and T. Yamasaki, MIPR 2023, pp. 1–4, 2023. [Code]
  20. SAT: Self-adaptive Training for Fashion Compatibility Prediction, L. Xiao and T. Yamasaki, ICIP 2022, pp. 2431–2435, 2022.
  21. Surface Defect Detection Using Hierarchical Features, L. Xiao, T. Huang, B. Wu, Y. Hu, and J. Zhou, CASE 2019, pp. 1592–1596, 2019.
  22. A Remote Health Condition Monitoring System Based on Compressed Sensing, J. Liu, Y. Hu, Y. Lu, Y. Wang, L. Xiao, and K. Zheng, MSCE 2017, pp. 262–266, 2017.
International Journals (Peer-reviewed)
  1. GeoDCL: Weak Geometrical Distortion based Contrastive Learning for Fine-grained Fashion Image Retrieval, L. Xiao and T. Yamasaki, IEEE Transactions on Artificial Intelligence, vol. 1, pp. 1–13, 2025.
  2. Multi-level Knowledge Distillation for Fine-grained Fashion Image Retrieval, L. Xiao and T. Yamasaki, Knowledge-Based Systems, vol. 310, p. 112955, 2025.
  3. LiFSO-Net: A Lightweight Feature Screening Optimization Network for Complex-scale Flat Metal Defect Detection, H. Zhong, L. Xiao, H. Wang, X. Zhang, C. Wan, and B. Wu, Knowledge-Based Systems, vol. 304, p. 112520, 2024.
  4. Attribute-Guided Multi-Level Attention Network for Fine-Grained Fashion Retrieval, L. Xiao and T. Yamasaki, IEEE Access, vol. 12, pp. 48068–48080, 2024. [Code]
  5. STFE-Net: A Multi-stage Approach to Enhance Statistical Texture Feature for Defect Detection on Metal Surfaces, H. Zhong, D. X. Fu, L. Xiao, F. Zhao, J. Liu, B. Wu, and Y. M. Hu, Advanced Engineering Informatics, vol. 61, p. 102437, 2024.
  6. Missing Small Fastener Detection Using Deep Learning, L. Xiao, B. Wu, and Y. Hu, IEEE Transactions on Instrumentation and Measurement, vol. 70, pp. 1–9, 2020.
  7. OSED: Object-specific Edge Detection, L. Xiao, B. Wu, and Y. Hu, Journal of Visual Communication and Image Representation, vol. 72, p. 102918, 2020.
  8. Detection of Powder Bed Defects in Selective Laser Sintering Using Convolutional Neural Network, L. Xiao, M. Lu, and H. Huang, International Journal of Advanced Manufacturing Technology, vol. 107, pp. 2485–2496, 2020.
  9. A Hierarchical Features-based Model for Freight Train Defect Inspection, L. Xiao, B. Wu, Y. Hu, and J. Liu, IEEE Sensors Journal, vol. 20(5), pp. 2671–2678, 2019.
  10. Surface Defect Detection Using Image Pyramid, L. Xiao, B. Wu, and Y. Hu, IEEE Sensors Journal, vol. 20(13), pp. 7181–7188, 2020.
arXiv Papers
  1. LLM-Advisor: An LLM Benchmark for Cost-efficient Path Planning across Multiple Terrains, L. Xiao and T. Yamasaki, arXiv:2503.01236, 2025.
  2. Language-Guided Self-Supervised Video Summarization Using Text Semantic Matching Considering the Diversity of the Video, T. Sugihara, S. Masuda, L. Xiao, and T. Yamasaki, arXiv:2405.08890, 2024.
  3. Rethinking Momentum Knowledge Distillation in Online Continual Learning, N. Michel, M. Wang, L. Xiao, and T. Yamasaki, arXiv:2309.02870, 2023.
  4. Online Open-set Semi-supervised Object Detection via Semi-supervised Outlier Filtering, Z. Wang, L. Xiao, L. Xiang, Z. Weng, and T. Yamasaki, arXiv:2305.13802, 2023.
  5. MetaMixer: A Regularization Strategy for Online Knowledge Distillation, M. Wang, L. Xiao, and T. Yamasaki, arXiv:2303.07951, 2023.
  6. Semi-supervised Fashion Compatibility Prediction by Color Distortion Prediction, L. Xiao and T. Yamasaki, arXiv:2212.14680, 2022.
  7. Attribute-Guided Multi-Level Attention Network for Fine-Grained Fashion Retrieval, L. Xiao and T. Yamasaki, arXiv:2301.13014, 2022.
Domestic Conferences
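  1. Enhancing the Spatial Awareness of Large Language Models in Path Planning, Ling Xiao and Toshihiko Yamasaki, The 30th Intelligent Mechatronics Workshop 2025 (iMec), 2025.
  2. Annotator-Personalizable Video Summarization via Few-shot Inference, T. Sugihara, S. Masuda, L. Xiao, and T. Yamasaki, MIRU 2025, IS3-102, 2025.
  3. Action Recognition in Nursery Videos Using Prompts That Integrate Spatiotemporal Information, K. Watanabe, S. Masuda, L. Xiao, and T. Yamasaki, MIRU 2025, IS2-081, 2025.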
  4. LLM-Advisor: Leveraging LLMs as Advisors for Cost-efficient Path Planning Across Diverse Terrains, Ling Xiao and Toshihiko Yamasaki, MIRU 2025, IS2-185, 2025.
  5. TourMLLM: A Retrieval-Augmented Large Multimodal Model for Tourism, H. Yamanishi, L. Xiao, and T. Yamasaki, MIRU 2025, IS2-119, 2025.
  6. Effectiveness Analysis of Video Advertisements Using Visual Evaluation by Foundation Models, Y. Tanabe, S. Masuda, G. Ryu, N. Tanji, H. Seshime, L. Xiao, and T. Yamasaki, MIRU 2025, OS2C-06, 2025. [Oral]
  7. Content-Aware Layout Generation with Large Language Models, Chen Fu, Naoto Tanji, Gakumatsu Ryu, Hiroyuki Seshime, Shengzhou Yi, Ling Xiao, and Toshihiko Yamasaki, MIRU 2025, IS1-102, 2025.
  8. A Tourism-Specialized Large Multimodal Model Based on Task-Adaptive Retrieval-Augmented Learning, H. Yamanishi, L. Xiao, and T. Yamasaki, IEICE Technical Report, Image Engineering (IE), IE2024-61.
  9. Explainable Image Aesthetic Assessment Leveraging Vision-Language Models, S. Viriyavisuthisakul, S. Yoshida, K. Shiohara, L. Xiao, and T. Yamasaki, IEICE Technical Report, Image Engineering (IE), IE2024-66.
  10. Momentum Knowledge Distillation for Enhanced Online Continual Learning, N. Michel, M. Wang, L. Xiao, and T. Yamasaki, IEICE Technical Report, Image Engineering (IE), IE2024-57.
  11. Llava-Planner: Enhancing Spatial Awareness of LLaVA for Cost-Effective Path Planning, L. Xiao, H. Yamanishi, and T. Yamasaki, IEICE Technical Report, Image Engineering (IE), IE2024-44.
  12. LLM-Advisor: A LLM Benchmark for Cost-effective Path Planning, L. Xiao and T. Yamasaki, PCSJ/IMPS 2024, P-2-05, 2024.
  13. Building a Multimodal Tourism Review Generation Dataset and a Large-Scale Review Generation Model, H. Yamanishi, L. Xiao, and T. Yamasaki, PCSJ/IMPS 2024, P-4-18, 2024.
  14. Boosting Fine-grained Fashion Retrieval with Relational Knowledge Distillation, L. Xiao and T. Yamasaki, IEICE Technical Report, Image Engineering (IE), vol. 124, no. 60, IE2024-17, pp. 90–94, 2024. [Code]
  15. Language-Guided Self-Supervised Video Summarization Using Text Semantic Matching, T. Sugihara, S. Masuda, L. Xiao, and T. Yamasaki, MIRU 2024. [Oral]
  16. Evaluation and Improvement of Advertisement Images Using Large Multimodal Models, T. Sunada, K. Shiohara, G. Ryu, N. Tanji, H. Seshime, L. Xiao, and T. Yamasaki, MIRU 2024. [Oral]
  17. Multi-hop Question Answering over Incomplete Knowledge Graphs by Edge and Meaning Extensions, X.T. Ye, L. Xiao, C. Zhang, and T. Yamasaki, MIRU 2024.
  18. Constrained Advertisement Layout Generation based on Graph Neural Networks, C. Fu, Y. Liu, N. Tanji, H. Seshime, L. Xiao, and T. Yamasaki, MIRU 2024.
  19. Improving Adversarial Robustness in Continual Learning, K. Mukai, S. Kumano, N. Michel, L. Xiao, and T. Yamasaki, IEICE Technical Report, Image Engineering (IE), vol. 123, no. 381, IE2023-37, pp. 13–18, 2024. [IE Award]
  20. Video Summarization via Self-Supervised Learning Leveraging Large Language Models, T. Sugihara, S. Masuda, L. Xiao, and T. Yamasaki, IPSJ, 7T-06, pp. 2-653–2-654, 2024.
  21. Advertisement Layout Generation based on Graph Neural Network, C. Fu, Y. Liu, N. Tanji, H. Seshime, L. Xiao, and T. Yamasaki, IEICE Technical Report, Image Engineering (IE), vol. 123, no. 381, IE2023-51, pp. 88–89, 2024.
  22. Improved Fine-grained Fashion Retrieval with Contrastive Learning, L. Xiao, X. F. Zhang, and T. Yamasaki, MIRU 2023, IS3-55, 2023.
  23. Video Summarization Based on Masked Autoencoder, M. L. A. Fok, L. Xiao, and T. Yamasaki, MIRU 2023, IS1-84, 2023.
  24. Improving Fashion Compatibility Prediction with Color Distortion Prediction, L. Xiao and T. Yamasaki, IEICE Technical Report, Image Engineering (IE), vol. 122, no. 385, IE2022-61, pp. 17–18, 2023.
  25. Multi-Level Attention Network for Fine-Grained Fashion Retrieval, L. Xiao and T. Yamasaki, IEICE Technical Report, MVE, vol. 122, no. 440, MVE2022-90, pp. 198–199, 2023.
  26. SAT: Self-adaptive Training for Fashion Compatibility Prediction, L. Xiao and T. Yamasaki, MIRU 2022.
  27. Spatial Attention Based Fashion Compatibility Prediction, L. Xiao and T. Yamasaki, PCSJ/IMPS 2021, P-3-17, pp. 135–136, 2021.
Patents (China)
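  1. A Puncture Target Recognition and Localization Method for Venipuncture, Ling Xiao, Hao Ouyang, Lin Ye, Bin Han, Xuedong Chen, and Xin Yang (Invention patent, Patent No. 202210202422.1, pending)
  2. A Binocular Vision-based Steel Coil Positioning Method and Device, Youmin Hu, Ling Xiao, and Bo Wu (Invention patent, Patent No. 201810094718.X, granted 2020.09.18)
  3. A Vision-based Steel Coil Positioning Method and Device, Youmin Hu, Ling Xiao, and Bo Wu (Invention patent, Patent No. 201811059328.5, granted 2020.07.10)
  4. A Visualized Crane Lifting and Positioning System, Youmin Hu, Ling Xiao, Bo Wu, and Jie Liu (Invention patent, Patent No. 201611246219.5, granted 2018.01.02)
  5. An Online Monitoring System and Method for the Dynamic Process of the Weld Pool, Youmin Hu, Jie Liu, Ling Xiao, Song Tang, and Yong Gu (Invention patent, Patent No. 201610288460.8, granted 2018.06.12)
  6. A Multifunctional Fixture for a Weld Pool Online Monitoring Platform, Youmin Hu, Song Tang, Ling Xiao, Yong Gu, and Jie Liu (Utility model patent, Patent No. 201620434683.6, granted 2016.10.05)
  7. A Fast FCM Image Segmentation Method for Optical Flow Maps, Youmin Hu, Zhongxu Hu, Bo Wu, Minjian Wu, Jie Liu, Ling Xiao, Shijie Wang, and Xuelian Li (Invention patent, Patent No. 201710530461.3, granted 2019.11.12) (Nominated for the 2023 Hubei Province Science and Technology Award)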