Papers

2025

LongVILA: Scaling Long-Context Visual Language Models for Long Videos [Code] [Arxiv]
Yukang Chen *, Fuzhao Xue *, Dacheng Li *, Qinghao Hu *, Ligeng Zhu, Xiuyu Li, Yunhao Fang, Haotian Tang, Shang Yang, Zhijian Liu, Ethan He, Hongxu Yin, Pavlo Molchanov, Jan Kautz, Linxi Fan, Yuke Zhu, Yao Lu, Song Han Accepted at International Conference on Learning Representations (ICLR) 2025 (* indicates equal contribution)
MixEval-X: Any-to-Any Evaluations from Real-World Data Mixtures [Arxiv] [Homepage]
Jinjie Ni, Yifan Song, Deepanway Ghosal, Bo Li, David Junhao Zhang, Xiang Yue, Fuzhao Xue, Zian Zheng, Kaichen Zhang, Mahir Shah, Kabir Jain, Yang You, Michael Shieh Accepted at International Conference on Learning Representations (ICLR Spotlights) 2025
QLIP: Text-Aligned Visual Tokenization Unifies Auto-Regressive Multimodal Understanding and Generation [Homepage] [Arxiv] [Code]
Yue Zhao, Fuzhao Xue, Scott Reed, Linxi Fan, Yuke Zhu, Jan Kautz, Zhiding Yu, Philipp Krähenbühl, De-An Huang arXiv preprint

2024

OpenMoE: An Early Effort on Open Mixture-of-Experts Language Models [Blog] [Code] [Arxiv]
Fuzhao Xue, Zian Zheng, Yao Fu, Jinjie Ni, Zangwei Zheng, Wangchunshu Zhou and Yang You Accepted at International Conference on Machine Learning (ICML) 2024
MixEval: Deriving Wisdom of the Crowd from LLM Benchmark Mixtures [Arxiv] [Code] [Homepage]
Jinjie Ni *, Fuzhao Xue *, Xiang Yue *, Yuntian Deng, Mahir Shah, Kabir Jain, Graham Neubig, Yang You (* indicates core contributors) Accepted at Neural Information Processing Systems (NeurIPS) 2024

2023

To Repeat or Not To Repeat: Insights from Scaling LLM under Token-Crisis [Arxiv]
Fuzhao Xue, Yao Fu, Wangchunshu Zhou, Zangwei Zheng, Yang You Accepted at Neural Information Processing Systems (NeurIPS) 2023
Adaptive Computation with Elastic Input Sequence [Arxiv] [Code] [Blog]
Fuzhao Xue, Valerii Likhosherstov, Anurag Arnab, Neil Houlsby, Mostafa Dehghani, Yang You Accepted at International Conference on Machine Learning (ICML) 2023
A Study on Transformer Configuration and Training Objective [Arxiv] [Blog] [Video]
Fuzhao Xue, Jianghai Chen, Aixin Sun, Xiaozhe Ren, Zangwei Zheng, Xiaoxin He, Yongming Chen, Xin Jiang, Yang You Accepted at International Conference on Machine Learning (ICML) 2023
Sequence Parallelism: Long Sequence Training from System Perspective [Arxiv] [Code] [Video]
Shenggui Li *, Fuzhao Xue *, Yongbin Li, Yang You Accepted at Association for Computational Linguistics (ACL) 2023 (* indicates equal contribution)
Response Length Perception and Sequence Scheduling: An LLM-Empowered LLM Inference Pipeline [Arxiv] [Code] [Blog]
Zangwei Zheng, Xiaozhe Ren, Fuzhao Xue, Yang Luo, Xin Jiang, Yang You Accepted at Neural Information Processing Systems (NeurIPS) 2023
CowClip: Reducing CTR Prediction Model Training Time from 12 hours to 10 minutes on 1 GPU [Arxiv] [Code]
Zangwei Zheng, Pengtai Xu, Xuan Zou, Da Tang, Zhen Li, Chenguang Xi, Peng Wu, Leqi Zou, Yijie Zhu, Ming Chen, Xiangzhuo Ding, Fuzhao Xue, Ziheng Qing, Youlong Cheng, Yang You Accepted at Association for the Advancement of Artificial Intelligence (AAAI) 2023, Distinguished Paper Award (0.13%, 12 of 8777 submissions)
Hierarchical Dialogue Understanding with Special Tokens and Turn-level Attention [Arxiv] [Code]
Xiao Liu, Jian Zhang, Heng Zhang, Fuzhao Xue, Yang You Accepted at International Conference on Learning Representations (ICLR Tiny Paper) 2023

2022

Go Wider Instead of Deeper [Arxiv] [Code]
Fuzhao Xue, Ziji Shi, Yuxuan Lou, Yong Liu, Yang You Published at Association for the Advancement of Artificial Intelligence (AAAI) 2022
An Embarrassingly Simple Model for Dialogue Relation Extraction [Arxiv] [Code] [Slides] [Poster]
Fuzhao Xue, Aixin Sun, Hao Zhang, Eng-Siong Chng Published at IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2022
One Student Knows All Experts Know: From Sparse to Dense [Arxiv]
Fuzhao Xue, Xiaoxin He, Xiaozhe Ren, Yuxuan Lou, Yang You arXiv preprint arXiv:2201.10890
Recent Advances in Deep Learning-based Dialogue Systems [Arxiv]
Jinjie Ni, Tom Young, Vlad Pandelea, Fuzhao Xue, Vinay Adiga, Erik Cambria Published at Artificial Intelligence Review
Automated Audio Captioning using Transfer Learning and Reconstruction Latent Space Similarity Regularization [Arxiv]
Andrew Koh, Fuzhao Xue, Eng Siong Chng Published at IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2022
Modeling Motion with Multi-Modal Features for Text-Based Video Segmentation [Arxiv] [Code]
Wangbo Zhao, Kai Wang, Xiangxiang Chu, Fuzhao Xue, Xinchao Wang, and Yang You Published at IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2022

2021

GDPNet: Refining Latent Multi-View Graph for Relation Extraction [Arxiv] [Code] [Video] [Slides] [Poster]
Fuzhao Xue, Aixin Sun, Hao Zhang, Eng-Siong Chng Published at Association for the Advancement of Artificial Intelligence (AAAI) 2021
Large-Scale Deep Learning Optimizations: A Comprehensive Survey [Arxiv]
Xiaoxin He, Fuzhao Xue, Xiaozhe Ren, Yang You
Cross-token Modeling with Conditional Computation [Arxiv]
Yuxuan Lou, Fuzhao Xue, Zangwei Zheng, Yang You
RACP: A network with Attention Corrected Prototype for Few-shot Speaker Recognition using Indefinite Distance Metric
Xingmei Wang, Jiaxiang Meng, Bin Wen, Fuzhao Xue Published at Neurocomputing

2020

Deep Graph Random Process for Relational-Thinking-Based Speech Recognition [Arxiv]
Hengguan Huang, Fuzhao Xue, Hao Wang, Ye Wang Published at International Conference on Machine Learning (ICML) 2020
A network model of speaker identification with new feature extraction methods and asymmetric BLSTM
Xingmei Wang, Fuzhao Xue, Wei Wang, Anhua Liu Published at Neurocomputing

2019

An Underwater Acoustic Target Recognition: A Combination of Multi-dimensional Fusion Features and Modified Deep Neural Network
Xingmei Wang, Anhua Liu, Yu Zhang, Fuzhao Xue Published at Remote Sensing
