Papers
2024
OpenMoE: An Early Effort on Open Mixture-of-Experts Language Models [Blog] [Code] [Arxiv]
Fuzhao Xue, Zian Zheng, Yao Fu, Jinjie Ni, Zangwei Zheng, Wangchunshu Zhou and Yang You Accepted at International Conference on Machine Learning (ICML) 2024 (Acceptance rate: 27.5%)
MixEval: Deriving Wisdom of the Crowd from LLM Benchmark Mixtures [Arxiv] [Code] [Homepage]
Jinjie Ni *, Fuzhao Xue *, Xiang Yue *, Yuntian Deng, Mahir Shah, Kabir Jain, Graham Neubig, Yang You (* indicates core contributors) Accepted at Neural Information Processing Systems (NeurIPS) 2024 (Acceptance rate: 25.8%)
2023
To Repeat or Not To Repeat: Insights from Scaling LLM under Token-Crisis [Arxiv]
Fuzhao Xue, Yao Fu, Wangchunshu Zhou, Zangwei Zheng, Yang You Accepted at Neural Information Processing Systems (NeurIPS) 2023 (Acceptance rate: 26.1%)
Adaptive Computation with Elastic Input Sequence [Arxiv] [Code] [Blog]
Fuzhao Xue, Valerii Likhosherstov, Anurag Arnab, Neil Houlsby, Mostafa Dehghani, Yang You Accepted at International Conference on Machine Learning (ICML) 2023 (Acceptance rate: 27.9%)
A Study on Transformer Configuration and Training Objective [Arxiv] [Blog] [Video]
Fuzhao Xue, Jianghai Chen, Aixin Sun, Xiaozhe Ren, Zangwei Zheng, Xiaoxin He, Yongming Chen, Xin Jiang, Yang You Accepted at International Conference on Machine Learning (ICML) 2023 (Acceptance rate: 27.9%)
Sequence Parallelism: Long Sequence Training from System Perspective [Arxiv] [Code] [Video]
Shenggui Li *, Fuzhao Xue *, Yongbin Li, Yang You Accepted at Association for Computational Linguistics (ACL) 2023 (* indicates equal contribution)
Response Length Perception and Sequence Scheduling: An LLM-Empowered LLM Inference Pipeline [Arxiv] [Code] [Blog]
Zangwei Zheng, Xiaozhe Ren, Fuzhao Xue, Yang Luo, Xin Jiang, Yang You Accepted at Neural Information Processing Systems (NeurIPS) 2023 (Acceptance rate: 26.1%)
CowClip: Reducing CTR Prediction Model Training Time from 12 hours to 10 minutes on 1 GPU [Arxiv] [Code]
Zangwei Zheng, Pengtai Xu, Xuan Zou, Da Tang, Zhen Li, Chenguang Xi, Peng Wu, Leqi Zou, Yijie Zhu, Ming Chen, Xiangzhuo Ding, Fuzhao Xue, Ziheng Qing, Youlong Cheng, Yang You Accepted at Association for the Advancement of Artificial Intelligence (AAAI) 2023, Distinguished Paper Award (0.13%, 12 of 8777 submissions)
Hierarchical Dialogue Understanding with Special Tokens and Turn-level Attention [Arxiv] [Code]
Xiao Liu, Jian Zhang, Heng Zhang, Fuzhao Xue, Yang You Accepted at International Conference on Learning Representations (ICLR Tiny Paper) 2023
2022
Go Wider Instead of Deeper [Arxiv] [Code]
Fuzhao Xue, Ziji Shi, Yuxuan Lou, Yong Liu, Yang You Published at Association for the Advancement of Artificial Intelligence (AAAI) 2022 (Acceptance rate: 15.0%)
An Embarrassingly Simple Model for Dialogue Relation Extraction [Arxiv] [Code] [Slides] [Poster]
Fuzhao Xue, Aixin Sun, Hao Zhang, Eng-Siong Chng Published at IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2022
One Student Knows All Experts Know: From Sparse to Dense [Arxiv]
Fuzhao Xue, Xiaoxin He, Xiaozhe Ren, Yuxuan Lou, Yang You arXiv preprint arXiv:2201.10890
Recent Advances in Deep Learning-based Dialogue Systems [Arxiv]
Jinjie Ni, Tom Young, Vlad Pandelea, Fuzhao Xue, Vinay Adiga, Erik Cambria Published at Artificial Intelligence Review
Automated Audio Captioning using Transfer Learning and Reconstruction Latent Space Similarity Regularization [Arxiv]
Andrew Koh, Fuzhao Xue, Eng Siong Chng Published at IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2022
Modeling Motion with Multi-Modal Features for Text-Based Video Segmentation [Arxiv] [Code]
Wangbo Zhao, Kai Wang, Xiangxiang Chu, Fuzhao Xue, Xinchao Wang, and Yang You Published at IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2022 (Acceptance rate: 25.3%)
2021
GDPNet: Refining Latent Multi-View Graph for Relation Extraction [Arxiv] [Code] [Video] [Slides] [Poster]
Fuzhao Xue, Aixin Sun, Hao Zhang, Eng-Siong Chng Published at Association for the Advancement of Artificial Intelligence (AAAI) 2021 (Acceptance rate: 21.4%)
Large-Scale Deep Learning Optimizations: A Comprehensive Survey [Arxiv]
Xiaoxin He, Fuzhao Xue, Xiaozhe Ren, Yang You
Cross-token Modeling with Conditional Computation [Arxiv]
Yuxuan Lou, Fuzhao Xue, Zangwei Zheng, Yang You
RACP: A network with Attention Corrected Prototype for Few-shot Speaker Recognition using Indefinite Distance Metric
Xingmei Wang, Jiaxiang Meng, Bin Wen, Fuzhao Xue Published at Neurocomputing
2020
Deep Graph Random Process for Relational-Thinking-Based Speech Recognition [Arxiv]
Hengguan Huang, Fuzhao Xue, Hao Wang, Ye Wang Published at International Conference on Machine Learning (ICML) 2020 (Acceptance rate: 21.5%)
A network model of speaker identification with new feature extraction methods and asymmetric BLSTM
Xingmei Wang, Fuzhao Xue, Wei Wang, Anhua Liu Published at Neurocomputing
2019
An Underwater Acoustic Target Recognition: A Combination of Multi-dimensional Fusion Features and Modified Deep Neural Network
Xingmei Wang, Anhua Liu, Yu Zhang, Fuzhao Xue Published at Remote Sensing