Xgboost: extreme gradient boosting
…, K Chen, R Mitchell, I Cano, T Zhou - … version 0.4-2, 2015 - cran.ms.unimelb.edu.au
This is an introductory document of using the xgboost package in R. xgboost is short for
eXtreme Gradient Boosting package. It is an efficient and scalable implementation of gradient …
GoDec: Randomized low-rank & sparse matrix decomposition in noisy case
Low-rank and sparse structures have been profoundly studied in matrix completion and
compressed sensing. In this paper, we develop "Go Decomposition" (GoDec) to efficiently and …
DiSAN: Directional self-attention network for RNN/CNN-free language understanding
Recurrent neural nets (RNN) and convolutional neural nets (CNN) are widely used on NLP
tasks to capture the long-term and local dependencies, respectively. Attention mechanisms …
FedProto: Federated prototype learning across heterogeneous clients
Heterogeneity across clients in federated learning (FL) usually hinders the optimization
convergence and generalization performance when the aggregation of clients' knowledge …
Deja vu: Contextual sparsity for efficient LLMs at inference time
Large language models (LLMs) with hundreds of billions of parameters have sparked a new
wave of exciting AI applications. However, they are computationally expensive at inference …
Federated learning from pre-trained models: A contrastive learning approach
Federated Learning (FL) is a machine learning paradigm that allows decentralized clients to
learn collaboratively without sharing their private data. However, excessive computation …
H2O: Heavy-hitter oracle for efficient generative inference of large language models
… [100] Jan van den Brand, Zhao Song, and Tianyi Zhou. Algorithm and hardness for dynamic
attention maintenance in large language models. arXiv preprint arXiv:2304.02207, 2023. …
Structure-augmented text representation learning for efficient knowledge graph completion
Human-curated knowledge graphs provide critical supportive information to various natural
language processing tasks, but these graphs are usually incomplete, urging auto-completion …
Manifold elastic net: a unified framework for sparse dimension reduction
It is difficult to find the optimal sparse solution of a manifold learning based dimensionality
reduction algorithm. The lasso or the elastic net penalized manifold learning based …
Bi-directional block self-attention for fast and memory-efficient sequence modeling
Recurrent neural networks (RNN), convolutional neural networks (CNN) and self-attention
networks (SAN) are commonly used to produce context-aware representations. RNN can …