User profiles for Xiangrui Meng

Xiangrui Meng

Databricks
Verified email at databricks.com
Cited by 8855

Scalable simple random sampling and stratified sampling

X Meng - International conference on machine learning, 2013 - proceedings.mlr.press
Analyzing data sets of billions of records has now become a regular task in many companies
and institutions. In the statistical analysis of those massive data sets, sampling generally …

Apache Spark: a unified engine for big data processing

…, T Das, M Armbrust, A Dave, X Meng… - Communications of the …, 2016 - dl.acm.org
… XIN, PATRICK WENDELL, TATHAGATA DAS, MICHAEL ARMBRUST, ANKUR DAVE,
XIANGRUI MENG, JOSH ROSEN, SHIVARAM VENKATARAMAN, MICHAEL J. FRANKLIN, ALI …

Spark SQL: Relational data processing in Spark

…, C Lian, Y Huai, D Liu, JK Bradley, X Meng… - Proceedings of the …, 2015 - dl.acm.org
Spark SQL is a new module in Apache Spark that integrates relational processing with
Spark's functional programming API. Built on our experience with Shark, Spark SQL lets Spark …

MLlib: Machine learning in Apache Spark

X Meng, J Bradley, B Yavuz, E Sparks… - Journal of Machine …, 2016 - jmlr.org
Apache Spark is a popular open-source platform for large-scale data processing that is
well-suited for iterative machine learning tasks. In this paper we present MLlib, Spark's …

Matrix computations and optimization in Apache Spark

R Bosagh Zadeh, X Meng, A Ulanov, B Yavuz… - Proceedings of the …, 2016 - dl.acm.org
We describe matrix computations available in the cluster programming framework, Apache
Spark. Out of the box, Spark provides abstractions and implementations for distributed …

Low-distortion subspace embeddings in input-sparsity time and applications to robust linear regression

X Meng, MW Mahoney - Proceedings of the forty-fifth annual ACM …, 2013 - dl.acm.org
Low-distortion embeddings are critical building blocks for developing random sampling and
random projection algorithms for common linear algebra problems. We show that, given a …

Wisdom of the better few: cold start recommendation via representative based rating elicitation

NN Liu, X Meng, C Liu, Q Yang - … of the fifth ACM conference on …, 2011 - dl.acm.org
Recommender systems have to deal with the cold start problem as new users and/or items
are always present. Rating elicitation is a common approach for handling cold start. However, …

Mapping ICD-10 and ICD-10-CM codes to phecodes: workflow development and initial evaluation

P Wu, A Gifford, X Meng, X Li, H Campbell… - JMIR medical …, 2019 - medinform.jmir.org
Background: The phecode system was built upon the International Classification of Diseases,
Ninth Revision, Clinical Modification (ICD-9-CM) for phenome-wide association studies (…

Serum uric acid levels and multiple health outcomes: umbrella review of evidence from observational studies, randomised controlled trials, and Mendelian …

X Li, X Meng, M Timofeeva, I Tzoulaki, KK Tsilidis… - BMJ, 2017 - bmj.com
Objective To map the diverse health outcomes associated with serum uric acid (SUA) levels.
Design Umbrella review. Data sources Medline, Embase, Cochrane Database of …

Chemoresistance mechanisms of breast cancer and their countermeasures

X Ji, Y Lu, H Tian, X Meng, M Wei, WC Cho - Biomedicine & …, 2019 - Elsevier
Chemoresistance is one of the major challenges for the breast cancer treatment. Owing to its
heterogeneous nature, the chemoresistance mechanisms of breast cancer are complicated, …