I work in both theory (statistics/information theory) and engineering (computer systems). I was affiliated with the Stanford Statistical Machine Learning Group, the Stanford Platform Lab, and the Information Systems Laboratory (ISL).
After leaving academia, I became a trader/speculator, mostly expressing semi-systematic views in commodities and taking flat-price and spread risks across the major energy, metals, and agriculture markets. Using statistical methods, machine learning, and, more recently, the new wave of agentic AI, I aim to serve the market and its participants competitively, contributing to better price discovery and more efficient risk transfer.
I work in the areas of machine learning and natural language processing. While I am no longer engaged in frontline large language model research, my earlier work contributed to both academic and industrial applications of language models, including some of the earliest integrations of LLMs into search engines in 2016. I broadly identify with the machine learning research community, particularly COLT, ICLR, ICML, and NeurIPS, and maintain strong interests in statistics and information theory. For a period in the past, I also conducted research in systems and networking.
My research has received recognition in both academia and industry. My data center networking research was featured on the front page of The New York Times and has led to the creation of multiple startups. In machine learning, my natural language processing work has been taught in Stanford University's widely attended CS224n course, led by Professor Christopher Manning.
I serve as a reviewer for several major conferences, including KDD, NeurIPS, ICLR, and ICML.