About me

I am currently a Senior Researcher at Tencent AI Lab in Bellevue, WA, where I work on generative AI, including Large Language Models (LLMs), Large Vision Language Models (LVLMs), and Agent Systems. Before my current role, I was a Researcher at Bell Labs in Murray Hill, NJ, where my research focused on deep learning and video understanding.

Contact me

Email: xiaoyang.wangs AT gmail DOT com

Research Highlights

  • Persona Hub: Scaling synthetic data creation for LLM post-training with 1 billion personas. A portion of the synthetic personas and synthetic data has been released.

  • Multi-Modal Chart (MMC): Equipping LVLMs with chart understanding using synthetic vision-language data from arXiv. The training and benchmark data have been released.

  • DSBench: How far are LLM-based data science agents to becoming data science experts? Our study offers a first glimpse, and the benchmark data has been released.