ByteDance


Live streams recommendation

Jun '20 - Aug '20


Live Streams Recommendation, Graph Embedding:

  • Improved the performance of an internal ML trainer (C++) by reducing communication overhead. Cut mini-batch forward latency by 40%+ in certain graph embedding training scenarios.

  • Extracted relations from petabyte-scale log data and built distributed user-author graphs using MapReduce. Devised graph encoders and trained graph node embeddings using TensorFlow and the ByteDance ML API.

  • Integrated the end-to-end embedding into click-through rate (CTR) prediction. Significantly increased online user stay time (+3.5%), comment rate (+3.8%), and other metrics in A/B tests for Douyin (TikTok for the Chinese market).

Systems for Engineering Efficiency:

  • Designed and developed a system from scratch in Django with RESTful APIs that creates and manages alerts for 100+ online models across 5 products and presents model health status on a dashboard. Attracted 70+ internal users and developers. Cut response time from the usual full day to under 1 hour during a recent dataflow incident.

  • Constructed a pipeline that analyzes the importance of 300+ features in a Monte Carlo fashion and, based on the results, modifies features in terabyte-scale model checkpoints on clusters. Saved 2+ hours of manual labor per iteration and 35k+ core-hours of compute in total compared with hand-tuning.


This site is a migration from my old personal page and is still under active construction. - Jan 2021

Modifications © Tianyu Zhang 2021. Original source © R. Miles McCain 2020. Content is licensed CC BY-SA 4.0, a Free Culture License. The source code is available under GPLv3.