ML Applications

Ads Ranking System

Machine Learning for Snapchat Ad Ranking

Characteristics

Serving the right ad to the right user at the right time^[8]
Optimizing for conversions^[2]
Ads targeting^[9], e.g., advertisers select a specific user segment to show their ads to, such as young adult males (18-29 years old) living in the Western US.
Hierarchical structure/knowledge graph, i.e., Advertiser -> Campaign Groups -> Campaigns -> Ads
Fast-changing ad inventory, i.e., old ad campaigns expire and new ad campaigns start at a fast pace^[8]
A very high-throughput real-time ad auction, e.g., ranking by bid*pCTR
Training, inference and model architecture for strict latency constraints

Design Overview

Problem Definition/Formulation
Metric Measurement
- Offline
  - Recall
  - AUC
  - Normalized Cross Entropy (NCE)^[3]
  - CTR calibration
  - nDCG
- Online
  - CTR
  - Revenue
  - nDCG
Data Collection
- Positive Samples
- Negative Samples
Feature Engineering
- Feature examples:
  - Content-wise ones: NLP and CV signals for the various ad formats, e.g., ad text descriptions, images, videos, and short-form videos.
  - Domain-based ones: App installs, purchases and sign-ups, etc.
  - User-side ones: demographic features, e.g., gender, age, location, languages; interests, historical engaged Ads/Feed/Story/SFV and search history, etc.
  - Context-wise ones: Ad id, time of day, day of week, the device, the ad placement information, how and where the ad is shown^[4]
- Feature types: Continuous, OneHot, Indexed (id features, e.g., ad id, video id), Hash-OneHot (one-hot data with unbounded vocabulary size), Hash-Indexed (indexed data with unbounded vocabulary size)^[7], dense vector
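
To make the Hash-Indexed and Hash-OneHot feature types concrete, here is a minimal sketch of the hashing trick. The function names, the choice of MD5, and the bucket count are illustrative assumptions, not details taken from the cited systems.

```python
import hashlib
from typing import List


def hash_index(value: str, num_buckets: int) -> int:
    """Map a value from an unbounded vocabulary to an index in [0, num_buckets).

    A stable hash (MD5 here, as an assumption) keeps the mapping identical
    across processes, unlike Python's built-in hash() for strings.
    """
    digest = hashlib.md5(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_buckets


def hash_one_hot(value: str, num_buckets: int) -> List[int]:
    """Hash-OneHot: a one-hot vector over the hashed bucket space."""
    vec = [0] * num_buckets
    vec[hash_index(value, num_buckets)] = 1
    return vec


# Hypothetical ad id; unseen ids need no vocabulary update.
idx = hash_index("ad_12345", num_buckets=1000)
one_hot = hash_one_hot("ad_12345", num_buckets=16)
```

The trade-off is that distinct ids can collide in the same bucket, so the bucket count is usually chosen large relative to the number of active ids.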
Model Training
- Retrieval models
  - To fetch the most relevant ad candidates and reduce the size of the candidate set, e.g., from millions to a few thousand or a few hundred
    - Search Advertising: the user specifies a query and the query is the necessary and important context to match ads, e.g., Google search ads^[2]
    - Display Advertising: the user browses a publisher’s webpage, and user attribute information plays a more important role since the user intent is typically weak, e.g., FB/TikTok/Snapchat/Pinterest/LinkedIn ads^[4]
- Ranking models
  - Main task: predicting the CTR and estimating organic utility^[5] for the retrieved candidate Ads
  - Auxiliary tasks: predicting the likelihood of app installs, purchases and sign-ups
  - point-wise, pair-wise and list-wise ranking
  - Model Types:
    - Logistic Regression, GBDT, Multi-Task DNN, Deep&Cross
- Ad auction models
  - To determine the final ad positions under various business rules, e.g., the ads' bids and remaining budgets.
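
The bid*pCTR ranking and the budget rule mentioned above can be sketched as follows. This is a simplified illustration under stated assumptions: the `Candidate` fields, the eligibility rule (remaining budget must cover the bid), and the function name are hypothetical, and pricing (e.g., second-price charging) is left out.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    ad_id: str
    bid: float               # advertiser's bid per click (assumed unit)
    pctr: float              # predicted CTR from the ranking model
    remaining_budget: float  # budget left for the ad's campaign


def rank_for_auction(candidates: List[Candidate], num_slots: int) -> List[Candidate]:
    """Drop ads that cannot afford their bid, then order by bid * pCTR."""
    eligible = [c for c in candidates if c.remaining_budget >= c.bid]
    eligible.sort(key=lambda c: c.bid * c.pctr, reverse=True)
    return eligible[:num_slots]


candidates = [
    Candidate("a", bid=2.0, pctr=0.01, remaining_budget=100.0),  # score 0.02
    Candidate("b", bid=1.0, pctr=0.05, remaining_budget=100.0),  # score 0.05
    Candidate("c", bid=5.0, pctr=0.02, remaining_budget=1.0),    # budget too low
]
winners = rank_for_auction(candidates, num_slots=2)
print([c.ad_id for c in winners])  # ['b', 'a']
```

Because the score multiplies the bid by pCTR, a miscalibrated CTR model directly distorts the auction order, which is why calibration appears among the metrics.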
Model Evaluation/Serving
- For CTR prediction, NCE measures the goodness of the predictions and implicitly reflects calibration.
- Online A/B Test
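
As a rough sketch of the offline checks, the snippet below computes NCE in the sense of the Facebook paper^[3] (average log loss divided by the entropy of the background CTR) together with the calibration ratio (average predicted CTR over empirical CTR). The function names and the clipping constant `eps` are assumptions made for illustration, and the labels must contain both clicks and non-clicks.

```python
import math
from typing import Sequence


def normalized_cross_entropy(labels: Sequence[int], preds: Sequence[float],
                             eps: float = 1e-12) -> float:
    """Average log loss divided by the entropy of the empirical CTR."""
    n = len(labels)
    log_loss = -sum(
        y * math.log(max(p, eps)) + (1 - y) * math.log(max(1.0 - p, eps))
        for y, p in zip(labels, preds)
    ) / n
    base = sum(labels) / n  # background (empirical) CTR, must be in (0, 1)
    base_entropy = -(base * math.log(base) + (1.0 - base) * math.log(1.0 - base))
    return log_loss / base_entropy


def calibration(labels: Sequence[int], preds: Sequence[float]) -> float:
    """Ratio of average predicted CTR to empirical CTR; 1.0 is ideal."""
    return (sum(preds) / len(preds)) / (sum(labels) / len(labels))


labels = [1, 0, 0, 0]
baseline = [0.25, 0.25, 0.25, 0.25]  # always predicts the background CTR
better = [0.9, 0.1, 0.1, 0.1]

nce_baseline = normalized_cross_entropy(labels, baseline)
nce_better = normalized_cross_entropy(labels, better)
```

A model that always predicts the background CTR gets an NCE of 1, so values below 1 indicate an improvement over that baseline, and lower is better.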
Model Performance Monitoring
Challenges
- Training, inference and model architecture for strict latency constraints.
- Selection bias due to ads conversion logs
- Calibrated CTR predictions^[6]
  - Calibration is the ratio of the average estimated CTR to the empirical CTR; the closer it is to 1, the better calibrated the model is.
- Delayed and repeated conversions, e.g., a purchase event could occur days or weeks after the ad is shown to the user. This means high-quality training data is available only after a long delay.
  - Fresher training data leads to more accurate predictions.
  - On the training data side, this calls for an online streaming data joiner^[3] with a properly tuned waiting time window.
- Imbalanced positive/negative samples
  - Uniform subsampling
  - Negative downsampling
    - Note that a model trained on downsampled data predicts in the downsampled space, so its predictions need to be re-calibrated back to the original distribution
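
The re-calibration after negative downsampling can be sketched with the correction described in the Facebook paper^[3]: if `p` is the prediction in the downsampled space and `w` is the fraction of negatives kept, the corrected prediction is `q = p / (p + (1 - p) / w)`. The function name below is an assumption for illustration.

```python
def recalibrate(p: float, w: float) -> float:
    """Map a prediction from the negative-downsampled space back to the
    original space. `w` is the negative downsampling rate in (0, 1]."""
    if not 0.0 < w <= 1.0:
        raise ValueError("w must be in (0, 1]")
    return p / (p + (1.0 - p) / w)


# Keeping 1% of negatives inflates a ~0.1% CTR to roughly 10% in training;
# the correction brings a 10% prediction back to about 0.11%.
q = recalibrate(0.10, w=0.01)
```

With `w = 1.0` (no downsampling) the correction leaves the prediction unchanged.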

1. Paper 2007 Internet Advertising and the Generalized Second-Price Auction
2. Paper 2013 Google Ad Click Prediction: a View from the Trenches
3. Paper 2014 Facebook Practical Lessons from Predicting Clicks on Ads at Facebook
4. Paper 2014 LinkedIn LASER: A Scalable Response Prediction Platform For Online Advertising
5. Paper 2020 LinkedIn Ads Allocation in Feed via Constrained Optimization
6. Paper 2017 On Calibration of Modern Neural Networks
7. Pinterest How We Use AutoML, Multi-Task Learning and Multi-tower Models for Pinterest Ads
8. Snap Machine Learning for Snapchat Ad Ranking
9. Meta Audience Ad Targeting