Ads Ranking System
Characteristics
- Serving the right ad to the right user at the right time[8]
- Optimizing for conversions[2]
- Ads targeting[9], e.g., an advertiser selects a specific user segment to show their ads to, such as young adult males (18-29 years old) living in the Western US.
- Hierarchical structure/knowledge graph, i.e., Advertiser -> Campaign Groups -> Campaigns -> Ads
- Fast-changing ad inventory, i.e., old ad campaigns expire and new ad campaigns start at a fast pace[8]
- A very high-throughput real-time ad auction, e.g., ranking by bid * pCTR
- Training, inference and model architecture for strict latency constraints
Design Overview
Problem Definition/Formulation
Metric Measurement
- Offline
- Recall
- AUC
- Normalized Cross Entropy (NCE)[3]
- CTR calibration
- nDCG
- Online
- CTR
- Revenue
- nDCG
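As a rough illustration of the nDCG metric listed above, the sketch below scores one ranked list of ads. It assumes graded relevance labels (e.g., 0 = no engagement, higher = stronger engagement) and the common `2^rel - 1` gain with a log2 position discount; the function names are hypothetical and not taken from the cited papers.

```python
import math

def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the top-k items, in ranked order."""
    return sum((2 ** rel - 1) / math.log2(i + 2)
               for i, rel in enumerate(relevances[:k]))

def ndcg_at_k(relevances, k):
    """DCG normalized by the DCG of the ideal (descending) ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    if ideal == 0:
        return 0.0  # no relevant item at all: define nDCG as 0
    return dcg_at_k(relevances, k) / ideal

# Relevance of the ads in the order the model ranked them.
score = ndcg_at_k([0, 1], k=2)
```

A list that is already in ideal order scores 1.0; pushing relevant ads down the list lowers the score.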
Data Collection
- Positive Samples
- Negative Samples
Feature Engineering
- Feature examples:
- Content-wise ones: NLP and CV signals for the various ad formats, e.g., ad text descriptions, images, videos and short-form videos, etc.
- Domain-based ones: App installs, purchases and sign-ups, etc.
- User-side ones: demographic features, e.g., gender, age, location, languages; interests, historical engaged Ads/Feed/Story/SFV and search history, etc.
- Context-wise ones: Ad id, time of day, day of week, the device, the ad placement information, how and where the ad is shown[4]
- Feature types: Continuous, OneHot, Indexed (ID features, e.g., ad id, video id, etc.), Hash-OneHot (one-hot data with unbounded vocabulary size), Hash-Indexed (indexed data with unbounded vocabulary size)[7], dense vector
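The Hash-OneHot and Hash-Indexed types above can be illustrated with the hashing trick: an unbounded vocabulary is mapped into a fixed number of buckets. This is a minimal sketch, not the implementation from [7]; the bucket count, the choice of MD5 and the function names are assumptions made for illustration.

```python
import hashlib

NUM_BUCKETS = 2 ** 20  # hypothetical size of the hashed index space

def hash_index(feature_name, raw_value, num_buckets=NUM_BUCKETS):
    """Map an unbounded-vocabulary value (e.g., an ad id) to a stable bucket index."""
    key = f"{feature_name}={raw_value}".encode("utf-8")
    digest = hashlib.md5(key).digest()  # stable across processes, unlike built-in hash()
    return int.from_bytes(digest[:8], "big") % num_buckets

def hash_one_hot(feature_name, raw_values, num_buckets=NUM_BUCKETS):
    """Sparse one-hot/multi-hot encoding: the sorted set of active bucket indices."""
    return sorted({hash_index(feature_name, v, num_buckets) for v in raw_values})

ad_bucket = hash_index("ad_id", "ad_123456")
interest_buckets = hash_one_hot("interest", ["sports", "travel", "sports"])
```

The index typically feeds an embedding lookup table. Distinct values can collide in the same bucket, which is the price paid for a bounded table size.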
Model Training
- Retrieval models
- To fetch the most relevant ad candidates, and to reduce the cardinality of candidates, e.g., from millions to a few thousands or hundreds
- Search Advertising: the user specifies a query and the query is the necessary and important context to match ads, e.g., Google search ads[2]
- Display Advertising: the user browses a publisher’s webpage, and user attribute information plays a more important role since the user intent is typically weak, e.g., FB/TikTok/Snapchat/Pinterest/LinkedIn ads[4]
- Ranking models
- Main task: predicting the CTR and estimating organic utility[5] for the retrieved candidate Ads
- Auxiliary tasks: predicting the likelihood of app installs, purchases and sign-ups
- point-wise, pair-wise and list-wise ranking
- Model Types:
- Logistic Regression, GBDT, Multi-Task DNN, Deep&Cross
- Ad auction models
- To determine the final ad positions under various business rules, e.g., the ads' bids and remaining budgets.
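The auction step can be sketched as follows: rank eligible ads by bid * pCTR and charge each winner a generalized second-price (GSP) cost per click, in the spirit of [1]. The budget and reserve-price filters, the `AdCandidate` fields and the function name are assumptions made for illustration; production auctions apply many more business rules.

```python
from dataclasses import dataclass

@dataclass
class AdCandidate:
    ad_id: str
    bid: float               # advertiser's maximum bid per click
    p_ctr: float             # predicted CTR from the ranking model
    remaining_budget: float  # budget left in the campaign

def run_auction(candidates, num_slots, reserve_price=0.0):
    """Return [(ad_id, price_per_click)] for the winning slots, best slot first."""
    # Hypothetical business rules: drop ads that cannot pay for one click
    # or that bid below the reserve price.
    eligible = [c for c in candidates
                if c.p_ctr > 0 and c.bid >= reserve_price
                and c.remaining_budget >= c.bid]
    ranked = sorted(eligible, key=lambda c: c.bid * c.p_ctr, reverse=True)
    results = []
    for pos, ad in enumerate(ranked[:num_slots]):
        if pos + 1 < len(ranked):
            runner_up = ranked[pos + 1]
            # GSP: the smallest bid that would still beat the next ad's score.
            price = runner_up.bid * runner_up.p_ctr / ad.p_ctr
        else:
            price = reserve_price
        results.append((ad.ad_id, max(price, reserve_price)))
    return results

ads = [
    AdCandidate("a", bid=2.0, p_ctr=0.10, remaining_budget=50.0),
    AdCandidate("b", bid=1.0, p_ctr=0.30, remaining_budget=50.0),
    AdCandidate("c", bid=5.0, p_ctr=0.02, remaining_budget=50.0),
    AdCandidate("d", bid=10.0, p_ctr=0.50, remaining_budget=0.0),  # out of budget
]
winners = run_auction(ads, num_slots=2)
```

Because each winner's score is at least the runner-up's score, the GSP price never exceeds the winner's own bid.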
Model Evaluation/Serving
- For CTR prediction, NCE measures the goodness of the predictions and implicitly reflects calibration.
- Online A/B Test
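To make the NCE metric concrete, the sketch below follows the definition in [3]: the average log loss of the model divided by the log loss of a baseline that always predicts the background (average empirical) CTR. A value below 1 means the model beats that baseline. The function name and the clipping constant `eps` are assumptions.

```python
import math

def normalized_cross_entropy(labels, preds, eps=1e-12):
    """Average log loss divided by the entropy of the background CTR.

    labels: 0/1 click labels; preds: predicted click probabilities.
    """
    n = len(labels)
    background_ctr = sum(labels) / n
    if not 0.0 < background_ctr < 1.0:
        raise ValueError("need both clicked and non-clicked examples")
    log_loss = 0.0
    for y, p in zip(labels, preds):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        log_loss -= y * math.log(p) + (1 - y) * math.log(1 - p)
    log_loss /= n
    baseline = -(background_ctr * math.log(background_ctr)
                 + (1 - background_ctr) * math.log(1 - background_ctr))
    return log_loss / baseline

labels = [1, 0, 0, 0]
nce_baseline = normalized_cross_entropy(labels, [0.25] * 4)            # predicts the average CTR
nce_model = normalized_cross_entropy(labels, [0.9, 0.1, 0.1, 0.1])     # a sharper model
```

Normalizing by the background entropy makes the metric comparable across datasets with different base CTRs, which plain log loss is not.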
Model Performance Monitoring
Challenges
- Training, inference and model architecture for strict latency constraints.
- Selection bias in ad conversion logs, i.e., labels are only observed for ads that were actually shown to users
- Calibrated CTR predictions[6]
- Calibration is the ratio of the average estimated CTR to the empirical CTR; the closer this ratio is to 1, the better calibrated the model is.
- Delayed and repeated conversions, e.g., a purchase event could occur a few days or weeks after the ad is shown to the user. This implies that high-quality training data is available only after a long delay.
- Fresher training data leads to more accurate predictions.
- On the training data side, this calls for an online streaming data joiner[3] with a properly tuned waiting time window.
- Imbalanced positive/negative samples
- Uniform subsampling
- Negative downsampling
- Note that a model trained on downsampled negatives predicts in the downsampled space, so its predictions need to be re-calibrated[3]
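Negative downsampling and the re-calibration it requires can be sketched as below. The correction `q = p / (p + (1 - p) / w)`, where `p` is the prediction in the downsampled space and `w` is the negative sampling rate, is the one given in [3]; the helper names, the `(features, label)` example layout and the seed handling are assumptions. The calibration ratio from the list above is included for completeness.

```python
import random

def downsample_negatives(examples, rate, seed=0):
    """Keep every positive and each negative with probability `rate`.

    examples: iterable of (features, label) pairs with label in {0, 1}.
    """
    rng = random.Random(seed)
    return [(x, y) for x, y in examples if y == 1 or rng.random() < rate]

def recalibrate(p, rate):
    """Map a prediction made in the downsampled space back to the original space."""
    return p / (p + (1.0 - p) / rate)

def calibration(labels, preds):
    """Ratio of average predicted CTR to empirical CTR; 1.0 is ideal."""
    return (sum(preds) / len(preds)) / (sum(labels) / len(labels))

# With a 1% negative sampling rate, a 10% prediction in the downsampled
# space corresponds to roughly 0.1% in the original space.
p_original = recalibrate(0.10, rate=0.01)
```

A sampling rate of 1.0 leaves predictions unchanged, which is a quick sanity check for the correction.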
- 1. Paper 2007 Internet Advertising and the Generalized Second-Price Auction
- 2. Paper 2013 Google Ad Click Prediction a View from the Trenches
- 3. Paper 2014 Facebook Practical Lessons from Predicting Clicks on Ads at Facebook
- 4. Paper 2014 LinkedIn LASER: A Scalable Response Prediction Platform For Online Advertising
- 5. Paper 2020 LinkedIn Ads Allocation in Feed via Constrained Optimization
- 6. Paper 2017 On calibration of modern neural networks
- 7. Pinterest How We Use AutoML, Multi-Task Learning and Multi-tower Models for Pinterest Ads
- 8. Snap Machine Learning for Snapchat Ad Ranking
- 9. Meta Audience Ad Targeting