A Bayesian Personalised Ranking Example: Factor Models for Recommending Given Names

Immanuel Bayer and Steffen Rendle

ECML PKDD Discovery Challenge 2013 (offline track).

This paper provides an interesting example of using factorisation machines for implicit feedback via the Bayesian personalised ranking (BPR) optimisation criterion.

The challenge was to recommend first names (e.g. to soon-to-be parents). Participants were provided with the browsing history of users on the name selection website Nameling. So the items to be recommended are names.

Users are considered together with their browsing history, consisting of the name they looked at last and the list of all the names they looked at up to the time t in question. The combination of a user (in this complete sense) at time t and a candidate name is vectorised as follows:

Screen Shot 2015-08-26 at 13.23.09

The (order 2) factorisation machine assigns a score to the combination as follows:

Screen Shot 2015-08-26 at 13.24.43

where p is the rank of this vectorisation, w_0 \in \mathbb{R}, w \in \mathbb{R}^p and V \in \mathbb{R}^{p \times k} are model parameters to be learned. These parameters are learned via stochastic gradient ascent of the following pair-wise learning objective:

Screen Shot 2015-08-26 at 13.24.58

where D is the set of (user u, time t, name n) tuples where u has browsed n before time t, N is the set of all possible names and \sigma is the sigmoid function. Only a single name is chosen from N \setminus \{ n \} for each update. These negative samples are drawn according to their estimated rank (this part is quite difficult to do efficiently).

In the above, we have purposefully omitted the “prefix smoothing” step, since we are mainly interested in the a simple factorisation machine example. The details are in the paper.

This recommender did very well in the challenge. However, I don’t find the examples given in the paper very impressive (though I have not seen the examples given by others). My suspicion is that the FM approach is very strong, but that there is no good way to make first name recommendations using the provided dataset. A more effective dataset would be e.g. first name x product interaction on a large e-commerce site. This would do a much better job of capturing the social meaning of names, but could go out of date very quickly.

Leave a Reply

Your email address will not be published. Required fields are marked *