Immanuel Bayer and Steffen Rendle
ECML PKDD Discovery Challenge 2013 (offline track).
This paper provides an interesting example of using factorisation machines for implicit feedback via the Bayesian personalised ranking (BPR) optimisation criterion.
The challenge was to recommend first names (e.g. to soon-to-be parents). Participants were provided with the browsing history of users on the name selection website Nameling. So the items to be recommended are names.
Users are considered together with their browsing history, consisting of the name they looked at last and the list of all the names they looked at up to the time $t$ in question. The combination of a user (in this complete sense) at time $t$ and a candidate name is vectorised as follows:
The (order 2) factorisation machine assigns a score to the combination as follows:
where $p$ is the rank of this vectorisation, $w_0 \in \mathbb{R}$, $w \in \mathbb{R}^p$ and $V \in \mathbb{R}^{p \times k}$ are model parameters to be learned. These parameters are learned via stochastic gradient ascent of the following pair-wise learning objective:
where $D$ is the set of (user u, time t, name n) tuples where u has browsed n before time t, $N$ is the set of all possible names and $\sigma$ is the sigmoid function. Only a single name is chosen from $N \setminus \{ n \}$ for each update. These negative samples are drawn according to their estimated rank (this part is quite difficult to do efficiently).
In the above, we have purposefully omitted the “prefix smoothing” step, since we are mainly interested in the a simple factorisation machine example. The details are in the paper.
Remarks
This recommender did very well in the challenge. However, I don’t find the examples given in the paper very impressive (though I have not seen the examples given by others). My suspicion is that the FM approach is very strong, but that there is no good way to make first name recommendations using the provided dataset. A more effective dataset would be e.g. first name x product interaction on a large e-commerce site. This would do a much better job of capturing the social meaning of names, but could go out of date very quickly.