Here we derive update rules for approximating a row-stochastic matrix by the product of two lower-rank row-stochastic matrices using gradient descent. Such a factorisation corresponds to the decomposition
$$ p(n|m) = \sum_k p(n|k) \cdot p(k|m) $$
Both the sum-of-squares and the row-wise cross-entropy objective functions are considered.
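
To make the setting concrete, the following is a minimal sketch in plain Python rather than part of the derivation itself. It assumes the indexing convention `A[m][k]` for $p(k|m)$ and `B[k][n]` for $p(n|k)$, so that the product `Q = A B` plays the role of $p(n|m)$. The helper names, the matrix sizes, and the exact form of the two objectives (in particular the cross-entropy, taken here as $-\sum_m \sum_n P_{mn} \log Q_{mn}$) are illustrative assumptions. The sketch checks numerically that the product of two row-stochastic matrices is again row stochastic and evaluates both objectives against a random target `P`.

```python
import math
import random


def random_row_stochastic(rows, cols, rng):
    """Return a rows x cols matrix with positive entries whose rows sum to 1."""
    matrix = []
    for _ in range(rows):
        # Small offset keeps every entry strictly positive, so logs are defined.
        weights = [rng.random() + 1e-9 for _ in range(cols)]
        total = sum(weights)
        matrix.append([w / total for w in weights])
    return matrix


def matmul(a, b):
    """Plain matrix product of two nested-list matrices."""
    inner, cols = len(b), len(b[0])
    return [[sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for row in a]


def sum_of_squares(p, q):
    """Assumed objective: sum over all entries of (P_mn - Q_mn)^2."""
    return sum((x - y) ** 2 for rp, rq in zip(p, q) for x, y in zip(rp, rq))


def row_cross_entropy(p, q):
    """Assumed objective: -sum_m sum_n P_mn * log(Q_mn)."""
    return -sum(x * math.log(y)
                for rp, rq in zip(p, q) for x, y in zip(rp, rq) if x > 0)


rng = random.Random(0)
M, K, N = 5, 2, 4                      # hypothetical sizes, with K < min(M, N)
P = random_row_stochastic(M, N, rng)   # target matrix, rows are p(n|m)
A = random_row_stochastic(M, K, rng)   # rows are p(k|m)
B = random_row_stochastic(K, N, rng)   # rows are p(n|k)
Q = matmul(A, B)                       # Q[m][n] = sum_k p(n|k) * p(k|m)

row_sums = [sum(row) for row in Q]
print("row sums of A B:", row_sums)
print("sum of squares :", sum_of_squares(P, Q))
print("cross-entropy  :", row_cross_entropy(P, Q))
```

Each row of `Q` is a convex combination of the rows of `B`, which is why its rows sum to one without any further normalisation.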