Recommendation system in PHP using Matrix Factorization

A recommendation system is a system that can predict a rating for a user-item pair, where the user has never interacted with the item before.

In other words, if users can give preferences for items/products in the form of a star system (1-5 stars, like Netflix) or a thumbs-up/thumbs-down (like YouTube), one can build a rating matrix of users and items, where each cell holds a rating value if the user has rated that particular item.

User/product      James   Peter   Anna   Victoria
Blue pants        5       3       0      1
Red hat           4       0       0      1
Black shoes       1       1       0      5
Computer mouse    1       0       0      4
Yellow t-shirt    0       1       5      4

The table above is called a ratings matrix, where a value above 0 is a rating given by the user. A rating of 0 means the user has never rated the product.
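In PHP, this ratings matrix can be represented as a nested array (a sketch; the variable name is my own choice):

```php
<?php
// Ratings matrix from the table above.
// Rows: products, columns: users (James, Peter, Anna, Victoria).
// A 0 means the user has not rated the product yet.
$R = [
    'Blue pants'     => [5, 3, 0, 1],
    'Red hat'        => [4, 0, 0, 1],
    'Black shoes'    => [1, 1, 0, 5],
    'Computer mouse' => [1, 0, 0, 4],
    'Yellow t-shirt' => [0, 1, 5, 4],
];
```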

We now wish to fill in all the cells containing 0's – this is where matrix factorization is useful.
Let's do some maths.

We have a set of users U and a set of items D. Let R of size |U| \times |D| be the ratings matrix containing the ratings that the users have assigned to the items. We want to discover K latent factors.

We now need to find two matrices P (of size |U| \times K) and Q (of size |D| \times K) such that their product approximates R:

R \approx P \times Q^T = \hat{R}

So each row of P represents the strength of the association between a user and the latent factors, and each row of Q does the same for an item. To predict the rating of an item d_j by a user u_i, we calculate the dot product of the two vectors corresponding to u_i and d_j:

\hat{r}_{ij} = p_{i}^{T}q_j = \sum\limits_{k=1}^{K} p_{ik}q_{kj}
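The prediction above is just a dot product of two K-length factor vectors. In PHP it could look like this (a sketch; the function name is my own choice):

```php
<?php
// Predicted rating r̂_ij = dot product of the user-factor vector p_i
// and the item-factor vector q_j.
function predictRating(array $p, array $q): float
{
    $sum = 0.0;
    foreach ($p as $k => $value) {
        $sum += $value * $q[$k];
    }
    return $sum;
}

// Example with K = 2 latent factors:
echo predictRating([1.0, 2.0], [0.5, 0.25]); // prints 1
```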

We just need to find P and Q – there exist several ways to do this.

I’m going to use gradient descent: initialize the two matrices with some values, calculate how different their product is from R, and try to minimize that difference iteratively.

The squared error can be calculated by:

e_{ij}^2 = (r_{ij} - \hat{r}_{ij})^2 = (r_{ij} - \sum\limits_{k=1}^{K} p_{ik}q_{kj})^2

We want to minimize this error, so we need to know in which direction it decreases. We therefore compute the gradient of the squared error with respect to p_{ik} and q_{kj}:

\frac{\partial}{\partial p_{ik}} e_{ij}^2 = -2 (r_{ij} - \hat{r}_{ij}) q_{kj} = -2 e_{ij} q_{kj}
\frac{\partial}{\partial q_{kj}} e_{ij}^2 = -2 (r_{ij} - \hat{r}_{ij}) p_{ik} = -2 e_{ij} p_{ik}

We now know in which direction to go to minimize the error (i.e. the gradient) for both p_{ik} and q_{kj}, so we step in the opposite direction of the gradient:

p'_{ik} = p_{ik} - \alpha \frac{\partial}{\partial p_{ik}} e_{ij}^2 = p_{ik} + 2 \alpha e_{ij}q_{kj}
q'_{kj} = q_{kj} - \alpha \frac{\partial}{\partial q_{kj}} e_{ij}^2 = q_{kj} + 2 \alpha e_{ij}p_{ik}

Where \alpha is a constant that determines the rate of approaching the minimum (the learning rate), and should be a small value like \alpha = 0.0002. If this value is too large, we might step over the minimum and oscillate around it.
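A single update step for one known rating could be sketched in PHP like this, applying the unregularized rules above (function name and example values are my own choices):

```php
<?php
// One gradient-descent step for a single known rating r_ij:
// p'_ik = p_ik + 2*alpha*e_ij*q_kj, and symmetrically for q_kj.
function updateStep(array $p, array $q, float $r, float $alpha): array
{
    // e_ij = r_ij - p_i . q_j
    $eij = $r;
    foreach ($p as $k => $value) {
        $eij -= $value * $q[$k];
    }
    // PHP's foreach iterates over a copy, so $value stays the old p_ik
    // even though we overwrite $p[$k] inside the loop.
    foreach ($p as $k => $value) {
        $p[$k] = $value + 2 * $alpha * $eij * $q[$k];
        $q[$k] = $q[$k] + 2 * $alpha * $eij * $value;
    }
    return [$p, $q];
}

// One step moves the prediction p . q closer to the true rating 5.0.
list($p, $q) = updateStep([0.1, 0.2], [0.1, 0.3], 5.0, 0.01);
```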

Using the above update rules, we can iteratively perform the operation until convergence, measured by the total error over the training set T:

E = \sum\limits_{(u_i, d_j, r_{ij}) \in T} e_{ij}^2 = \sum\limits_{(u_i, d_j, r_{ij}) \in T} (r_{ij} - \sum\limits_{k=1}^{K} p_{ik}q_{kj})^2

Now we have the factorization done, which is fine – but we can run into a problem with over-fitting the model.
To avoid that, we introduce a regularization term:

e_{ij}^2 = (r_{ij}-\sum\limits_{k=1}^{K} p_{ik}q_{kj})^2 + \frac{\beta}{2} \sum\limits_{k=1}^{K}(p_{ik}^2 + q_{kj}^2)

Where \beta is used to control the magnitudes of the user-factor and item-factor vectors, such that P and Q give a good approximation of R without having to contain large numbers. In practice, \beta is set to a small value around 0.02.

So the new update rules are:

p'_{ik} = p_{ik} - \alpha \frac{\partial}{\partial p_{ik}} e_{ij}^2 = p_{ik} + \alpha (2e_{ij}q_{kj}- \beta p_{ik})
q'_{kj} = q_{kj} - \alpha \frac{\partial}{\partial q_{kj}} e_{ij}^2 = q_{kj} + \alpha (2e_{ij}p_{ik}- \beta q_{kj})

Implementation in PHP
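Below is a minimal PHP sketch of the factorization, directly following the regularized update rules derived above. Function and variable names are my own choices, and since P and Q are randomly initialized, the exact output varies between runs.

```php
<?php
// Matrix factorization via gradient descent with regularization.
function matrixFactorization(array $R, int $K, int $steps = 5000,
                             float $alpha = 0.0002, float $beta = 0.02): array
{
    $numRows = count($R);     // products (rows of the ratings table)
    $numCols = count($R[0]);  // users (columns)

    // Initialize P and Q with random values in [0, 1].
    $P = [];
    $Q = [];
    for ($i = 0; $i < $numRows; $i++)
        for ($k = 0; $k < $K; $k++)
            $P[$i][$k] = mt_rand() / mt_getrandmax();
    for ($j = 0; $j < $numCols; $j++)
        for ($k = 0; $k < $K; $k++)
            $Q[$j][$k] = mt_rand() / mt_getrandmax();

    for ($step = 0; $step < $steps; $step++) {
        // Apply the regularized update rules for every known rating.
        for ($i = 0; $i < $numRows; $i++) {
            for ($j = 0; $j < $numCols; $j++) {
                if ($R[$i][$j] <= 0) continue;   // skip unknown ratings
                // e_ij = r_ij - p_i . q_j
                $eij = $R[$i][$j];
                for ($k = 0; $k < $K; $k++)
                    $eij -= $P[$i][$k] * $Q[$j][$k];
                for ($k = 0; $k < $K; $k++) {
                    $pik = $P[$i][$k];           // keep old value for Q's update
                    $P[$i][$k] += $alpha * (2 * $eij * $Q[$j][$k] - $beta * $pik);
                    $Q[$j][$k] += $alpha * (2 * $eij * $pik - $beta * $Q[$j][$k]);
                }
            }
        }
        // Total regularized error E over all known ratings.
        $e = 0.0;
        for ($i = 0; $i < $numRows; $i++) {
            for ($j = 0; $j < $numCols; $j++) {
                if ($R[$i][$j] <= 0) continue;
                $dot = 0.0;
                for ($k = 0; $k < $K; $k++)
                    $dot += $P[$i][$k] * $Q[$j][$k];
                $e += pow($R[$i][$j] - $dot, 2);
                for ($k = 0; $k < $K; $k++)
                    $e += ($beta / 2) * (pow($P[$i][$k], 2) + pow($Q[$j][$k], 2));
            }
        }
        if ($e < 0.001) break;                   // converged
    }
    return [$P, $Q];
}

// Ratings matrix from the table above (rows: products, columns: users).
$R = [
    [5, 3, 0, 1],   // Blue pants
    [4, 0, 0, 1],   // Red hat
    [1, 1, 0, 5],   // Black shoes
    [1, 0, 0, 4],   // Computer mouse
    [0, 1, 5, 4],   // Yellow t-shirt
];

list($P, $Q) = matrixFactorization($R, 2);

// Print the approximated ratings matrix R̂ = P * Q^T (values vary per run).
foreach ($P as $pRow) {
    $line = [];
    foreach ($Q as $qRow) {
        $dot = 0.0;
        for ($k = 0; $k < count($pRow); $k++)
            $dot += $pRow[$k] * $qRow[$k];
        $line[] = sprintf('%.2f', $dot);
    }
    echo implode("\t", $line), "\n";
}
```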

Running the above PHP implementation you will get output like:

User/product      James   Peter   Anna   Victoria
Blue pants        5.01    2.87    4.92   0.99
Red hat           3.94    2.25    4.02   0.99
Black shoes       1.10    0.73    4.63   4.95
Computer mouse    0.94    0.62    3.76   3.97
Yellow t-shirt    2.22    1.35    4.88   4.04

All the cells that were 0's in the original ratings matrix now contain predicted ratings.
The only thing you need to do is to use some kind of similarity function (Tanimoto, Pearson, k-NN) to decide which items to recommend.
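As one illustration of such a similarity function, a Pearson correlation between two item rows of the predicted matrix could be computed like this (a sketch; the choice of Pearson and the function name are my own):

```php
<?php
// Pearson correlation between two equal-length rating vectors.
// Returns a value in [-1, 1]; values near 1 mean the items are
// rated similarly across users.
function pearson(array $a, array $b): float
{
    $n = count($a);
    $meanA = array_sum($a) / $n;
    $meanB = array_sum($b) / $n;
    $num = 0.0; $denA = 0.0; $denB = 0.0;
    for ($i = 0; $i < $n; $i++) {
        $da = $a[$i] - $meanA;
        $db = $b[$i] - $meanB;
        $num  += $da * $db;
        $denA += $da * $da;
        $denB += $db * $db;
    }
    if ($denA == 0.0 || $denB == 0.0) return 0.0;   // flat vector: undefined
    return $num / sqrt($denA * $denB);
}

// Two items rated similarly across four users correlate close to 1.
echo pearson([5, 3, 5, 1], [4, 2, 4, 1]);
```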
(I have to stop this guide now – it’s already too long.) I hope you gained some knowledge from this, or at least that you can use my implementation in your system.

The PHP implementation is written by me, but it is ported from a Python implementation found here
