参考这一页:
https://github.com/PaddlePaddle/book/tree/develop/05.recommender_system
Some well know approaches include:
- User behavior-based approach. A well-known method is collaborative filtering. The underlying assumption is that if a person A has the same opinion as a person B on an issue, A is more likely to have B‘s opinion on a different issue than that of a randomly chosen person.
- Content-based recommendation[1]. This approach infers feature vectors that represent products from their descriptions. It also infers feature vectors that represent users‘ interests. Then it measures the relevance of users and products by some distances between these feature vectors.
- Hybrid approach[2]: This approach uses the content-based information to help address the cold start problem[6] in behavior-based approach.
Among these options, collaborative filtering might be the most studied one. Some of its variants include user-based[3], item-based [4], social network based[5], and model-based.
We use the MovieLens ml-1m to train our model. This dataset includes 10,000 ratings of 4,000 movies from 6,000 users to 4,000 movies. Each rate is in the range of 1~5. Thanks to GroupLens Research for collecting, processing and publishing the dataset.
paddle.v2.datasets
package encapsulates multiple public datasets, including cifar
, imdb
, mnist
, moivelens
and wmt14
, etc. There‘s no need for us to manually download and preprocess MovieLens
dataset.
时间: 2024-10-24 05:33:42