Music Recommendation with Turicreate
!pip install turicreate
Successfully installed absl-py-0.9.0 astunparse-1.6.3 coremltools-3.3 gast-0.3.3 google-auth-1.17.2 google-auth-oauthlib-0.4.1 google-pasta-0.2.0 grpcio-1.29.0 keras-preprocessing-1.1.2 markdown-3.2.2 oauthlib-3.1.0 opt-einsum-3.2.1 protobuf-3.12.2 pyasn1-0.4.8 pyasn1-modules-0.2.8 requests-oauthlib-1.3.0 resampy-0.2.1 rsa-4.6 tensorboard-2.2.2 tensorboard-plugin-wit-1.6.0.post3 tensorflow-2.2.0 tensorflow-estimator-2.2.0 termcolor-1.1.0 turicreate-6.3
import turicreate as tc
Download the data: http://s3.amazonaws.com/dato-datasets/millionsong/10000.txt
#train_file = 'http://s3.amazonaws.com/dato-datasets/millionsong/10000.txt'
train_file = '/Users/datalab/bigdata/cjc/millionsong/song_usage_10000.txt'
sf = tc.SFrame.read_csv(train_file, header=False, delimiter='\t', verbose=False)
sf = sf.rename({'X1':'user_id', 'X2':'music_id', 'X3':'rating'})
train_set, test_set = sf.random_split(0.8, seed=1)
popularity_model = tc.popularity_recommender.create(
    train_set, 'user_id', 'music_id',
    target='rating')
Preparing data set.
Data has 1599753 observations with 76085 users and 10000 items.
Data prepared in: 4.15079s
1599753 observations to process; with 10000 unique items.
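To make the idea behind a popularity baseline concrete, here is a minimal pure-Python sketch. It assumes that, when a target column is given, each item is scored by its mean rating and every user receives the same ranking (minus items they have already rated). The function names and the toy data are hypothetical and are not part of Turicreate; the library's actual scoring may differ in detail.

```python
from collections import defaultdict


def popularity_scores(ratings):
    """Return {item: mean rating} from (user, item, rating) triples."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for _user, item, rating in ratings:
        totals[item] += rating
        counts[item] += 1
    return {item: totals[item] / counts[item] for item in totals}


def recommend_popular(ratings, user, k=2):
    """Top-k items by mean rating that `user` has not rated yet."""
    seen = {item for u, item, _ in ratings if u == user}
    scores = popularity_scores(ratings)
    # Sort by descending score; break ties by item id for determinism.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked if item not in seen][:k]


# Hypothetical toy data: (user_id, music_id, rating)
toy = [("u1", "a", 5), ("u1", "b", 1), ("u2", "a", 3),
       ("u2", "c", 4), ("u3", "b", 2)]
print(recommend_popular(toy, "u3", k=2))
```

Because the scores do not depend on the user, this kind of model is mainly useful as a baseline for the personalised models that follow.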
item_sim_model = tc.item_similarity_recommender.create(
    train_set, 'user_id', 'music_id',
    target='rating',
    similarity_type='cosine')
Preparing data set.
Data has 1599753 observations with 76085 users and 10000 items.
Data prepared in: 3.7942s
Training model from provided data.
Gathering per-item and per-user statistics.
+--------------------------------+------------+
| Elapsed Time (Item Statistics) | % Complete |
+--------------------------------+------------+
| 7.569ms | 2.5 |
| 90.88ms | 100 |
+--------------------------------+------------+
Setting up lookup tables.
Processing data in one pass using dense lookup tables.
+-------------------------------------+------------------+-----------------+
| Elapsed Time (Constructing Lookups) | Total % Complete | Items Processed |
+-------------------------------------+------------------+-----------------+
| 605.016ms | 0 | 0 |
| 4.01s | 100 | 10000 |
+-------------------------------------+------------------+-----------------+
Finalizing lookup tables.
Generating candidate set for working with new users.
Finished training in 5.31028s
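The `similarity_type='cosine'` option compares items by the ratings they received from the same users. As a rough illustration only, the sketch below computes cosine similarity between two items in pure Python, treating a missing rating as zero. The helper names and toy data are hypothetical; the library builds its lookup tables far more efficiently than this.

```python
import math
from collections import defaultdict


def item_vectors(ratings):
    """Map each item to a sparse vector {user: rating}."""
    vectors = defaultdict(dict)
    for user, item, rating in ratings:
        vectors[item][user] = rating
    return vectors


def cosine(vec_a, vec_b):
    """Cosine similarity of two sparse vectors; missing entries count as 0."""
    shared = set(vec_a) & set(vec_b)
    dot = sum(vec_a[u] * vec_b[u] for u in shared)
    norm_a = math.sqrt(sum(v * v for v in vec_a.values()))
    norm_b = math.sqrt(sum(v * v for v in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical toy data: items "a" and "b" share listeners, "c" does not.
toy = [("u1", "a", 1), ("u1", "b", 1), ("u2", "a", 1),
       ("u2", "b", 1), ("u3", "c", 2)]
vecs = item_vectors(toy)
sim_ab = cosine(vecs["a"], vecs["b"])  # same listeners, same ratings
sim_ac = cosine(vecs["a"], vecs["c"])  # no listeners in common
print(sim_ab, sim_ac)
```

An item-based recommender can then score an unseen item for a user by combining its similarities to the items that user has already rated.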
factorization_machine_model = tc.recommender.factorization_recommender.create(
    train_set, 'user_id', 'music_id',
    target='rating')
Preparing data set.
Data has 1599753 observations with 76085 users and 10000 items.
Data prepared in: 4.32575s
Training factorization_recommender for recommendations.
+--------------------------------+--------------------------------------------------+----------+
| Parameter | Description | Value |
+--------------------------------+--------------------------------------------------+----------+
| num_factors | Factor Dimension | 8 |
| regularization | L2 Regularization on Factors | 1e-08 |
| solver | Solver used for training | sgd |
| linear_regularization | L2 Regularization on Linear Coefficients | 1e-10 |
| max_iterations | Maximum Number of Iterations | 50 |
+--------------------------------+--------------------------------------------------+----------+
Optimizing model using SGD; tuning step size.
Using 199969 / 1599753 points for tuning the step size.
+---------+-------------------+------------------------------------------+
| Attempt | Initial Step Size | Estimated Objective Value |
+---------+-------------------+------------------------------------------+
| 0 | 25 | No Decrease (230.933 >= 43.5401) |
| 1 | 6.25 | No Decrease (219.447 >= 43.5401) |
| 2 | 1.5625 | No Decrease (191.895 >= 43.5401) |
| 3 | 0.390625 | No Decrease (89.356 >= 43.5401) |
| 4 | 0.0976562 | 16.0024 |
| 5 | 0.0488281 | 11.4371 |
| 6 | 0.0244141 | 24.5498 |
+---------+-------------------+------------------------------------------+
| Final | 0.0488281 | 11.4371 |
+---------+-------------------+------------------------------------------+
Starting Optimization.
+---------+--------------+-------------------+-----------------------+-------------+
| Iter. | Elapsed Time | Approx. Objective | Approx. Training RMSE | Step Size |
+---------+--------------+-------------------+-----------------------+-------------+
| Initial | 387us | 43.795 | 6.61778 | |
+---------+--------------+-------------------+-----------------------+-------------+
| 1 | 681.41ms | 43.5465 | 6.59858 | 0.0488281 |
| 2 | 1.25s | 40.8911 | 6.39426 | 0.0290334 |
| 3 | 1.97s | 37.9926 | 6.16345 | 0.0214205 |
| 4 | 2.74s | 35.4229 | 5.95132 | 0.0172633 |
| 5 | 3.26s | 32.7792 | 5.72487 | 0.014603 |
| 10 | 5.91s | 24.5046 | 4.94956 | 0.008683 |
| 15 | 9.07s | 20.0943 | 4.48185 | 0.00640622 |
| 20 | 11.83s | 17.639 | 4.19895 | 0.00516295 |
| 25 | 14.27s | 15.7055 | 3.96197 | 0.00436732 |
| 30 | 16.57s | 14.3953 | 3.79299 | 0.00380916 |
| 35 | 18.71s | 13.3639 | 3.65445 | 0.00339327 |
| 40 | 20.92s | 12.5027 | 3.53463 | 0.00306991 |
| 45 | 23.89s | 11.8108 | 3.43534 | 0.00281035 |
| 50 | 26.34s | 9.85419 | 3.13763 | 0.00154408 |
+---------+--------------+-------------------+-----------------------+-------------+
Optimization Complete: Maximum number of passes through the data reached.
Computing final objective value and training RMSE.
Final objective value: 8.8282
Final training RMSE: 2.96963
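The training log above reports an SGD solver, a factor dimension, L2 regularization, and a training RMSE that falls with each pass. The following is a simplified stand-in, not the library's implementation: a small matrix factorization with user and item biases trained by SGD on hypothetical toy data. All names (`train_mf`, `rmse`, `toy`) and the fixed step size are assumptions made for illustration.

```python
import math
import random


def rmse(ratings, predict):
    """Root mean squared error of `predict` over (user, item, rating) triples."""
    sq = sum((r - predict(u, i)) ** 2 for u, i, r in ratings)
    return math.sqrt(sq / len(ratings))


def train_mf(ratings, num_factors=2, step=0.05, reg=1e-8, epochs=100, seed=0):
    """Fit rating ~ mu + b_user + b_item + p_user . q_item with plain SGD."""
    rng = random.Random(seed)
    mu = sum(r for _, _, r in ratings) / len(ratings)
    users = sorted({u for u, _, _ in ratings})
    items = sorted({i for _, i, _ in ratings})
    bu = {u: 0.0 for u in users}
    bi = {i: 0.0 for i in items}
    P = {u: [rng.uniform(-0.1, 0.1) for _ in range(num_factors)] for u in users}
    Q = {i: [rng.uniform(-0.1, 0.1) for _ in range(num_factors)] for i in items}

    def predict(u, i):
        return mu + bu[u] + bi[i] + sum(a * b for a, b in zip(P[u], Q[i]))

    history = [rmse(ratings, predict)]  # RMSE before any update
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - predict(u, i)
            bu[u] += step * err
            bi[i] += step * err
            for f in range(num_factors):
                pu, qi = P[u][f], Q[i][f]
                P[u][f] += step * (err * qi - reg * pu)
                Q[i][f] += step * (err * pu - reg * qi)
        history.append(rmse(ratings, predict))
    return predict, history


# Hypothetical toy data: (user_id, music_id, rating)
toy = [("u1", "a", 5), ("u1", "b", 3), ("u2", "a", 4),
       ("u2", "c", 1), ("u3", "b", 2), ("u3", "c", 1)]
predict, history = train_mf(toy)
print(history[0], history[-1])
```

Unlike the log above, this sketch keeps the step size fixed; the library tunes an initial step size and decays it over iterations.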
len(train_set)
1599753
result = tc.recommender.util.compare_models(
    test_set,
    [popularity_model, item_sim_model, factorization_machine_model],
    user_sample=.5, skip_set=train_set)
compare_models: using 34354 users to estimate model performance
PROGRESS: Evaluate model M0
recommendations finished on 1000/34354 queries. users per second: 5393.6
recommendations finished on 2000/34354 queries. users per second: 5901.05
recommendations finished on 3000/34354 queries. users per second: 5891.65
recommendations finished on 4000/34354 queries. users per second: 5752.93
recommendations finished on 5000/34354 queries. users per second: 5841.69
recommendations finished on 6000/34354 queries. users per second: 5762.33
recommendations finished on 7000/34354 queries. users per second: 5834.76
recommendations finished on 8000/34354 queries. users per second: 5904.72
recommendations finished on 9000/34354 queries. users per second: 5766.33
recommendations finished on 10000/34354 queries. users per second: 5748.05
recommendations finished on 11000/34354 queries. users per second: 5619.56
recommendations finished on 12000/34354 queries. users per second: 5600.83
recommendations finished on 13000/34354 queries. users per second: 5659.63
recommendations finished on 14000/34354 queries. users per second: 5537.91
recommendations finished on 15000/34354 queries. users per second: 5566.55
recommendations finished on 16000/34354 queries. users per second: 5566.55
recommendations finished on 17000/34354 queries. users per second: 5541.39
recommendations finished on 18000/34354 queries. users per second: 5537.43
recommendations finished on 19000/34354 queries. users per second: 5494.31
recommendations finished on 20000/34354 queries. users per second: 5540.8
recommendations finished on 21000/34354 queries. users per second: 5567.68
recommendations finished on 22000/34354 queries. users per second: 5596
recommendations finished on 23000/34354 queries. users per second: 5594.48
recommendations finished on 24000/34354 queries. users per second: 5551.09
recommendations finished on 25000/34354 queries. users per second: 5561.67
recommendations finished on 26000/34354 queries. users per second: 5526.95
recommendations finished on 27000/34354 queries. users per second: 5465.2
recommendations finished on 28000/34354 queries. users per second: 5437.18
recommendations finished on 29000/34354 queries. users per second: 5444.31
recommendations finished on 30000/34354 queries. users per second: 5452.98
recommendations finished on 31000/34354 queries. users per second: 5430.37
recommendations finished on 32000/34354 queries. users per second: 5407.57
recommendations finished on 33000/34354 queries. users per second: 5386.39
recommendations finished on 34000/34354 queries. users per second: 5405.75
Precision and recall summary statistics by cutoff
+--------+------------------------+------------------------+
| cutoff | mean_precision | mean_recall |
+--------+------------------------+------------------------+
| 1 | 0.00040752168597543237 | 7.081875226074056e-05 |
| 2 | 0.0004075216859754322 | 0.00011226035987334286 |
| 3 | 0.0003104927131241391 | 0.00012739048212345128 |
| 4 | 0.00034202712930080884 | 0.0002353030396709293 |
| 5 | 0.00046573906968620603 | 0.00046444872494981606 |
| 6 | 0.0004414818264733844 | 0.0005228618282305431 |
| 7 | 0.00043247199328005 | 0.0005736297245772213 |
| 8 | 0.0004075216859754325 | 0.0006087229198555468 |
| 9 | 0.0004528018733060353 | 0.0007391431080220185 |
| 10 | 0.00043080863945974164 | 0.0007735092654785608 |
+--------+------------------------+------------------------+
[10 rows x 3 columns]
Overall RMSE: 5.9110406201585715
Per User RMSE (best)
+-------------------------------+------+-------+
| user_id | rmse | count |
+-------------------------------+------+-------+
| cafbf96566378466408b7b3c76... | 0.0 | 1 |
+-------------------------------+------+-------+
[1 rows x 3 columns]
Per User RMSE (worst)
+-------------------------------+-------------------+-------+
| user_id | rmse | count |
+-------------------------------+-------------------+-------+
| 38767872c514c1b43bab5c7b21... | 341.2071760874715 | 2 |
+-------------------------------+-------------------+-------+
[1 rows x 3 columns]
Per Item RMSE (best)
+--------------------+---------------------+-------+
| music_id | rmse | count |
+--------------------+---------------------+-------+
| SOXDPFW12A81C2319B | 0.07352941176470584 | 5 |
+--------------------+---------------------+-------+
[1 rows x 3 columns]
Per Item RMSE (worst)
+--------------------+--------------------+-------+
| music_id | rmse | count |
+--------------------+--------------------+-------+
| SOPKTFQ12A67021600 | 124.75180499529567 | 9 |
+--------------------+--------------------+-------+
[1 rows x 3 columns]
PROGRESS: Evaluate model M1
recommendations finished on 1000/34354 queries. users per second: 5825.84
recommendations finished on 2000/34354 queries. users per second: 5301.82
recommendations finished on 3000/34354 queries. users per second: 5419.21
recommendations finished on 4000/34354 queries. users per second: 5627.83
recommendations finished on 5000/34354 queries. users per second: 5771.93
recommendations finished on 6000/34354 queries. users per second: 5585.3
recommendations finished on 7000/34354 queries. users per second: 5365.4
recommendations finished on 8000/34354 queries. users per second: 5147.5
recommendations finished on 9000/34354 queries. users per second: 5252.85
recommendations finished on 10000/34354 queries. users per second: 5301.11
recommendations finished on 11000/34354 queries. users per second: 5257.75
recommendations finished on 12000/34354 queries. users per second: 5181.17
recommendations finished on 13000/34354 queries. users per second: 5139.25
recommendations finished on 14000/34354 queries. users per second: 5155.66
recommendations finished on 15000/34354 queries. users per second: 4967.57
recommendations finished on 16000/34354 queries. users per second: 4921.83
recommendations finished on 17000/34354 queries. users per second: 4990.4
recommendations finished on 18000/34354 queries. users per second: 5068.35
recommendations finished on 19000/34354 queries. users per second: 5140.77
recommendations finished on 20000/34354 queries. users per second: 5213.8
recommendations finished on 21000/34354 queries. users per second: 5184.08
recommendations finished on 22000/34354 queries. users per second: 5094.51
recommendations finished on 23000/34354 queries. users per second: 5146.65
recommendations finished on 24000/34354 queries. users per second: 5170.89
recommendations finished on 25000/34354 queries. users per second: 5200.77
recommendations finished on 26000/34354 queries. users per second: 5183.28
recommendations finished on 27000/34354 queries. users per second: 5222.37
recommendations finished on 28000/34354 queries. users per second: 5251.88
recommendations finished on 29000/34354 queries. users per second: 5272.72
recommendations finished on 30000/34354 queries. users per second: 5288.9
recommendations finished on 31000/34354 queries. users per second: 5280.69
recommendations finished on 32000/34354 queries. users per second: 5272.44
recommendations finished on 33000/34354 queries. users per second: 5320.29
recommendations finished on 34000/34354 queries. users per second: 5353.61
Precision and recall summary statistics by cutoff
+--------+----------------------+----------------------+
| cutoff | mean_precision | mean_recall |
+--------+----------------------+----------------------+
| 1 | 0.12356639692612195 | 0.027394188372094497 |
| 2 | 0.10617395354252777 | 0.0450683741242837 |
| 3 | 0.09521453105897464 | 0.059418188595173727 |
| 4 | 0.08673662455609242 | 0.07079052813704197 |
| 5 | 0.08053792862548775 | 0.08078157252034639 |
| 6 | 0.07570685606722138 | 0.09033226689608059 |
| 7 | 0.07142441304402039 | 0.09836168110578075 |
| 8 | 0.06780869767712655 | 0.10631301678527204 |
| 9 | 0.06465687320900597 | 0.1134367927631828 |
| 10 | 0.061861791931070574 | 0.11986145197856829 |
+--------+----------------------+----------------------+
[10 rows x 3 columns]
Overall RMSE: 6.6935880472475
Per User RMSE (best)
+-------------------------------+--------------------+-------+
| user_id | rmse | count |
+-------------------------------+--------------------+-------+
| f015c8ec1487d172a76e8af6fd... | 0.0991658329963685 | 1 |
+-------------------------------+--------------------+-------+
[1 rows x 3 columns]
Per User RMSE (worst)
+-------------------------------+-------------------+-------+
| user_id | rmse | count |
+-------------------------------+-------------------+-------+
| 38767872c514c1b43bab5c7b21... | 346.8117700274172 | 2 |
+-------------------------------+-------------------+-------+
[1 rows x 3 columns]
Per Item RMSE (best)
+--------------------+--------------------+-------+
| music_id | rmse | count |
+--------------------+--------------------+-------+
| SOAVQRP12A8C13120B | 0.8293558937901314 | 4 |
+--------------------+--------------------+-------+
[1 rows x 3 columns]
Per Item RMSE (worst)
+--------------------+-------------------+-------+
| music_id | rmse | count |
+--------------------+-------------------+-------+
| SOPKTFQ12A67021600 | 128.8226138396836 | 9 |
+--------------------+-------------------+-------+
[1 rows x 3 columns]
PROGRESS: Evaluate model M2
recommendations finished on 1000/34354 queries. users per second: 5658.25
recommendations finished on 2000/34354 queries. users per second: 6360.82
recommendations finished on 3000/34354 queries. users per second: 6111.59
recommendations finished on 4000/34354 queries. users per second: 5757.38
recommendations finished on 5000/34354 queries. users per second: 5756.7
recommendations finished on 6000/34354 queries. users per second: 5927.31
recommendations finished on 7000/34354 queries. users per second: 6043.27
recommendations finished on 8000/34354 queries. users per second: 6053.29
recommendations finished on 9000/34354 queries. users per second: 6008.53
recommendations finished on 10000/34354 queries. users per second: 6029.76
recommendations finished on 11000/34354 queries. users per second: 5973.13
recommendations finished on 12000/34354 queries. users per second: 5982.24
recommendations finished on 13000/34354 queries. users per second: 5945.1
recommendations finished on 14000/34354 queries. users per second: 5940.6
recommendations finished on 15000/34354 queries. users per second: 5976.12
recommendations finished on 16000/34354 queries. users per second: 5915.28
recommendations finished on 17000/34354 queries. users per second: 5818.41
recommendations finished on 18000/34354 queries. users per second: 5838.5
recommendations finished on 19000/34354 queries. users per second: 5876.56
recommendations finished on 20000/34354 queries. users per second: 5943.82
recommendations finished on 21000/34354 queries. users per second: 5952.95
recommendations finished on 22000/34354 queries. users per second: 5934.66
recommendations finished on 23000/34354 queries. users per second: 5937.77
recommendations finished on 24000/34354 queries. users per second: 5984.55
recommendations finished on 25000/34354 queries. users per second: 5983.3
recommendations finished on 26000/34354 queries. users per second: 5979.39
recommendations finished on 27000/34354 queries. users per second: 5971.13
recommendations finished on 28000/34354 queries. users per second: 6012.81
recommendations finished on 29000/34354 queries. users per second: 6001.44
recommendations finished on 30000/34354 queries. users per second: 6024.65
recommendations finished on 31000/34354 queries. users per second: 5997.94
recommendations finished on 32000/34354 queries. users per second: 5964.91
recommendations finished on 33000/34354 queries. users per second: 5996.43
recommendations finished on 34000/34354 queries. users per second: 6000.34
Precision and recall summary statistics by cutoff
+--------+-----------------------+------------------------+
| cutoff | mean_precision | mean_recall |
+--------+-----------------------+------------------------+
| 1 | 0.000611282528963149 | 0.00015967112604116908 |
| 2 | 0.0005094021074692894 | 0.00020969792246105614 |
| 3 | 0.0005142535561118539 | 0.0002897975387832757 |
| 4 | 0.0005312336263608328 | 0.0004549968288297451 |
| 5 | 0.0005414216685102181 | 0.0005703895459784241 |
| 6 | 0.000548213696609806 | 0.0007142050704628652 |
| 7 | 0.0005738570680062202 | 0.0008314573503625282 |
| 8 | 0.0005930895965535321 | 0.0009602957110903129 |
| 9 | 0.0006015796316780176 | 0.0011215295194248998 |
| 10 | 0.0005967281830354553 | 0.0012125256843862468 |
+--------+-----------------------+------------------------+
[10 rows x 3 columns]
Overall RMSE: 7.641661679566184
Per User RMSE (best)
+-------------------------------+-----------------------+-------+
| user_id | rmse | count |
+-------------------------------+-----------------------+-------+
| 220f26b368277020c4e685351a... | 5.398923054400484e-06 | 1 |
+-------------------------------+-----------------------+-------+
[1 rows x 3 columns]
Per User RMSE (worst)
+-------------------------------+-------------------+-------+
| user_id | rmse | count |
+-------------------------------+-------------------+-------+
| 38767872c514c1b43bab5c7b21... | 360.3407134339816 | 2 |
+-------------------------------+-------------------+-------+
[1 rows x 3 columns]
Per Item RMSE (best)
+--------------------+----------------------+-------+
| music_id | rmse | count |
+--------------------+----------------------+-------+
| SOAESGK12A8C138488 | 0.028188730543956098 | 1 |
+--------------------+----------------------+-------+
[1 rows x 3 columns]
Per Item RMSE (worst)
+--------------------+--------------------+-------+
| music_id | rmse | count |
+--------------------+--------------------+-------+
| SOPKTFQ12A67021600 | 126.00562214029682 | 9 |
+--------------------+--------------------+-------+
[1 rows x 3 columns]
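The precision and recall tables above are reported per cutoff. As a reminder of what those columns mean, here is a minimal sketch for a single user, assuming the common definitions: precision at k is the share of the top-k recommendations that appear in the user's held-out items, and recall at k is the share of held-out items that appear in the top k. The function name and toy lists are hypothetical.

```python
def precision_recall_at_k(recommended, relevant, k):
    """Precision and recall of the top-k recommended items for one user.

    recommended: list of item ids, best first.
    relevant:    set of item ids the user interacted with in the test set.
    """
    top_k = recommended[:k]
    hits = sum(1 for item in top_k if item in relevant)
    precision = hits / k
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical example: one of the top-2 recommendations is relevant.
recommended = ["a", "b", "c", "d"]
relevant = {"b", "d", "e"}
p, r = precision_recall_at_k(recommended, relevant, k=2)
print(p, r)
```

Averaging these per-user values over the sampled users gives figures comparable in meaning to the `mean_precision` and `mean_recall` columns.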
K = 10
users = tc.SArray(sf['user_id'].unique().head(100))  # tc, not gl: only turicreate is imported
recs = item_sim_model.recommend(users=users, k=K)
recs.head()
user_id | music_id | score | rank |
---|---|---|---|
279292bb36dbfc7f505e36ebf 038c81eb1d1d63e ... | SOXUQNR12AF72A69D6 | 3.022422651449839 | 1 |
279292bb36dbfc7f505e36ebf 038c81eb1d1d63e ... | SOUFAZA12AC3DFAB20 | 1.3368427753448486 | 2 |
279292bb36dbfc7f505e36ebf 038c81eb1d1d63e ... | SOSFSTC12A8C141219 | 1.091982126235962 | 3 |
279292bb36dbfc7f505e36ebf 038c81eb1d1d63e ... | SOVIWFP12A58A7D1BD | 1.045163869857788 | 4 |
279292bb36dbfc7f505e36ebf 038c81eb1d1d63e ... | SOBMTQD12AB01833D0 | 1.0294516881306965 | 5 |
279292bb36dbfc7f505e36ebf 038c81eb1d1d63e ... | SOCMNRG12AB0189D3F | 0.9756437937418619 | 6 |
279292bb36dbfc7f505e36ebf 038c81eb1d1d63e ... | SOXOHUM12A67ADC826 | 0.9506873289744059 | 7 |
279292bb36dbfc7f505e36ebf 038c81eb1d1d63e ... | SOWBFVW12A6D4F612B | 0.9092370669047037 | 8 |
279292bb36dbfc7f505e36ebf 038c81eb1d1d63e ... | SOXFYTY127E9433E7D | 0.8977278073628744 | 9 |
279292bb36dbfc7f505e36ebf 038c81eb1d1d63e ... | SOYBLYP12A58A79D32 | 0.8970928192138672 | 10 |