Music Recommendation with Turi Create

!pip install turicreate
Successfully installed absl-py-0.9.0 astunparse-1.6.3 coremltools-3.3 gast-0.3.3 google-auth-1.17.2 google-auth-oauthlib-0.4.1 google-pasta-0.2.0 grpcio-1.29.0 keras-preprocessing-1.1.2 markdown-3.2.2 oauthlib-3.1.0 opt-einsum-3.2.1 protobuf-3.12.2 pyasn1-0.4.8 pyasn1-modules-0.2.8 requests-oauthlib-1.3.0 resampy-0.2.1 rsa-4.6 tensorboard-2.2.2 tensorboard-plugin-wit-1.6.0.post3 tensorflow-2.2.0 tensorflow-estimator-2.2.0 termcolor-1.1.0 turicreate-6.3
import turicreate as tc

Download the data: http://s3.amazonaws.com/dato-datasets/millionsong/10000.txt

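# Remote copy of the data; uncomment to read it directly from S3.
#train_file = 'http://s3.amazonaws.com/dato-datasets/millionsong/10000.txt'
train_file = '/Users/datalab/bigdata/cjc/millionsong/song_usage_10000.txt'
# The file is tab-separated with no header row, so columns load as X1, X2, X3.
sf = tc.SFrame.read_csv(train_file, header=False, delimiter='\t', verbose=False)
sf = sf.rename({'X1':'user_id', 'X2':'music_id', 'X3':'rating'})
# Hold out roughly 20% of the rows for evaluation; the seed makes the split repeatable.
train_set, test_set = sf.random_split(0.8, seed=1)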
popularity_model = tc.popularity_recommender.create(train_set, 
                                                    'user_id', 'music_id', 
                                                    target = 'rating')
Preparing data set.
    Data has 1599753 observations with 76085 users and 10000 items.
    Data prepared in: 4.15079s
1599753 observations to process; with 10000 unique items.
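For intuition, a popularity recommender ignores who the user is and ranks items by an aggregate statistic. The sketch below is plain Python rather than Turi Create's implementation. It assumes items are scored by their mean rating in the training rows, and the function names and toy data are hypothetical.

```python
from collections import defaultdict

def popularity_scores(rows):
    """Map each item to its mean rating over (user, item, rating) rows."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for _user, item, rating in rows:
        totals[item] += rating
        counts[item] += 1
    return {item: totals[item] / counts[item] for item in totals}

def recommend_popular(rows, user, k):
    """Top-k items by mean rating, skipping items the user already has."""
    seen = {item for u, item, _ in rows if u == user}
    scores = popularity_scores(rows)
    # Sort by descending score; break ties by item id for a stable order.
    ranked = sorted(scores, key=lambda item: (-scores[item], item))
    return [item for item in ranked if item not in seen][:k]

# Hypothetical toy data: (user_id, music_id, rating)
toy_rows = [
    ("u1", "A", 5), ("u2", "A", 3),
    ("u1", "B", 1), ("u3", "B", 2),
    ("u2", "C", 10),
]
print(recommend_popular(toy_rows, "u3", 2))
```

Every user receives essentially the same ranking, minus the items they already have, which is why this kind of model is mainly useful as a baseline.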
item_sim_model = tc.item_similarity_recommender.create(train_set, 
                                                       'user_id', 'music_id', 
                                                       target = 'rating', 
                                                       similarity_type='cosine')
Preparing data set.
    Data has 1599753 observations with 76085 users and 10000 items.
    Data prepared in: 3.7942s
Training model from provided data.
Gathering per-item and per-user statistics.
+--------------------------------+------------+
| Elapsed Time (Item Statistics) | % Complete |
+--------------------------------+------------+
| 7.569ms                        | 2.5        |
| 90.88ms                        | 100        |
+--------------------------------+------------+
Setting up lookup tables.
Processing data in one pass using dense lookup tables.
+-------------------------------------+------------------+-----------------+
| Elapsed Time (Constructing Lookups) | Total % Complete | Items Processed |
+-------------------------------------+------------------+-----------------+
| 605.016ms                           | 0                | 0               |
| 4.01s                               | 100              | 10000           |
+-------------------------------------+------------------+-----------------+
Finalizing lookup tables.
Generating candidate set for working with new users.
Finished training in 5.31028s
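The `similarity_type='cosine'` setting compares items by the angle between their user-rating vectors. The following is a minimal plain-Python sketch of that idea, not the library's code. It assumes each item is represented as a sparse mapping from user to rating, and the helper names and toy data are hypothetical.

```python
import math
from collections import defaultdict

def item_vectors(rows):
    """Build {item: {user: rating}} from (user, item, rating) rows."""
    vectors = defaultdict(dict)
    for user, item, rating in rows:
        vectors[item][user] = rating
    return vectors

def cosine_similarity(vec_a, vec_b):
    """Cosine of the angle between two sparse user->rating vectors."""
    shared = set(vec_a) & set(vec_b)
    dot = sum(vec_a[u] * vec_b[u] for u in shared)
    norm_a = math.sqrt(sum(r * r for r in vec_a.values()))
    norm_b = math.sqrt(sum(r * r for r in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# Hypothetical toy data: A and B share the same listeners, C shares none.
toy_rows = [
    ("u1", "A", 1), ("u1", "B", 1),
    ("u2", "A", 1), ("u2", "B", 1),
    ("u3", "C", 2),
]
vectors = item_vectors(toy_rows)
sim_ab = cosine_similarity(vectors["A"], vectors["B"])
sim_ac = cosine_similarity(vectors["A"], vectors["C"])
print(sim_ab, sim_ac)
```

Items rated by the same users in the same proportions score close to 1, while items with no users in common score 0.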
factorization_machine_model = tc.recommender.factorization_recommender.create(train_set, 
                                                                              'user_id', 'music_id',
                                                                              target='rating')
Preparing data set.
    Data has 1599753 observations with 76085 users and 10000 items.
    Data prepared in: 4.32575s
Training factorization_recommender for recommendations.
+--------------------------------+--------------------------------------------------+----------+
| Parameter                      | Description                                      | Value    |
+--------------------------------+--------------------------------------------------+----------+
| num_factors                    | Factor Dimension                                 | 8        |
| regularization                 | L2 Regularization on Factors                     | 1e-08    |
| solver                         | Solver used for training                         | sgd      |
| linear_regularization          | L2 Regularization on Linear Coefficients         | 1e-10    |
| max_iterations                 | Maximum Number of Iterations                     | 50       |
+--------------------------------+--------------------------------------------------+----------+
  Optimizing model using SGD; tuning step size.
  Using 199969 / 1599753 points for tuning the step size.
+---------+-------------------+------------------------------------------+
| Attempt | Initial Step Size | Estimated Objective Value                |
+---------+-------------------+------------------------------------------+
| 0       | 25                | No Decrease (230.933 >= 43.5401)         |
| 1       | 6.25              | No Decrease (219.447 >= 43.5401)         |
| 2       | 1.5625            | No Decrease (191.895 >= 43.5401)         |
| 3       | 0.390625          | No Decrease (89.356 >= 43.5401)          |
| 4       | 0.0976562         | 16.0024                                  |
| 5       | 0.0488281         | 11.4371                                  |
| 6       | 0.0244141         | 24.5498                                  |
+---------+-------------------+------------------------------------------+
| Final   | 0.0488281         | 11.4371                                  |
+---------+-------------------+------------------------------------------+
Starting Optimization.
+---------+--------------+-------------------+-----------------------+-------------+
| Iter.   | Elapsed Time | Approx. Objective | Approx. Training RMSE | Step Size   |
+---------+--------------+-------------------+-----------------------+-------------+
| Initial | 387us        | 43.795            | 6.61778               |             |
+---------+--------------+-------------------+-----------------------+-------------+
| 1       | 681.41ms     | 43.5465           | 6.59858               | 0.0488281   |
| 2       | 1.25s        | 40.8911           | 6.39426               | 0.0290334   |
| 3       | 1.97s        | 37.9926           | 6.16345               | 0.0214205   |
| 4       | 2.74s        | 35.4229           | 5.95132               | 0.0172633   |
| 5       | 3.26s        | 32.7792           | 5.72487               | 0.014603    |
| 10      | 5.91s        | 24.5046           | 4.94956               | 0.008683    |
| 15      | 9.07s        | 20.0943           | 4.48185               | 0.00640622  |
| 20      | 11.83s       | 17.639            | 4.19895               | 0.00516295  |
| 25      | 14.27s       | 15.7055           | 3.96197               | 0.00436732  |
| 30      | 16.57s       | 14.3953           | 3.79299               | 0.00380916  |
| 35      | 18.71s       | 13.3639           | 3.65445               | 0.00339327  |
| 40      | 20.92s       | 12.5027           | 3.53463               | 0.00306991  |
| 45      | 23.89s       | 11.8108           | 3.43534               | 0.00281035  |
| 50      | 26.34s       | 9.85419           | 3.13763               | 0.00154408  |
+---------+--------------+-------------------+-----------------------+-------------+
Optimization Complete: Maximum number of passes through the data reached.
Computing final objective value and training RMSE.
       Final objective value: 8.8282
       Final training RMSE: 2.96963
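The training log above reports an SGD solver with 8 factors per user and item. As a rough illustration of what such a model learns, the sketch below fits `rating ≈ mean + user_bias + item_bias + dot(user_vec, item_vec)` by stochastic gradient descent in plain Python. It is a simplified stand-in under stated assumptions, not Turi Create's solver: it omits regularization and step-size tuning, and the function name, hyperparameter values other than the factor count, and toy data are hypothetical.

```python
import math
import random

def train_factorization(rows, num_factors=8, step_size=0.05, epochs=200, seed=0):
    """Fit a biased matrix factorization to (user, item, rating) rows by SGD."""
    rng = random.Random(seed)
    mean = sum(r for _, _, r in rows) / len(rows)
    users = sorted({u for u, _, _ in rows})
    items = sorted({i for _, i, _ in rows})
    user_bias = {u: 0.0 for u in users}
    item_bias = {i: 0.0 for i in items}
    # Small random starting factors so the dot product begins near zero.
    user_vec = {u: [rng.uniform(-0.1, 0.1) for _ in range(num_factors)] for u in users}
    item_vec = {i: [rng.uniform(-0.1, 0.1) for _ in range(num_factors)] for i in items}

    def predict(u, i):
        dot = sum(a * b for a, b in zip(user_vec[u], item_vec[i]))
        return mean + user_bias[u] + item_bias[i] + dot

    def rmse():
        return math.sqrt(sum((r - predict(u, i)) ** 2 for u, i, r in rows) / len(rows))

    initial_rmse = rmse()
    order = list(rows)
    for _ in range(epochs):
        rng.shuffle(order)
        for u, i, r in order:
            err = r - predict(u, i)
            user_bias[u] += step_size * err
            item_bias[i] += step_size * err
            for f in range(num_factors):
                pu, qi = user_vec[u][f], item_vec[i][f]
                # Gradient step on the squared error for both factor vectors.
                user_vec[u][f] += step_size * err * qi
                item_vec[i][f] += step_size * err * pu
    return predict, initial_rmse, rmse()

# Hypothetical toy data: (user_id, music_id, rating)
toy_rows = [
    ("u1", "A", 5.0), ("u1", "B", 1.0),
    ("u2", "A", 4.0), ("u2", "C", 2.0),
    ("u3", "B", 1.0), ("u3", "C", 3.0),
]
predict, initial_rmse, final_rmse = train_factorization(toy_rows)
print(initial_rmse, final_rmse)
```

As in the log above, the training RMSE falls as the passes over the data accumulate. A low training RMSE says nothing by itself about ranking quality on held-out data.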
len(train_set)
1599753
result = tc.recommender.util.compare_models(test_set, 
                                            [popularity_model, item_sim_model, factorization_machine_model],
                                            user_sample=.5, skip_set=train_set)
compare_models: using 34354 users to estimate model performance
PROGRESS: Evaluate model M0
Precision and recall summary statistics by cutoff
+--------+------------------------+------------------------+
| cutoff |     mean_precision     |      mean_recall       |
+--------+------------------------+------------------------+
|   1    | 0.00040752168597543237 | 7.081875226074056e-05  |
|   2    | 0.0004075216859754322  | 0.00011226035987334286 |
|   3    | 0.0003104927131241391  | 0.00012739048212345128 |
|   4    | 0.00034202712930080884 | 0.0002353030396709293  |
|   5    | 0.00046573906968620603 | 0.00046444872494981606 |
|   6    | 0.0004414818264733844  | 0.0005228618282305431  |
|   7    |  0.00043247199328005   | 0.0005736297245772213  |
|   8    | 0.0004075216859754325  | 0.0006087229198555468  |
|   9    | 0.0004528018733060353  | 0.0007391431080220185  |
|   10   | 0.00043080863945974164 | 0.0007735092654785608  |
+--------+------------------------+------------------------+
[10 rows x 3 columns]


Overall RMSE: 5.9110406201585715

Per User RMSE (best)
+-------------------------------+------+-------+
|            user_id            | rmse | count |
+-------------------------------+------+-------+
| cafbf96566378466408b7b3c76... | 0.0  |   1   |
+-------------------------------+------+-------+
[1 rows x 3 columns]


Per User RMSE (worst)
+-------------------------------+-------------------+-------+
|            user_id            |        rmse       | count |
+-------------------------------+-------------------+-------+
| 38767872c514c1b43bab5c7b21... | 341.2071760874715 |   2   |
+-------------------------------+-------------------+-------+
[1 rows x 3 columns]


Per Item RMSE (best)
+--------------------+---------------------+-------+
|      music_id      |         rmse        | count |
+--------------------+---------------------+-------+
| SOXDPFW12A81C2319B | 0.07352941176470584 |   5   |
+--------------------+---------------------+-------+
[1 rows x 3 columns]


Per Item RMSE (worst)
+--------------------+--------------------+-------+
|      music_id      |        rmse        | count |
+--------------------+--------------------+-------+
| SOPKTFQ12A67021600 | 124.75180499529567 |   9   |
+--------------------+--------------------+-------+
[1 rows x 3 columns]

PROGRESS: Evaluate model M1
Precision and recall summary statistics by cutoff
+--------+----------------------+----------------------+
| cutoff |    mean_precision    |     mean_recall      |
+--------+----------------------+----------------------+
|   1    | 0.12356639692612195  | 0.027394188372094497 |
|   2    | 0.10617395354252777  |  0.0450683741242837  |
|   3    | 0.09521453105897464  | 0.059418188595173727 |
|   4    | 0.08673662455609242  | 0.07079052813704197  |
|   5    | 0.08053792862548775  | 0.08078157252034639  |
|   6    | 0.07570685606722138  | 0.09033226689608059  |
|   7    | 0.07142441304402039  | 0.09836168110578075  |
|   8    | 0.06780869767712655  | 0.10631301678527204  |
|   9    | 0.06465687320900597  |  0.1134367927631828  |
|   10   | 0.061861791931070574 | 0.11986145197856829  |
+--------+----------------------+----------------------+
[10 rows x 3 columns]


Overall RMSE: 6.6935880472475

Per User RMSE (best)
+-------------------------------+--------------------+-------+
|            user_id            |        rmse        | count |
+-------------------------------+--------------------+-------+
| f015c8ec1487d172a76e8af6fd... | 0.0991658329963685 |   1   |
+-------------------------------+--------------------+-------+
[1 rows x 3 columns]


Per User RMSE (worst)
+-------------------------------+-------------------+-------+
|            user_id            |        rmse       | count |
+-------------------------------+-------------------+-------+
| 38767872c514c1b43bab5c7b21... | 346.8117700274172 |   2   |
+-------------------------------+-------------------+-------+
[1 rows x 3 columns]


Per Item RMSE (best)
+--------------------+--------------------+-------+
|      music_id      |        rmse        | count |
+--------------------+--------------------+-------+
| SOAVQRP12A8C13120B | 0.8293558937901314 |   4   |
+--------------------+--------------------+-------+
[1 rows x 3 columns]


Per Item RMSE (worst)
+--------------------+-------------------+-------+
|      music_id      |        rmse       | count |
+--------------------+-------------------+-------+
| SOPKTFQ12A67021600 | 128.8226138396836 |   9   |
+--------------------+-------------------+-------+
[1 rows x 3 columns]

PROGRESS: Evaluate model M2
Precision and recall summary statistics by cutoff
+--------+-----------------------+------------------------+
| cutoff |     mean_precision    |      mean_recall       |
+--------+-----------------------+------------------------+
|   1    |  0.000611282528963149 | 0.00015967112604116908 |
|   2    | 0.0005094021074692894 | 0.00020969792246105614 |
|   3    | 0.0005142535561118539 | 0.0002897975387832757  |
|   4    | 0.0005312336263608328 | 0.0004549968288297451  |
|   5    | 0.0005414216685102181 | 0.0005703895459784241  |
|   6    |  0.000548213696609806 | 0.0007142050704628652  |
|   7    | 0.0005738570680062202 | 0.0008314573503625282  |
|   8    | 0.0005930895965535321 | 0.0009602957110903129  |
|   9    | 0.0006015796316780176 | 0.0011215295194248998  |
|   10   | 0.0005967281830354553 | 0.0012125256843862468  |
+--------+-----------------------+------------------------+
[10 rows x 3 columns]


Overall RMSE: 7.641661679566184

Per User RMSE (best)
+-------------------------------+-----------------------+-------+
|            user_id            |          rmse         | count |
+-------------------------------+-----------------------+-------+
| 220f26b368277020c4e685351a... | 5.398923054400484e-06 |   1   |
+-------------------------------+-----------------------+-------+
[1 rows x 3 columns]


Per User RMSE (worst)
+-------------------------------+-------------------+-------+
|            user_id            |        rmse       | count |
+-------------------------------+-------------------+-------+
| 38767872c514c1b43bab5c7b21... | 360.3407134339816 |   2   |
+-------------------------------+-------------------+-------+
[1 rows x 3 columns]


Per Item RMSE (best)
+--------------------+----------------------+-------+
|      music_id      |         rmse         | count |
+--------------------+----------------------+-------+
| SOAESGK12A8C138488 | 0.028188730543956098 |   1   |
+--------------------+----------------------+-------+
[1 rows x 3 columns]


Per Item RMSE (worst)
+--------------------+--------------------+-------+
|      music_id      |        rmse        | count |
+--------------------+--------------------+-------+
| SOPKTFQ12A67021600 | 126.00562214029682 |   9   |
+--------------------+--------------------+-------+
[1 rows x 3 columns]
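The precision and recall tables above are reported per cutoff. The sketch below shows one common way to compute these metrics for a single user and then average them across users. It assumes precision at cutoff k is the number of hits divided by k and recall is the number of hits divided by the number of held-out items. The function names and toy data are hypothetical, and the library's exact handling of edge cases may differ.

```python
def precision_recall_at_k(recommended, relevant, k):
    """Precision and recall of the top-k recommended items for one user."""
    top_k = recommended[:k]
    hits = sum(1 for item in top_k if item in relevant)
    precision = hits / k
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

def mean_precision_recall(recs_by_user, truth_by_user, k):
    """Average per-user precision and recall at cutoff k."""
    pairs = [
        precision_recall_at_k(recs_by_user[user], truth_by_user[user], k)
        for user in truth_by_user
    ]
    mean_precision = sum(p for p, _ in pairs) / len(pairs)
    mean_recall = sum(r for _, r in pairs) / len(pairs)
    return mean_precision, mean_recall

# Hypothetical toy data: ranked recommendations and held-out items per user.
recs_by_user = {"u1": ["A", "B", "C", "D"], "u2": ["X", "Y"]}
truth_by_user = {"u1": {"B", "D", "E"}, "u2": {"Z"}}

print(precision_recall_at_k(recs_by_user["u1"], truth_by_user["u1"], 2))
print(mean_precision_recall(recs_by_user, truth_by_user, 2))
```

Read this way, the tables show the item-similarity model (M1) placing a held-out song in the top position for about 12% of sampled users, while the popularity (M0) and factorization (M2) models do so for well under 0.1%.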
K = 10
# Recommend the top-K items for the first 100 distinct users.
users = tc.SArray(sf['user_id'].unique().head(100))
recs = item_sim_model.recommend(users=users, k=K)
recs.head()
+----------------------------------------------+--------------------+--------------------+------+
|                   user_id                    |      music_id      |       score        | rank |
+----------------------------------------------+--------------------+--------------------+------+
| 279292bb36dbfc7f505e36ebf038c81eb1d1d63e ... | SOXUQNR12AF72A69D6 | 3.022422651449839  |  1   |
| 279292bb36dbfc7f505e36ebf038c81eb1d1d63e ... | SOUFAZA12AC3DFAB20 | 1.3368427753448486 |  2   |
| 279292bb36dbfc7f505e36ebf038c81eb1d1d63e ... | SOSFSTC12A8C141219 | 1.091982126235962  |  3   |
| 279292bb36dbfc7f505e36ebf038c81eb1d1d63e ... | SOVIWFP12A58A7D1BD | 1.045163869857788  |  4   |
| 279292bb36dbfc7f505e36ebf038c81eb1d1d63e ... | SOBMTQD12AB01833D0 | 1.0294516881306965 |  5   |
| 279292bb36dbfc7f505e36ebf038c81eb1d1d63e ... | SOCMNRG12AB0189D3F | 0.9756437937418619 |  6   |
| 279292bb36dbfc7f505e36ebf038c81eb1d1d63e ... | SOXOHUM12A67ADC826 | 0.9506873289744059 |  7   |
| 279292bb36dbfc7f505e36ebf038c81eb1d1d63e ... | SOWBFVW12A6D4F612B | 0.9092370669047037 |  8   |
| 279292bb36dbfc7f505e36ebf038c81eb1d1d63e ... | SOXFYTY127E9433E7D | 0.8977278073628744 |  9   |
| 279292bb36dbfc7f505e36ebf038c81eb1d1d63e ... | SOYBLYP12A58A79D32 | 0.8970928192138672 |  10  |
+----------------------------------------------+--------------------+--------------------+------+
[10 rows x 4 columns]