New LightFM model with different MAX_SAMPLED (updated 29th Aug)

I thought the low train AUC was due to underfitting, but after some experiments I found that this is not the case.

The low train AUC was caused by a mismatch in negative examples: the negatives used for training are not the same as the negatives used for evaluating the train AUC.
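To make the mismatch concrete, here is a minimal sketch in plain Python (not the LightFM implementation). It computes a per-user AUC as the fraction of (positive, negative) pairs ranked correctly. The scores, item names, and the `user_auc` helper are all hypothetical and only for illustration.

```python
def user_auc(scores, positives, negatives):
    """Fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pairs = [(p, n) for p in positives for n in negatives]
    correct = sum(
        1.0 if scores[p] > scores[n] else 0.5 if scores[p] == scores[n] else 0.0
        for p, n in pairs
    )
    return correct / len(pairs)


# Hypothetical model scores for one user over six items (illustrative only).
scores = {"a": 0.9, "b": 0.4, "c": 0.8, "d": 0.7, "e": 0.2, "f": 0.1}
positives = ["a", "b"]

# Assumption: training only ever saw a small sample of negatives,
# while evaluation compares against every non-interacted item.
sampled_negatives = ["e", "f"]
all_negatives = ["c", "d", "e", "f"]

auc_sampled = user_auc(scores, positives, sampled_negatives)
auc_full = user_auc(scores, positives, all_negatives)
print(auc_sampled, auc_full)
```

With these made-up scores, the same model looks perfect on the sampled negatives and noticeably worse on the full set, so the two numbers are not directly comparable.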

Here are the results:

max_sampled   5         9         10        11        20        25        30
train AUC     0.583485  0.582856  0.586314  0.598418  0.596159  0.608818  0.599637
test AUC      0.315679  0.310560  0.319777  0.313288  0.321055  0.287646  0.294142

So MAX_SAMPLED is not a major cause here.
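A quick way to check this is to look at how far each AUC moves across the whole sweep. The sketch below only re-uses the numbers reported above; the `spread` helper is a hypothetical name introduced for illustration.

```python
# Values copied from the results reported in this post.
max_sampled = [5, 9, 10, 11, 20, 25, 30]
train_auc = [0.583485, 0.582856, 0.586314, 0.598418, 0.596159, 0.608818, 0.599637]
test_auc = [0.315679, 0.310560, 0.319777, 0.313288, 0.321055, 0.287646, 0.294142]


def spread(values):
    """Difference between the best and worst value in the sweep."""
    return max(values) - min(values)


train_spread = spread(train_auc)
test_spread = spread(test_auc)
print(f"train AUC spread: {train_spread:.6f}")
print(f"test AUC spread:  {test_spread:.6f}")
```

Across the full range of max_sampled, the train AUC moves by about 0.026 and the test AUC by about 0.033.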

The AUC score may not be good enough, but perhaps we can turn to [email protected] instead?

Time: 2024-08-24 16:19:55
