Rank Hotness With Newton's Law of Cooling

Greg A. writes,

I enjoyed your post on how to not sort by average rating. I was wondering if you have any experience with sorting based on average votes over time to get a hotness rating.

Greg, I assume you mean "hotness" in the Reddit sense rather than the HotOrNot sense.

Fig. 1: what Greg is talking about

Fig. 2: not what Greg is talking about

Neither concept of hotness is defined with any rigor, but I imagine a "what's hot" list should show what items, discussions, or products have gotten a lot of recent activity. I'm not going to analyze all the hotness algorithms out there, just my favorite.

I recommend a technique called exponential decay. Exponential decay has three components:

  1. Each new item has an initial "temperature" reflecting its hotness
  2. The temperature is increased by a fixed amount every time someone gives the item a thumbs-up
  3. The temperature gradually drops down over time

A hot list then sorts items by temperature. As we'll see, exponential decay is great because:

  1. It's easy to understand
  2. It goes easy on the database
  3. It can work without modification as your site gets more popular

First let's talk more about the mechanics of exponential decay. Bumping up an item's temperature in response to new activity is the easy part, but how do you figure out how much it has cooled? You first need to decide how many hours it should take for the temperature T to fall by half. The cooling rate is inversely proportional to that number of hours: for the temperature to halve every H hours, use a cooling rate of ln(2)/H, or about 0.693/H. With that in hand, you calculate:

(Current T) = (Last recorded T) × exp( -(Cooling rate) × (Hours since last recorded T) )

(The exp function raises Euler's number e = 2.71828... to the given power.)
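The formula above can be sketched in a few lines of Python. This is only an illustration: the function names, the 24-hour half-life, and the starting temperature of 100 are assumptions made for the example, not values from the post.

```python
import math


def cooling_rate_from_half_life(half_life_hours: float) -> float:
    """Cooling rate (per hour) that makes the temperature halve every half_life_hours."""
    return math.log(2) / half_life_hours


def current_temperature(last_t: float, cooling_rate: float, hours_elapsed: float) -> float:
    """(Current T) = (Last recorded T) * exp(-(Cooling rate) * (Hours since last recorded T))."""
    return last_t * math.exp(-cooling_rate * hours_elapsed)


# Assumed example values: a 24-hour half-life and a last reading of 100 degrees.
rate = cooling_rate_from_half_life(24.0)
print(current_temperature(100.0, rate, 24.0))  # about 50
print(current_temperature(100.0, rate, 48.0))  # about 25
```

Expressing the rate through a half-life is a convenience; any positive rate works in the formula, and a larger rate simply means faster cooling.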

The nice thing about this method is that you only need to write to the database when you're incrementing the temperature. The rest of the time, for example when you're sorting a list of items to display, you can calculate every item's current temperature just based on its last reading, no matter how many hours ago. That is, you don't need to constantly update the database, and you never need to process the rating history!
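Here is a rough sketch of that read path: each stored record holds only the last reading and its timestamp, and the hot list is computed at display time without writing anything back. The `StoredItem` record, the field names, and the 24-hour half-life are hypothetical, chosen only for the example.

```python
import math
from dataclasses import dataclass
from datetime import datetime, timedelta

# Assumed for illustration: the temperature halves every 24 hours.
COOLING_RATE = math.log(2) / 24.0


@dataclass
class StoredItem:
    """What the database row would hold: the last reading and when it was taken."""
    name: str
    last_temperature: float
    last_recorded_at: datetime


def temperature_at(item: StoredItem, now: datetime) -> float:
    """Cool the last recorded temperature down to the present moment."""
    hours = (now - item.last_recorded_at).total_seconds() / 3600.0
    return item.last_temperature * math.exp(-COOLING_RATE * hours)


def hot_list(items: list[StoredItem], now: datetime) -> list[StoredItem]:
    """Sort hottest first, using only stored readings (no writes, no vote history)."""
    return sorted(items, key=lambda item: temperature_at(item, now), reverse=True)


now = datetime(2024, 1, 3, 12, 0)
items = [
    StoredItem("old favorite", 100.0, now - timedelta(hours=72)),    # cools to 12.5
    StoredItem("fresh post", 20.0, now - timedelta(hours=1)),        # cools to about 19.4
    StoredItem("yesterday's hit", 30.0, now - timedelta(hours=24)),  # cools to 15
]
print([item.name for item in hot_list(items, now)])
# ['fresh post', "yesterday's hit", 'old favorite']
```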

If you play with the decay parameters a bit, you should be able to show the right mix of new and hot items at the top of your list. With a smaller initial temperature, you'll see fewer brand-new items; with a smaller decay rate, popular items will tend to linger longer.

Graphically, here's how the temperature will drop over time. The shape of the curve is always the same for exponential decay; the parameters only determine the scale of the axes.

The choice of parameters is ultimately a judgment call (or not; ask me sometime about A/B testing). As your site gets more popular, you may want to tweak the parameters, but the same parameters should work fine over a large range of site activity (i.e., no matter whether the top item has a temperature of 10 degrees or 10,000 degrees). Because the decay is exponential rather than quadratic or linear, even extremely popular items won't get "stuck" at the top for too long after their popularity has peaked.
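To see why very popular items don't get stuck, it may help to work out how long an item that has stopped receiving votes takes to cool from a peak temperature to some lower level. The symbols k (cooling rate), T_1 (peak), T_2 (target level), and t (hours), along with the 24-hour half-life, are assumptions introduced for this sketch.

```latex
\begin{align*}
T_1 e^{-k t} &= T_2 \\
t &= \frac{1}{k} \ln\frac{T_1}{T_2} \\
\text{with } k = \frac{\ln 2}{24}: \quad t &= 24 \log_2\frac{T_1}{T_2} \text{ hours}
\end{align*}
```

With these assumed numbers, an item that peaked at 10,000 degrees cools to 10 degrees in about 24 × log2(1000) ≈ 239 hours, while one that peaked at 1,000 degrees takes about 159 hours. Ten times the popularity buys only about 80 extra hours, because the cooling time grows with the logarithm of the peak. Under linear decay it would grow in proportion to the peak itself.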

So in a nutshell, you will:

  1. Pick an initial temperature for new items
  2. Pick a cooling rate
  3. Pick a temperature increment
  4. When there is new activity on an item, calculate the current temperature, then increment and record it along with the current time
  5. Sort items based on the current temperature using the formula above
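The steps above might fit together roughly as follows. This is a minimal in-memory sketch, not a definitive implementation: the `HotItem` class, its method names, and all three parameter values are assumptions, and a real site would keep the temperature and timestamp in its database.

```python
import math
from datetime import datetime, timedelta

# Steps 1-3: all three values are assumed for illustration and should be tuned.
INITIAL_TEMPERATURE = 10.0
COOLING_RATE = math.log(2) / 24.0  # temperature halves every 24 hours
INCREMENT = 1.0


class HotItem:
    def __init__(self, name: str, created_at: datetime) -> None:
        self.name = name
        self.temperature = INITIAL_TEMPERATURE
        self.recorded_at = created_at

    def temperature_at(self, now: datetime) -> float:
        """Current temperature computed from the last recorded reading."""
        hours = (now - self.recorded_at).total_seconds() / 3600.0
        return self.temperature * math.exp(-COOLING_RATE * hours)

    def record_activity(self, now: datetime) -> None:
        """Step 4: cool to the present, add the increment, store value and time."""
        self.temperature = self.temperature_at(now) + INCREMENT
        self.recorded_at = now


start = datetime(2024, 1, 1)
a = HotItem("a", start)
b = HotItem("b", start)

later = start + timedelta(hours=24)
a.record_activity(later)  # a has cooled from 10 to 5, then gains 1

# Step 5: sort by current temperature, hottest first.
ranked = sorted([a, b], key=lambda item: item.temperature_at(later), reverse=True)
print([item.name for item in ranked])  # ['a', 'b']
```

Note that `record_activity` is the only place that writes state, which matches the point that the database is touched only when the temperature is incremented.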

And voilà, you've got yourself a "hot" list. Another good use of this algorithm would be listing active discussion threads in an online forum (where each reply increases the temperature).

One last note about exponential decay: besides being easy to compute, it's also the law that governs how any hot item cools down, from a hot brick to a hot tamale. It's called Newton's Law of Cooling.

REFERENCES

Exponential decay (Wikipedia)

Newton's Law of Cooling (Wikipedia)
