Hypothetical situation, lets say you've got a list of movies that you want to rank in a website or a report or something, and you have user-submitted ratings for them, but some are more popular than others, so your data looks like this:

``````data ratings;
input name \$ rating;
datalines;
Lincoln 9
Lincoln 8
Lincoln 9
Amour 9
Argo 5
Argo 10
;
run;
``````
Obs name rating
1 Lincoln 9
2 Lincoln 8
3 Lincoln 9
4 Amour 9
5 Argo 5
6 Argo 10

The easiest thing to do would be to calculate an average rating for each movie like this:

``````proc sql;
select distinct name, avg(rating) as average
from ratings
group by name
order by average desc;
run;
``````
name average
Amour 9.00
Lincoln 8.67
Argo 7.50

But hey! That's not cool. It looks like Amour wins, because its average rating is 9. Maybe we want to consider Lincoln as better because 3 people think it's very high. A good way to deal with this is by instead taking a Bayesian Average.

This means we're going to add in some "dummy" votes for each movie, who give each movie the average rating a movie gets. How many (C) is a judgement call, the more we add, the harder we make it for an obscure movie to be near the top. Likewise, if a movie's first rating is low, it keeps it from suddenly dropping to the bottom of the list. If we expect thousands of ratings for each movie, a C=1000 might be appropriate. In this example, I use a small C of 10.

``````proc sql;
select avg(rating) into :average
from ratings;
select distinct
name,
(sum(rating) + &average * 10) / (count(*) + 10) as b_average
from ratings
group by name
order by b_average;
quit;
``````
name b_average
Lincoln 8.41
Amour 8.39
Argo 8.19

And look! Lincoln is back on top, since its bayesian average more closely reflects a product of the number of ratings it has and what those ratings are.