In this example, I use some Stata code and synthetic data to show the implications of the proposed *multiplier-based* scoring system for Yahoo! NCAA brackets. I refer to this multiplier-based alternative as a risk-taking-oriented scoring method, since it encourages upset-picking in certain rounds of play. Let's take a look ...

I first create a fake dataset of 32 match-ups (64 teams, one observation per pair) contrasting the seeds of the two teams facing off, and then I extend these match-ups across the later simulated rounds of tourney play. For each of these 32 first-round games, I've populated the seed of `team1` and `team2`.

Click or toggle the button to see the code for generating this synthetic dataset. The full .do file for this sandbox is linked at the bottom of this post.

    . clear
    . set obs 32 //64 teams, 1 obs per dyad
    number of observations (_N) was 0, now 32
    . g division = mod(_n, 4)+1 //create 4 divisions
    . bys division: g team1 = 17-_n
    . bys division: g team2 = _n
    . ta division //here are the 4 divisions
       division |      Freq.     Percent        Cum.
    ------------+-----------------------------------
              1 |          8       25.00       25.00
              2 |          8       25.00       50.00
              3 |          8       25.00       75.00
              4 |          8       25.00      100.00
    ------------+-----------------------------------
          Total |         32      100.00
    . lab def div 1 "Division1" 2 "Division2" ///
    >     3 "Division3" 4 "Division4", modify
    . lab val division div
    . **mix it up
    . tempvar x
    . g `x' = int(runiform()*4)
    . bys division (`x'): replace team1 = team2[_N] if _n==1
    (4 real changes made)
    . bys division (`x'): replace team2 = team1[_N] if _n==1
    (4 real changes made)

After running the code above, here's what our (randomly created) group of match-ups looks like. The figure below shows the match-ups in Division 1; that is, it compares the seed/rank of the two teams in each game.

    . pairplot team1 team2 if division==1, hor ytitle(Game) lwidth(vthick) lcolor(ebblue%60) xtitle(Team1 and Team2 rank comparison) ms(O) y2(ms(O) msize(large) ) msize(large) title(Division 1)

Next, let's (randomly) pick winners and assign points for the first round (**Round of 64**). Under this setup we always pick team 1 to win (team 1 is our selected team, so any multiplier is relative to our team 1 selection's seed differential); we then randomly assign whether our team won and give points and a multiplier accordingly.

In all of the output and figures that follow, labels that start with `points` (like **points64**) refer to standard scoring, while labels that start with `mult` (like **mult64**) refer to riskier scoring using the *multiplier*. The numeric suffix of each label refers to the round of play (*viz*. 64, 32, 16, 8, 4, 2).

The multiplier in the code below (used to produce the `mult64` variable) is applied using the formula:

`mult64 = points64 + M * (team2 - team1)`

where a winner is picked randomly (from a binomial distribution where the probability of a win is .505), this winner earns the **round points** (2 points for the Round of 64), and those points are added to the multiplier (**M** = .5) times the difference in match-up seeds. In the code, this bonus applies only when `team1 < team2`; otherwise `mult64` simply equals `points64`. (Note: I used .5 as the first-round multiplier for this sandbox example only; in the scoring on Yahoo I used a multiplier of 1 for the Round of 64.)
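As a quick arithmetic check of the multiplier rule, here is a minimal sketch for one hypothetical winning pick where team 1 is seeded 6 and team 2 is seeded 11. The local macro names (`roundpoints`, `M`, `seed1`, `seed2`, `risky`) are my own for illustration and are not part of the .do file.

```stata
* Illustration only: hypothetical seeds and local names, not part of the .do file
* standard points for a Round of 64 win
local roundpoints = 2
* sandbox multiplier for the Round of 64
local M = .5
* seed of our pick (team 1) and of the opponent (team 2)
local seed1 = 6
local seed2 = 11

* risky score = round points + M * (team 2 seed - team 1 seed)
local risky = `roundpoints' + `M'*(`seed2' - `seed1')
display "standard score: `roundpoints'   risky score: `risky'"
```

With these values the risky score is 2 + .5 * 5 = 4.5, versus 2 points under standard scoring.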

    . g points64 = 2 if rbinomial(1, .505)
    (17 missing values generated)
    . g mult64 = points64+ (.5*(team2-team1)) if !mi(points64) & team1<team2
    (30 missing values generated)
    . replace mult64 = points64 if !mi(points64) & mi(mult64)
    (13 real changes made)
    . edrop, temp

The table and figure below show the results of early-round scoring with a small multiplier (.5). The multiplier creates only a small deviation from standard scoring here, but you'll see in the following sections that the effect is magnified in later rounds, where the multipliers are higher (because it's harder, and riskier, to choose an upset in later rounds).

    . **! THIS LISTS WINNING GAMES ONLY !**
    . l if !mi(points64) , sepby(division)
         +-----------------------------------------------+
         |  division   team1   team2   points64   mult64 |
         |-----------------------------------------------|
      1. | Division1       6      11          2      4.5 |
      2. | Division1      10       7          2        2 |
      3. | Division1      14       3          2        2 |
      5. | Division1       9       8          2        2 |
      7. | Division1      15       2          2        2 |
      8. | Division1      11       6          2        2 |
         |-----------------------------------------------|
      9. | Division2       5      12          2      5.5 |
     10. | Division2      10       7          2        2 |
     12. | Division2      11       6          2        2 |
     13. | Division2      15       2          2        2 |
         |-----------------------------------------------|
     18. | Division3      13       4          2        2 |
     21. | Division3      14       3          2        2 |
         |-----------------------------------------------|
     28. | Division4      10       7          2        2 |
     30. | Division4       9       8          2        2 |
     32. | Division4      13       4          2        2 |
         +-----------------------------------------------+

    . pairplot points64 mult64 , hor ytitle(Winning games only) lwidth(vthick) lcolor(ebblue%60) xtitle(Standard and Risky score comparison) ms(O) y2(ms(O) msize(large) ) msize(large) title(Division 1) title(Standard versus risky scoring by division, color(ebblue) size(medsmall))

The next code block (which you can toggle below) continues the calculation of winners for each of the remaining rounds of the tournament, assigning standard and multiplier/risky scores for each round won.

    . **more rounds
    . *-- scores and multipliers per round --*
    . local scores "3 5 8 13 21"
    . local multipliers "2 2 3 3 1"
    . *^^^ feel free to change / play with these
    .
    . loc i = 1
    . foreach j in 32 16 8 4 {
    .     **mix it up
    .     tempvar x
    .     g `x' = int(runiform()*4)
    .     bys division (`x'): replace team1 = team2[_N] if _n==1
    .     bys division (`x'): replace team2 = team1[_N] if _n==1
    .     **
    .     g points`j' = `:word `i' of `scores'' if rbinomial(1, .505)
    .     g mult`j' = points`j'+ (`:word `i' of `multipliers''*(team2-team1)) ///
    >         if !mi(points`j') & team1<team2
    .     replace mult`j' = points`j' if !mi(points`j') & mi(mult`j')
    .     loc `++i' //iterate
    .     keep if !mi(points`j') //drop non-winners
    .     di _N
    . }
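To get a feel for how the later-round multipliers magnify a pick, here is a small sketch that prints the standard and risky scores, round by round, for a single winning pick with a hypothetical seed gap of 5. It reuses the round points and multipliers from this sandbox (the .5 first-round multiplier plus the first four entries of `scores` and `multipliers`). The seed gap and the helper locals `rounds`, `gap`, `r`, `s`, `m`, and `risky` are assumptions for illustration only.

```stata
* Illustration only: the seed gap and helper locals are hypothetical
local rounds "64 32 16 8 4"
* round points used in this sandbox (2 for the Round of 64, then the loop's scores)
local scores "2 3 5 8 13"
* .5 for the Round of 64, then the multipliers the loop actually uses
local multipliers ".5 2 2 3 3"
* assumed team2 - team1 seed difference for a winning pick
local gap = 5

forvalues i = 1/5 {
    * pull the i-th round label, round points, and multiplier
    local r : word `i' of `rounds'
    local s : word `i' of `scores'
    local m : word `i' of `multipliers'
    * risky score = round points + multiplier * seed gap
    local risky = `s' + `m'*`gap'
    display "Round `r': standard = `s', risky = `risky'"
}
```

For the Round of 64 this gives 4.5 versus 2 points; by round 4 the same seed gap is worth 13 + 3 * 5 = 28 versus 13.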

This final figure shows the winner (in terms of risky scoring, not standard scoring) versus the non-winner group(s). The important part is to notice how, across the rounds, the risky-scoring winner had a few upset picks that elevated them above the other players who would have won via standard scoring. It's still possible for standard scoring to beat risky scoring if the upsets are few and come in earlier rounds, but it's less likely. In 1000 simulations of these scenarios, standard scoring won 12.5% of the time (essentially when there were few upsets or upset picks in later rounds). However, we can expect this percentage to be higher under regular conditions: *inter alia*, non-random guessing, a non-50/50 win rate (if I had the time, it would make sense to weight the random win probability above by some skewness metric of the seed differential), and actual risk-taking or risk aversion in later rounds (the computer had more fortitude to select lower seeds in later rounds!).

    . egen simplescoring = rowtotal(points*)
    . egen riskyscoring = rowtotal(mult*)
    . cap lab var simplescoring `"{bf:Simple} Scoring Total"'
    . cap lab var riskyscoring `"{bf:Risky} Scoring Total"'
    . egen winnersimplescoring = rank(simple) , u
    . egen winnerriskyscoring = rank(risky) , u
    . for X in var winnersimplescoring winnerriskyscoring : replace X = 4-X
    -> replace winnersimplescoring = 4-winnersimplescoring
    (2 real changes made)
    -> replace winnerriskyscoring = 4-winnerriskyscoring
    (2 real changes made)
    . sort winner*
    . l winnersimple simple winnerrisky risky
         +-------------------------------------------+
         | w~simp~g   simple~g   w~risk~g   riskys~g |
         |-------------------------------------------|
      1. |        1         31          2         31 |
      2. |        2         29          3         29 |
      3. |        3         29          1         65 |
         +-------------------------------------------+
    . qui su winnerrisky, d
    . lab def win `r(min)' "Winner(Risky scoring)" `=`r(min)'+1' "Not winner" `=`r(min)'+2' "Not winner" `=`r(min)'+3' "Not winner" `=`r(min)'+4' "Not winner" , modify
    . lab val winnerrisky win
    . statplot *64 *32 *16 *8 points4 mult4 simplescoring riskyscoring , xpose over(winnerrisky ,gap(20)) bargap(40) ytitle(#Points) blabel(bar, format(%3.0f) color(gs8) size(vsmall)) title(`"Total # points for winner and other players in last round, by scoring regime"', size(small ) color(ebblue))

Note that the winner under the risky scoring method benefited from upset-picking in earlier rounds that allowed them to win; under the simple scoring method, they would have lost.
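The seed-weighted win probability mentioned earlier (in place of the flat .505) could be sketched along these lines. This is only one possible weighting and not something the .do file does: the logistic form, the steepness value `k`, the toy seeds, and the variable names `pwin` and `win` are all assumptions of mine, as is the convention that the lower seed number is the favorite.

```stata
* Illustration only: a hypothetical seed-weighted win probability
clear
set seed 2718
set obs 8
* toy match-ups: seeds 1-8 (team 1) against seeds 16-9 (team 2)
generate team1 = _n
generate team2 = 17 - _n

* assumed steepness of the logistic curve (not from the post)
local k = 0.15

* P(team 1 wins) rises as team 1's seed number falls further below team 2's
generate pwin = invlogit(`k' * (team2 - team1))

* draw one Bernoulli outcome per game using the game-specific probability
generate win = rbinomial(1, pwin)
list team1 team2 pwin win
```

Swapping `rbinomial(1, .505)` for a draw like this would make favorites win more often, which is the kind of non-50/50 condition under which I'd expect standard scoring to hold up better.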

Download a copy of the .do file code to produce this example by clicking HERE