Who are the best NCAA Division I women’s basketball seniors?

Archetypal analysis (Eugster 2012) is a statistical technique for analyzing athletes’ performance. It operates as follows:

- The user prepares a dataset of metrics, one row per athlete. In its simplest form, each metric is a season total of a box score statistic.
- The `archetypes::stepArchetypes` function then steps through a user-specified sequence of archetype counts. At each step, the analyzer minimizes an error criterion, the residual sum of squares (RSS).
- When the steps have finished, the user examines a screeplot (Venables and Ripley 2013) of the errors and selects the number of archetypes to use. For basketball this is typically in the range of three to six.
- The result is two tables: `archetype_parameters`, the metric values for each archetype, and `player_alphas`, the athletes’ ratings in terms of the archetypes.
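To make the error criterion concrete, here is a minimal Python sketch (not the `archetypes` package, which is R) of how the two result tables relate to the input data. It assumes the usual definition: each athlete's row is approximated by an alpha-weighted mix of the archetype rows, and the RSS is the sum of squared differences. The package's exact scaling of the error may differ, and all numbers below are made-up toy values, not the NCAA data.

```python
def reconstruct(alphas, archetypes):
    """Approximate each athlete's metrics as a weighted mix of archetype rows."""
    n_metrics = len(archetypes[0])
    return [
        [sum(a * z[j] for a, z in zip(alpha_row, archetypes)) for j in range(n_metrics)]
        for alpha_row in alphas
    ]


def rss(data, alphas, archetypes):
    """Sum of squared residuals between the data and its reconstruction."""
    approx = reconstruct(alphas, archetypes)
    return sum(
        (x - y) ** 2
        for row, approx_row in zip(data, approx)
        for x, y in zip(row, approx_row)
    )


# Hypothetical toy values: two metrics (rebounds, three-pointers), three archetypes.
archetype_parameters = [[300.0, 5.0], [100.0, 90.0], [20.0, 2.0]]
player_alphas = [[1.0, 0.0, 0.0], [0.5, 0.5, 0.0], [0.0, 0.0, 1.0]]
season_totals = [[300.0, 5.0], [210.0, 45.0], [20.0, 2.0]]

error = rss(season_totals, player_alphas, archetype_parameters)
print(error)
```

Only the second toy player contributes to the error here, because the other two sit exactly on an archetype.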

Although the process normally involves a search over the number of archetypes, a three-archetype model has advantages in interpretation and visualization.

Interpretation: With three archetypes, you get two high-value archetypes and a “bench” archetype. The bench archetype corresponds to lightly-used players. For any player, the ratings for each archetype sum to 1.0. Because of this, the “bench” rating can be used as an overall measure of skill. The best players will have a bench rating of zero.

In the NBA, the two high-value archetypes are usually rim protectors (Shea 2014), like Andre Drummond and Rudy Gobert, and floor spacers, like James Harden and Damian Lillard. All-stars like Giannis Antetokounmpo and LeBron James generally have a mix of the two with a bench rating of zero. We will see a similar pattern in the NCAA women’s data.

Visualization: Since the three archetype scores sum to 1.0, we can do a ternary plot. See Archetypal Ballers and Ternary Plots (Borasky 2017) for an overview.
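As a rough illustration of why the sum-to-one property makes a ternary plot possible, the Python sketch below maps three ratings onto a point inside an equilateral triangle. The corner assignment (which archetype sits at which corner) is an arbitrary assumption for this example, and the function name is hypothetical; plotting libraries do this conversion for you.

```python
import math


def ternary_to_xy(rim_protector, floor_spacer, bench):
    """Map three ratings that sum to 1 onto an equilateral triangle.

    Assumed corners: rim protector at (0, 0), floor spacer at (1, 0),
    bench at (0.5, sqrt(3) / 2).
    """
    total = rim_protector + floor_spacer + bench
    if not math.isclose(total, 1.0, abs_tol=1e-6):
        raise ValueError(f"ratings must sum to 1.0, got {total}")
    x = floor_spacer + 0.5 * bench
    y = (math.sqrt(3) / 2) * bench
    return x, y


# A player who is an even mix of the two high-value archetypes with no
# bench component lands at the midpoint of the bottom edge.
print(ternary_to_xy(0.5, 0.5, 0.0))
```

With this layout, players with a bench rating of zero lie on the bottom edge of the triangle, and their position along that edge shows the rim protector versus floor spacer mix.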

We use the `dfstools` package (Borasky 2019) to do the calculations.

We create the input data as follows:

- Read the raw data. These are the season totals for NCAA Division I women’s basketball players through the first two games of the Final Four.
- Compute the height in feet and total minutes by parsing text fields.
- Select the relevant variables. We consider only completed actions - games, minutes, two-point and three-point field goals made, free throws made, rebounds, etc. - as relevant. We discard per-minute and per-game rates and per-attempt percentages. For this report, we consider only seniors.
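The preparation steps above can be sketched in Python as follows. This is not the `dfstools` code. The text formats (`"6-2"` for height, `"1034:30"` for minutes), the `class` value `"SR"`, and every column name other than `total_rebounds` and `three_point_field_goals` are assumptions made for illustration, and the sample rows are invented.

```python
# Counting stats to keep; rates and percentages are deliberately left out.
COUNTING_STATS = [
    "games",
    "two_point_field_goals",
    "three_point_field_goals",
    "free_throws",
    "total_rebounds",
]


def height_to_feet(text):
    """Parse an assumed 'feet-inches' string such as '6-3' into feet."""
    feet, inches = text.split("-")
    return int(feet) + int(inches) / 12


def minutes_played(text):
    """Parse an assumed 'minutes:seconds' string into total minutes."""
    minutes, seconds = text.split(":")
    return int(minutes) + int(seconds) / 60


def prepare_seniors(raw_rows):
    """Keep seniors only, parse text fields, and drop non-counting columns."""
    prepared = []
    for row in raw_rows:
        if row["class"] != "SR":
            continue
        clean = {
            "player": row["player"],
            "height_ft": height_to_feet(row["height"]),
            "minutes": minutes_played(row["minutes"]),
        }
        for stat in COUNTING_STATS:
            clean[stat] = row[stat]
        prepared.append(clean)
    return prepared


# Invented rows for illustration only.
raw = [
    {"player": "Player A", "class": "SR", "height": "6-3", "minutes": "1000:30",
     "games": 33, "two_point_field_goals": 250, "three_point_field_goals": 4,
     "free_throws": 120, "total_rebounds": 350, "field_goal_pct": 0.61},
    {"player": "Player B", "class": "JR", "height": "5-9", "minutes": "900:00",
     "games": 30, "two_point_field_goals": 90, "three_point_field_goals": 80,
     "free_throws": 70, "total_rebounds": 120, "field_goal_pct": 0.44},
]
seniors = prepare_seniors(raw)
print(seniors)
```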

You can see the distinction between rim protectors and floor spacers in the `total_rebounds` and `three_point_field_goals` rows. The archetypal rim protector would collect 374.6 rebounds in a season while making only 2.5 three-pointers. Conversely, an archetypal floor spacer would make 93.5 three-point field goals but collect only 145.3 rebounds. Note also that the rim protectors make more *two-point* field goals than the floor spacers.

`player_alphas` gives the player name, team, position, and height for each player, followed by the archetype ratings. Because we are looking at prospects for the WNBA draft, which has 12 teams and three rounds, we show only the 36 players with the lowest “bench” ratings.
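The cutoff of 36 is just teams times rounds. A minimal Python sketch of that selection, assuming each row of the ratings table is a dictionary with a `bench` key (a hypothetical layout, with invented ratings):

```python
def draft_pool(players, teams=12, rounds=3):
    """Return the teams * rounds players with the lowest bench rating."""
    pool_size = teams * rounds
    return sorted(players, key=lambda player: player["bench"])[:pool_size]


# Forty invented players with bench ratings 0.00, 0.01, ..., 0.39.
players = [{"player": f"Player {i}", "bench": i / 100} for i in range(40)]
pool = draft_pool(players)
print(len(pool), pool[0]["player"], pool[-1]["player"])
```

Sorting in ascending order puts the strongest prospects first, since a lower bench rating means more of the player's profile comes from the two high-value archetypes.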

This table is sortable - you can click on a column and sort it in ascending or descending order. For example, you can see that the three best rim protectors are Megan Gustafson, Kristine Anigwe and Teaira McCowan. The three best floor spacers are Presley Hudson, Cierra Dillard and Savannah Smith.

It’s also filterable - are you looking for a guard? Enter “G” in the filter box above the `position` column.

I noted above that the three-archetype model allows visualizing players on a ternary plot. Further, let’s assume we want versatile players - guards who can rebound and forwards who can shoot threes. So let’s look at the top five rim-protecting guards and the top five floor-spacing forwards.
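One way to pick those two groups is to filter by position and then sort by the "opposite" archetype rating. The Python sketch below assumes position codes such as `"G"`, `"F"` or `"G/F"` and rating columns named `rim_protector` and `floor_spacer`. Both are hypothetical names for this example, and the rows are invented.

```python
def top_by(players, position, rating, n=5):
    """Top n players whose position code contains `position`, by `rating`."""
    matches = [p for p in players if position in p["position"]]
    return sorted(matches, key=lambda p: p[rating], reverse=True)[:n]


# Invented ratings for illustration only.
players = [
    {"player": "Player A", "position": "G", "rim_protector": 0.40, "floor_spacer": 0.60},
    {"player": "Player B", "position": "G", "rim_protector": 0.10, "floor_spacer": 0.85},
    {"player": "Player C", "position": "F", "rim_protector": 0.70, "floor_spacer": 0.30},
    {"player": "Player D", "position": "F", "rim_protector": 0.45, "floor_spacer": 0.55},
    {"player": "Player E", "position": "C", "rim_protector": 0.95, "floor_spacer": 0.00},
]

rebounding_guards = top_by(players, "G", "rim_protector")
shooting_forwards = top_by(players, "F", "floor_spacer")
print([p["player"] for p in rebounding_guards])
print([p["player"] for p in shooting_forwards])
```

The substring match mirrors the filter box on the table: a combined code like `"G/F"` would show up in both groups.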

Borasky, M. Edward (Ed). 2017. “Archetypal Ballers and Ternary Plots.” https://rpubs.com/znmeb/pdxdataviz20170209.

Borasky, M. Edward (Ed). 2019. *dfstools: Tidy Data Analytics from Sports Data APIs*. https://znmeb.github.io/dfstools.

Eugster, Manuel J. A. 2012. “Performance Profiles Based on Archetypal Athletes.” *International Journal of Performance Analysis in Sport* 12 (1): 166–87. http://epub.ub.uni-muenchen.de/12336/.

Shea, S. M. 2014. *Basketball Analytics: Spatial Tracking*. CreateSpace Independent Publishing Platform.

Venables, W. N., and B. D. Ripley. 2013. *Modern Applied Statistics with S, Chapter 11*. Statistics and Computing. Springer New York.

Text and figures are licensed under Creative Commons Attribution CC BY-SA 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".

For attribution, please cite this work as

Borasky (2019, April 9). Borasky Research Journal: The 2019 WNBA Draft - An Archetypal Analysis. Retrieved from https://www.znmeb.mobi/posts/2019-04-09-the-2019-wnba-draft-an-archetypal-analysis/

BibTeX citation

@misc{borasky2019the, author = {Borasky, M. Edward (Ed)}, title = {Borasky Research Journal: The 2019 WNBA Draft - An Archetypal Analysis}, url = {https://www.znmeb.mobi/posts/2019-04-09-the-2019-wnba-draft-an-archetypal-analysis/}, year = {2019} }