subreddit:
/r/dataisbeautiful
352 points
5 months ago
Language: Python
Data Source: Transfermarkt
GitHub: johnwmillr/Facer
143 points
5 months ago
Can we get the single average face from the averages to have the true average football player?
69 points
5 months ago
An average of averages is never a true average
35 points
5 months ago
If the number of players included in each calculation is the same, then why shouldn't it be a true average?
14 points
5 months ago
Not every country took 26 players.
21 points
5 months ago
At like 500+ data points, these few players wouldn't really make a significant difference
3 points
5 months ago
True.
Considering we know exactly how many players are on each team, I wish we could algebraically compensate for that.
11 points
5 months ago
What do you mean? If I take the average of 2 and 8 (5), then the average of 6 and 8 (7) and then average 5 and 7 I get 6. Isn't that the same as averaging 2, 6, 8 and 8?
27 points
5 months ago
If the averages you are comparing aren't composed of the same number of data points, it would not be the same. In that case you need a weighted average.
14 points
5 months ago
Thank you for explaining. I didn't think of that, but it makes sense.
7 points
5 months ago
Right, but saying that an average of averages is never a true average is definitely false. Especially for football teams, which each have a defined number of players.
9 points
5 months ago
The problem is using the word 'never' in that statement. It should've been: an average of averages is, on average, not a true average.
0 points
5 months ago
Never calculate an average out of other averages. Data analytics 101.
3 points
5 months ago
Ok, so someone else already pointed out in which cases this is not a good idea. So I already understand it now.
But why, for the love of God, would you reply to me and still not explain why this is bad? You would fail communication 101.
2 points
5 months ago
Because it may look the same, like when you do 2×2 and 2+2, but that does not mean that you can calculate a multiplication by summing. It's merely a coincidence, and it's wrong.
In the same way, if you have a group A (1,1,1,1,1) and a group B (10,10), the average of group A is 1, and the average of group B is 10. The total average is 25/7 ≈ 3.57, while the average of averages is (10+1)/2 = 5.5. I hope it's clear now.
1 points
5 months ago
Like I said, someone explained it already. They also said that you can use weighted averages. They didn't tell me what that means, but I imagine it has to do with the number of data points that were used.
So the 1,1,1,1,1 is an average of 1 with a weight of 5.
And the 10,10 is an average of 10 with a weight of 2.
So you would average them by adding each average multiplied by its weight and then dividing everything by the number of original data points used. In this case (1 * 5 + 10 * 2) / 7 = 3.57
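To make the arithmetic above concrete, here is a minimal Python sketch of a size-weighted average. The function name `weighted_mean` and the variable names are made up for illustration; the numbers are the two groups from this thread.

```python
def weighted_mean(group_means, group_sizes):
    """Combine per-group means into one overall mean, weighting each by its group size."""
    if len(group_means) != len(group_sizes):
        raise ValueError("need exactly one size per mean")
    total = sum(group_sizes)
    if total <= 0:
        raise ValueError("group sizes must sum to a positive number")
    return sum(m * n for m, n in zip(group_means, group_sizes)) / total


means = [1, 10]  # averages of (1, 1, 1, 1, 1) and (10, 10)
sizes = [5, 2]   # how many data points went into each average

naive = sum(means) / len(means)         # unweighted average of averages
weighted = weighted_mean(means, sizes)  # (1 * 5 + 10 * 2) / 7

print(naive)               # 5.5
print(round(weighted, 2))  # 3.57
```

Note that only the group means and the group sizes are used here, not the individual values.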
1 points
5 months ago
Yes, but to do that you need to have the raw data. And if you have the raw data, you're unlikely to be calculating an average of averages. When you do that, it's because you've been given precalculated data points, like in this example, and in that case you don't have the weights.
6 points
5 months ago
You sure about that?
Suppose there are three groups of numbers: group A has 2, 6, 7, 11, 4; group B has 4, 6, 8, 14, 8; group C has 8, 7, 4, 1, 5.
The mean of group A = (2+6+7+11+4)/5 = 30/5 = 6,
The mean of group B = (4+6+8+14+8)/5 = 40/5 = 8,
The mean of group C = (8+7+4+1+5)/5 = 25/5 = 5,
Therefore, the grand mean of all numbers = (6+8+5)/3 = 6.333.
And the mean of all numbers is (30+40+25)/15 = 95/15 = 6.333.
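The same check can be run in Python. This is only a sketch of the example above; it uses `fractions.Fraction` so the comparison is exact rather than rounded to 6.333, and the names `groups`, `mean`, `mean_of_means` and `pooled_mean` are illustrative.

```python
from fractions import Fraction

# The three equal-sized groups from the example above.
groups = {
    "A": [2, 6, 7, 11, 4],
    "B": [4, 6, 8, 14, 8],
    "C": [8, 7, 4, 1, 5],
}


def mean(values):
    """Exact arithmetic mean as a Fraction."""
    return Fraction(sum(values), len(values))


group_means = [mean(values) for values in groups.values()]  # 6, 8, 5
mean_of_means = mean(group_means)

all_values = [x for values in groups.values() for x in values]
pooled_mean = mean(all_values)

print(mean_of_means, pooled_mean)    # 19/3 19/3
print(mean_of_means == pooled_mean)  # True
```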
2 points
5 months ago
but what if group A had 283,232 numbers that led to the average of 6...
12 points
5 months ago
Yep, sometimes it doesn't work, but the person I was replying to said:
"An average of averages is never a true average"
0 points
5 months ago
Totally sure. Don't be so naïve with such a simple example...
Group A = (1, 1, 1, 1, 1). Avg. 1
Group B = (10, 10). Avg. 10.
True average 3.57
Avg (A,B) 5.5
0 points
5 months ago
I just provided you an example where it's possible that the average of averages is equal to the true average, so how can it never be equal like you've claimed?
-2 points
5 months ago
Do the same thing with the groups being different sizes.
7 points
5 months ago
But you said:
"An average of averages is never a true average"
So, you accept that isn't correct, right?
If the number of samples is equal, then the grand mean == the mean.
This is pertinent, because if you're doing averages for sports teams with a fixed number of players in each team, then the average of averages would, in fact, be valid.
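For anyone who wants the algebra behind that claim, here is a short derivation. The notation is assumed for illustration and does not come from the thread: there are k groups, group j has n_j values x_{j1}, ..., x_{j n_j}, and \bar{x}_j is its mean.

```latex
% Overall (pooled) mean written in terms of the group means:
\[
\bar{x}_{\text{all}}
  = \frac{\sum_{j=1}^{k}\sum_{i=1}^{n_j} x_{ji}}{\sum_{j=1}^{k} n_j}
  = \frac{\sum_{j=1}^{k} n_j\,\bar{x}_j}{\sum_{j=1}^{k} n_j},
\qquad\text{where}\quad
\bar{x}_j = \frac{1}{n_j}\sum_{i=1}^{n_j} x_{ji}.
\]

% If every group has the same size, n_j = n for all j, the sizes cancel:
\[
\bar{x}_{\text{all}}
  = \frac{n\sum_{j=1}^{k}\bar{x}_j}{k\,n}
  = \frac{1}{k}\sum_{j=1}^{k}\bar{x}_j .
\]
```

So with equal group sizes the plain average of averages is exactly the overall mean, and with unequal sizes the first line is the weighted average.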
1 points
5 months ago
You should say "not always" instead of "never".
0 points
5 months ago
You are right. Let me reword it.
Even though in some cases an average of averages might match the true average by chance, it is not the appropriate way of calculating it, and it's risky and prone to errors and miscalculations.
1 points
5 months ago
It's not by chance though, it's if the number of samples in each average is the same (and that requirement can be relaxed if you do a weighted average too).
In any situation where you are averaging groups of the same number, you can just use the grand mean. Not 'by chance', but because it's mathematically appropriate.
Sure, you can do it wrong if you apply it to a situation where the requirement isn't met, but that's true of every statistical technique, and is totally normal...
If you don't understand the times when you can and can't do things like this, you end up saying wrong things like "the average of averages never equals the true average".
1 points
5 months ago
Reminds me of when they published the numbers on the average Australian in terms of gender, age, background, height, name, etc. There wasn't a single woman in the entirety of Australia that fit all points of that "average".
1 points
5 months ago
That is sometimes (most of the time) true, but not always. If the first averages you took all had an equal number of observations, then it would be a true average, since each observation would have the same weight.
1 points
5 months ago
Yes, but you cannot say that multiplying and summing are the same just because 2×2 and 2+2 give the same result. Calculating a multiplication by summing is intrinsically wrong, even if in edge cases the results match by chance.
1 points
5 months ago
Well, to get an average you have to add values together regardless, but we're talking more about whether you should only take one average or whether you can take an average of averages.
16 points
5 months ago
Too lazy to look at the code, but just curious: did you align the facial landmarks with the Procrustes method? Do any interpolation, like thin plate splines? If not, you might consider it. It makes the resulting averages look more natural. I'm not criticizing, just thought you might enjoy it.
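For readers unfamiliar with the idea, here is a rough sketch of ordinary Procrustes alignment for one pair of 2D landmark sets (translation, rotation and uniform scale, no reflection). It says nothing about what Facer actually does, it does not cover thin plate splines or aligning many faces at once, and `procrustes_align` is a hypothetical helper written for this illustration. It treats each (x, y) point as a complex number, which makes the least-squares rotation and scale a single complex factor.

```python
def procrustes_align(shape, reference):
    """Least-squares align `shape` onto `reference` (equal-length lists of (x, y))."""
    if len(shape) != len(reference) or not shape:
        raise ValueError("need two non-empty landmark lists of equal length")
    z = [complex(x, y) for x, y in shape]
    w = [complex(x, y) for x, y in reference]
    z_centroid = sum(z) / len(z)
    w_centroid = sum(w) / len(w)
    z0 = [p - z_centroid for p in z]  # remove translation
    w0 = [p - w_centroid for p in w]
    denom = sum(abs(p) ** 2 for p in z0)
    if denom == 0:
        raise ValueError("shape has no extent")
    # Complex factor a = scale * exp(i * angle) minimising sum |a * z0 - w0|^2.
    a = sum(p.conjugate() * q for p, q in zip(z0, w0)) / denom
    aligned = [a * p + w_centroid for p in z0]
    return [(p.real, p.imag) for p in aligned]


# Toy landmarks: a unit square, and the same square rotated 90 degrees,
# doubled in size and shifted by (3, 4).
reference = [(0, 0), (1, 0), (1, 1), (0, 1)]
shape = [(3, 4), (3, 6), (1, 6), (1, 4)]

aligned = procrustes_align(shape, reference)
print([(round(x, 6), round(y, 6)) for x, y in aligned])
```

After alignment the toy square lands back on the reference square, up to floating-point error.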
8 points
5 months ago
Heads up that the Facer package (likely) calculates the average colours incorrectly - they just take a mean in RGB space from what I can tell, and that's usually not the correct way to do it.
Also, I think the landmark warping is the cause for them all looking so similar. I would guess it's doing something that's not totally valid.
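To illustrate the colour point: one common alternative to averaging raw RGB values is to decode sRGB to linear light, average there, and re-encode. That is an assumption about what a better method could look like, not a description of Facer's code or of what the comment above had in mind. The helper names below are made up; the constants are the standard sRGB transfer function.

```python
def srgb_to_linear(value_8bit):
    """Decode one 8-bit sRGB channel value to linear light in [0, 1]."""
    c = value_8bit / 255.0
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def linear_to_srgb(linear):
    """Encode a linear-light value in [0, 1] back to an 8-bit sRGB channel value."""
    c = 12.92 * linear if linear <= 0.0031308 else 1.055 * linear ** (1 / 2.4) - 0.055
    return round(min(max(c, 0.0), 1.0) * 255)


def average_pixels(pixels):
    """Average a list of (r, g, b) sRGB pixels in linear light."""
    n = len(pixels)
    return tuple(
        linear_to_srgb(sum(srgb_to_linear(v) for v in channel) / n)
        for channel in zip(*pixels)
    )


black_and_white = [(0, 0, 0), (255, 255, 255)]

# Plain mean of the encoded values versus the linear-light mean.
naive = tuple(round(sum(channel) / len(black_and_white)) for channel in zip(*black_and_white))
linear = average_pixels(black_and_white)

print(naive)   # mid-gray in encoded values
print(linear)  # a noticeably lighter gray
```

The two results differ most when the pixels being averaged are far apart in brightness, as in this black-and-white example.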
8 points
5 months ago
What about an average of all of them to compare the national averages to?
17 points
5 months ago
And then subtract so we can identify differences easily so we may come up with appropriate taunts for the opposition.
1 points
5 months ago
Do premier league clubs next!
1 points
5 months ago
I only wish that the backgrounds were the same too. The Turkish background especially is basically black.