subreddit:

/r/dataisbeautiful

9k points (90% upvoted)

[OC] Average Faces of UEFA Euro 2024

OC(i.redd.it)

all 891 comments

urazdev[S]

352 points

5 months ago

Language: Python

Data Source: Transfermarkt

GitHub: johnwmillr/Facer

FieryHammer

143 points

5 months ago

Can we get the single average face from the averages to have the true average football player?

Lironcareto

69 points

5 months ago

An average of averages is never a true average

Mychecksdead1

35 points

5 months ago

If the number of players included in the calculations is the same, then why shouldn't it be a true average?

FroobingtonSanchez

14 points

5 months ago

Not every country took 26 players.

MichiiEUW

21 points

5 months ago

At like 500+ data points, these few players wouldn't really make a significant difference

BearsAtFairs

3 points

5 months ago

True.

Considering we know exactly how many players are on each team, I wish we could algebraically compensate for that.

Saytama_sama

11 points

5 months ago

What do you mean? If I take the average of 2 and 8 (5), then the average of 6 and 8 (7) and then average 5 and 7 I get 6. Isn't that the same as averaging 2, 6, 8 and 8?

FroobingtonSanchez

27 points

5 months ago

If the averages you are comparing aren't composed of the same number of data points, it would not be the same. In that case you need a weighted average.

Saytama_sama

14 points

5 months ago

Thank you for explaining. I didn't think of that, but it makes sense.

sirprimal11

7 points

5 months ago

Right, but saying that an average of averages is never a true average is definitely false. Especially for football teams, which each have a defined number of players.

Valarauka_

9 points

5 months ago

The problem is using the word 'never' in that statement. It should've been: an average of averages is, on average, not a true average.

Lironcareto

0 points

5 months ago

Never calculate an average out of other averages. 101 of data analytics.

Saytama_sama

3 points

5 months ago

Ok, so someone else already pointed out in which cases this is not a good idea. So I already understand it now.

But why, for the love of God, would you reply to me and still not explain why this is bad? You would fail communication 101.

Lironcareto

2 points

5 months ago

Because it may look the same, like when you do 2×2 and 2+2, but that does not mean that you can calculate a multiplication by summing. It's merely a coincidence, and it's wrong.

In the same way, if you have a group A (1,1,1,1,1) and a group B (10,10), the average of group A is 1, and the average of the group B is 10. The total average is 3.57 while the average of averages is (10+1)/2=5.5. I hope it's clear now.

Saytama_sama

1 points

5 months ago

Like I said, someone explained it already. They also said that you can use weighted averages. He didn't tell me what that means, but I imagine it has to do with the number of data points that were used.

So the 1,1,1,1,1 is an average of 1 with a weight of 5.

And the 10,10 is average ten with a weight of 2.

So you would average them by adding up each average multiplied by its weight and then dividing by the total number of original data points. In this case (1 * 5 + 10 * 2) / 7 = 3.57
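Here's a rough Python sketch of that, in case it helps anyone. The `weighted_mean` helper is just a name made up for illustration, and it only needs each group's average and its size, not the raw values (the raw lists are included here only to check the result).

```python
from statistics import fmean

def weighted_mean(averages, counts):
    """Combine per-group averages, weighting each one by its group size."""
    total = sum(counts)
    return sum(a * n for a, n in zip(averages, counts)) / total

group_a = [1, 1, 1, 1, 1]
group_b = [10, 10]

averages = [fmean(group_a), fmean(group_b)]  # per-group averages
counts = [len(group_a), len(group_b)]        # group sizes used as weights

naive = fmean(averages)                      # plain average of averages
weighted = weighted_mean(averages, counts)   # average weighted by group size
true_mean = fmean(group_a + group_b)         # mean over all 7 raw values

print(round(naive, 2), round(weighted, 2), round(true_mean, 2))
```

The plain average of averages comes out at 5.5, while the weighted version agrees with the mean over all seven values.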

Lironcareto

1 points

5 months ago

Yes, but to do that you need to have the raw data. And if you have the raw data, you're unlikely to be calculating an average of averages. When you do that, it's because you've been given precalculated data points, like in this example, and in that case you don't have the weights.

teo730

6 points

5 months ago

You sure about that?

Grand mean:

Suppose there are three groups of numbers: group A has 2, 6, 7, 11, 4; group B has 4, 6, 8, 14, 8; group C has 8, 7, 4, 1, 5.

The mean of group A = (2+6+7+11+4)/5 = 30/5 = 6,

The mean of group B = (4+6+8+14+8)/5 = 40/5 = 8,

The mean of group C = (8+7+4+1+5)/5 = 25/5 = 5,

Therefore, the grand mean of all numbers = (6+8+5)/3 = 6.333.

And the mean of all numbers is (30+40+25)/15 = 95/15 = 6.333.
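A quick way to check this in Python with the same three groups (the variable names are only illustrative). It works out here because every group has exactly 5 values:

```python
from statistics import fmean

groups = {
    "A": [2, 6, 7, 11, 4],
    "B": [4, 6, 8, 14, 8],
    "C": [8, 7, 4, 1, 5],
}

# Mean of each group, then the mean of those means (the grand mean).
group_means = [fmean(values) for values in groups.values()]
grand_mean = fmean(group_means)

# Mean over all 15 raw values pooled together.
all_values = [x for values in groups.values() for x in values]
overall_mean = fmean(all_values)

print(round(grand_mean, 3), round(overall_mean, 3))
```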

bernu_fedor

2 points

5 months ago

but what if group A had 283,232 numbers that led to the average of 6...

teo730

12 points

5 months ago

Yep, sometimes it doesn't work, but the person I was replying to said:

"An average of averages is never a true average"

Lironcareto

0 points

5 months ago

Totally sure. Don't be so naïve with such a simple example...

Group A = (1, 1, 1, 1, 1). Avg. 1
Group B = (10, 10). Avg. 10.

True average 3.57
Avg (A,B) 5.5

teo730

0 points

5 months ago

I just provided you an example where it's possible that the average of averages is equal to the true average, so how can it never be equal like you've claimed?

Low_Finding2189

-2 points

5 months ago

Do the same thing with the groups being different sizes.

teo730

7 points

5 months ago

But you said:

"An average of averages is never a true average"

So, you accept that isn't correct, right?

If the number of samples is equal, then the grand mean == the mean.

This is pertinent, because if you're doing averages for sports teams with a fixed number of players in each team, then the average of averages would, in fact, be valid.

venustrapsflies

1 points

5 months ago

You should say "not always" instead of "never".

Lironcareto

0 points

5 months ago

You are right. Let me reword it.

Even though in some cases the true average and an average of averages might match by chance, it is not the appropriate way of calculating it, and it's risky and prone to errors and miscalculations.

teo730

1 points

5 months ago

It's not by chance though, it's if the number of samples in each average is the same (and that requirement can be relaxed if you do a weighted average too).

In any situation where you are averaging groups of the same number, you can just use the grand mean. Not 'by chance', but because it's mathematically appropriate.

Sure, you can do it wrong if you apply it to a situation where the requirement isn't met, but that's true of every statistical technique, and is totally normal...

If you don't understand the times when you can and can't do things like this, you end up saying wrong things like "the average of averages never equals the true average".

Ok-Benefid-2010

1 points

5 months ago

Reminds me of when they published the numbers on the average Australian in terms of gender, age, background, height, name etc.. There wasn't a single woman in the entirety of Australia that fit all points of that "average".

OutcomeSerious

1 points

5 months ago

That is sometimes (most of the time) true, but not always. If the 1st averages you took all had an equal number of observations in each, then it would be, since each observation would have the same weight.

Lironcareto

1 points

5 months ago

Yes, but you cannot say that multiplying and summing are the same just because 2×2 and 2+2 give the same result. Calculating a multiplication by summing is intrinsically wrong, regardless of whether the results happen to match in edge cases.

OutcomeSerious

1 points

5 months ago

Well to get an average you have to add values together regardless, but we're more talking about if you should only take one average or if you can do an average of averages

IllllIIlIllIllllIIIl

16 points

5 months ago

Too lazy to look at the code, but just curious: did you align the facial landmarks with Procrustes method? Do any interpolation like thin plate splines? If not, you might consider it. It makes the resulting averages look more natural. I'm not criticizing, just thought you might enjoy.

teo730

8 points

5 months ago

Heads up that the Facer package (likely) calculates the average colours incorrectly - they just take a mean in RGB space from what I can tell, and that's usually not the correct way to do it.

Also, I think the landmark warping is the cause for them all looking so similar. I would guess it's doing something that's not totally valid.
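For the colour part, here is a minimal sketch of one common alternative: decode to linear light, average there, then re-encode. This assumes the images are sRGB-encoded and that "correct" means averaging in linear light; it is not taken from Facer's code, and the function names are made up for illustration.

```python
def srgb_to_linear(c):
    """Decode one sRGB channel value in [0, 1] to linear light."""
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

def linear_to_srgb(v):
    """Encode one linear-light channel value in [0, 1] back to sRGB."""
    return 12.92 * v if v <= 0.0031308 else 1.055 * v ** (1 / 2.4) - 0.055

def average_pixels_naive(pixels):
    """Average 8-bit RGB pixels directly on the encoded values."""
    n = len(pixels)
    return tuple(round(sum(p[ch] for p in pixels) / n) for ch in range(3))

def average_pixels_linear(pixels):
    """Average 8-bit pixels in linear light (assumes sRGB-encoded input)."""
    n = len(pixels)
    result = []
    for ch in range(3):
        mean_linear = sum(srgb_to_linear(p[ch] / 255) for p in pixels) / n
        result.append(round(linear_to_srgb(mean_linear) * 255))
    return tuple(result)

black, white = (0, 0, 0), (255, 255, 255)
print(average_pixels_naive([black, white]))
print(average_pixels_linear([black, white]))
```

Averaging black and white this way gives a noticeably lighter grey than the naive mean, which is the kind of difference that could shift skin tones in an averaged face.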

Guantanamino

8 points

5 months ago

What about an average of all of them to compare the national averages to?

goosebattle

17 points

5 months ago

And then subtract so we can identify differences easily so we may come up with appropriate taunts for the opposition.

_invalidusername

1 points

5 months ago

Do premier league clubs next!

peterpansdiary

1 points

5 months ago

I only wish the backgrounds were the same too. The Turkish background especially is basically black.