royski
Golden Hoya (over 1000 posts)
Posts: 2,293
|
Post by royski on Nov 23, 2009 19:42:47 GMT -5
|
|
SFHoya99
Blue & Gray (over 10,000 posts)
Posts: 17,744
|
Post by SFHoya99 on Nov 23, 2009 19:56:56 GMT -5
Interesting, though it immediately became suspect when I saw they had a 365-game sample.
Umm, there are like 5,000 D-I college basketball games a year. So while it is intriguing, I'm curious as to why the heck you'd limit your sample so very much.
|
|
FLHoya
Diamond Hoya (over 2500 posts)
Proud Member of Generation Burton
Posts: 4,544
|
Post by FLHoya on Nov 23, 2009 20:00:55 GMT -5
So while it is intriguing, I'm curious as to why the heck you'd limit your sample so very much.

Probably came up with the idea around the time they started watching the Big Ten games.
|
|
|
Post by Coast2CoastHoya on Nov 23, 2009 21:34:21 GMT -5
Interesting, though it immediately became suspect when I saw they had a 365-game sample. Umm, there are like 5,000 D-I college basketball games a year. So while it is intriguing, I'm curious as to why the heck you'd limit your sample so very much.

I'm more interested in the fact that they only looked at "lift off" halves, despite how skewed the intentional fouls at the end of games would make the numbers. I wonder how looking at some non-skewed portion of second halves would alter the numbers, if at all.

Honest question: why does a sample of only 365 games out of 5,000 (i.e. 7.3%) make the numbers suspect? What if they only looked at major conferences? Wouldn't that make them more relevant for us? Don't we rely on a ridiculously small sample set (something like 1,000 out of 300 million) for "scientific" political polls? Just curious, 'cause it would be nice to see if the numbers would change if more games had been studied.
|
|
bmartin
Golden Hoya (over 1000 posts)
Posts: 2,459
|
Post by bmartin on Nov 23, 2009 22:16:21 GMT -5
To study the home effect, they would need to look at conference games where the strength of the home and away teams balance out. In nonconference play there are too many guarantee games in which the visitors are totally overmatched. The home team's foul advantage could just be a competitive advantage. Weak visiting teams are likely to foul more if they cannot guard the home team but are themselves very guardable.
|
|
SFHoya99
Blue & Gray (over 10,000 posts)
Posts: 17,744
|
Post by SFHoya99 on Nov 23, 2009 22:59:26 GMT -5
A 7.3% sample would probably work if you were certain it was a truly random sample.
The reason why political polls suck (or rather, a reason why) is that they assume their sampling is random, when it is impossible for political polls to be random (choosing to answer is self-selection and therefore ruins any shot at claiming randomness). Basically, our news organizations don't do any critical thinking -- which is why those polls are considered "fact."
It's easier to do a random sampling in a study like this one, but because there's several years of PBP data online and people can grab scripts to do the research (and especially on the NBA), I kinda wonder why you would restrict your data set so extremely. I mean, just in the Big East, there were what, 144 Big East games before the tourney? In one year? What if they just took the first 365 games of the year for example?
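For what it's worth, once the PBP is scraped, drawing a truly random 365 games instead of an arbitrary 365 is a one-liner. A sketch with hypothetical game IDs (the ~5,000 figure is from the thread; everything else here is illustrative):

```python
import random

# Hypothetical game IDs for one D-I season (~5,000 games); in practice these
# would come from the scraped play-by-play pages discussed above.
all_games = [f"game_{i:04d}" for i in range(5000)]

rng = random.Random(2009)            # seeded so the draw is reproducible
sample = rng.sample(all_games, 365)  # 365 games drawn without replacement

# Unlike "the first 365 games of the year," this draw isn't skewed toward
# November/December nonconference blowouts.
print(len(sample), len(set(sample)))  # 365 365
```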
The other big issue with studies like these is demonstrated by what bmartin brings up. When academics get into sports they often don't know what variables to test and adjust to. Some studies are obviously clueless -- this one seems better -- but it's one of those things where I'd actually like to read the paper and see the methodology before I agreed too much.
I do think it is interesting that the calls almost alternate, and that they seem to naturally even out. I can even see why. But how much of that is in games that are not decided?
And how does that explain this: the rate of opponents' FTs as a % of opponents' FGAs, by team -- in other words, the fouls that count for something -- ranged from 19.5% to 57.7% last year.
Does that sound like their conclusion that the same number of fouls are going to be called, so just be really aggressive? Connecticut went to a Final Four on the opposite defensive strategy.
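For reference, reading SF's stat as opponents' free throws over opponents' field goal attempts, it's just box-score arithmetic. A sketch with made-up season totals (illustrative only, not the real 2008-09 figures):

```python
# Made-up season totals of opponents' FTA and FGA for two hypothetical teams;
# the point is only the arithmetic, not the actual numbers.
opp_totals = {
    "Team A": (450, 2300),  # (opponents' FTA, opponents' FGA)
    "Team B": (980, 1700),
}
for team, (fta, fga) in opp_totals.items():
    # A wide spread in this ratio suggests defensive style changes how many
    # "fouls that count" get called, contra the everyone-fouls-alike reading.
    print(f"{team}: opponents' FTA/FGA = {fta / fga:.1%}")
```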
That's just one quick example, and why I hate studies like these where they don't explain methodology.
|
|
|
Post by thejerseytornado on Nov 24, 2009 9:47:11 GMT -5
i'm sure the study does explain its methodology, but the espn article about the study doesn't. If you read the study, i'd bet they explain everything you ask. And the academic you criticize for not knowing about sports is one who played div 3 college ball.
Also, political polls are actually surprisingly reliable--once you aggregate them. fivethirtyeight.com is an excellent source of said aggregation.
Finally, % of the population is not the important factor in whether or not a sample is legitimate. It's the sample size that matters more. And 365 games isn't the sample size: each foul is treated as an incident, and those fouls are the basis of the sample. The sample's quite large.
|
|
SFHoya99
Blue & Gray (over 10,000 posts)
Posts: 17,744
|
Post by SFHoya99 on Nov 24, 2009 10:51:17 GMT -5
i'm sure the study does explain its methodology, but the espn article about the study doesn't. If you read the study, i'd bet they explain everything you ask. And the academic you criticize for not knowing about sports is one who played div 3 college ball.

Touchy, much? My direct quote relevant to your criticism was: "The other big issue with studies like these is demonstrated by what bmartin brings up. When academics get into sports they often don't know what variables to test and adjust to. Some studies are obviously clueless -- this one seems better -- but it's one of those things where I'd actually like to read the paper and see the methodology before I agreed too much."

In other words, yes, I understand that the ESPN article is not the study itself, and yes, I'd like to read the study before I pass judgement. And yes, this guy does seem like he knows more, but I'd still like to read the study.

I read fivethirtyeight all the time, and I doubt Nate Silver would agree with your conclusion about political polls. What I was referring to is the ridiculous +/- error bars they put on them, making the assumption that they are random. They aren't, and those polls are awful. It is 100% the reason why the networks screwed up Florida in 2000 -- not understanding that states don't report vote totals randomly; reporting is geographically and thus demographically slanted, for instance.

The whole reason Silver aggregates is that he doesn't trust the polls and their methodology. Of course, he doesn't just aggregate. Aggregation would be enough if the samples were random. But they aren't just tiny samples -- they are flawed -- and because of that, Silver:
- Aggregates across polls
- Aggregates across times of polls
- Adjusts for left/right bias
- Regresses the findings based on historical voting
- Regresses the findings based on a series of demographic data
So he's not just aggregating. Finally, he still has some issues with the randomness of the sample -- for example, the likely voter issue, or the fact that cell phones aren't on here, so younger people are less likely to get polled. And all that's with a question -- who are you voting for in a hugely involved election (President) -- that isn't easily biased. Once you get into policy, how you ask the question is enormous and completely skews numbers. Even approval ratings are easily skewed. Finally, you have to factor in outright fraud, which Silver has found now as well.

So no, I wouldn't say political polls are done for accuracy. They aren't intended that way. Silver does a fantastic job, but the polls posted on CNN or Fox or whatever are there to drive news (for the network -- which is why they always pick the 1 outlier poll of 100 existing) and public opinion (by the biased people who commissioned them).

Are you really arguing that the % of games is not somehow proportional to the % of plays? You determine a proper sample size based partially on population. But actually, I'm not arguing that the sample size calculation is wrong. I'm arguing that a) I don't know if it is truly random, given the small numbers they used, and b) given the information out there, I'm unsure as to why they used such a small sample.

I get why political polls do -- you run on short timelines and you need someone to actually talk to you. It costs a lot of money, relatively speaking. But this study? PBP data is out there. It's usually in the same format. Ask CO -- there's data issues, but I think he found that something like 15 of the 16 BE teams, for example, post the same PBP in the same format. You write a script to grab the data and the work is all upfront. After that, you could increase your sample size.

I'm reserving judgement until then. But I've seen enough studies with shocking results that don't hold up.
"Eh, officials even games when they are out of reach" doesn't have the same pull, does it? I see the full season foul numbers, and it doesn't agree with the concept that everyone fouls at the same rate.
|
|
|
Post by thejerseytornado on Nov 24, 2009 11:32:55 GMT -5
lol if you think my reply was touchy.
anyway...you speak about "reading the study before passing judgement" yet you take significant shots at its (unknown) methodology and author (and were wrong about that one) without reading it. that doesn't pass the smell test to me.
The networks screwed up in Florida because occasional screwups will happen in random sampling, even completely random sampling. It happens, but rarely. Fraud happens, but rarely. Aggregating semi-random polls and adjusting makes it even better.
Also, polls done for networks aren't "all political polls" only a subset. and i agree, fox news polls and such are generally the least reliable (excluding the ridiculousness of internal polling).
but this is hoya talk, not political talk...so let's table that.

365 games is a small sample for a multivariate regression. However, say there are about 40 fouls a game...suddenly the sample is 14,600. Large sample. cmon. follow along. And increasing sample size doesn't really change things once your sample is that big. It's a strong sample, unless you want to claim bias. Running a complex stats program with a sample in the hundreds of thousands or millions is a waste of computing power if you have a large enough random sample already. That's the entire point and beauty of statistics!
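The "increasing sample size doesn't really change things" claim can be sketched with the textbook 95% margin of error for a proportion, under the simple-random-sample assumption (the independence of those 14,600 fouls being the contested part):

```python
import math

def margin_of_error(p, n, z=1.96):
    """95% margin of error for a proportion p from a simple random sample of size n."""
    return z * math.sqrt(p * (1 - p) / n)

# ~40 fouls/game * 365 games = 14,600 observations; compare a 10x larger sample.
for n in (14_600, 146_000):
    print(f"n={n}: +/- {margin_of_error(0.5, n):.2%}")  # p=0.5 is the worst case
```

Tenfold more data only shrinks the worst-case error bar from roughly +/-0.8% to +/-0.3%, which is the diminishing-returns point.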
They used the smaller sample because it's parsimonious to do so. This was a side project, a pet project. You don't get tenure as a prof doing these types of cute studies.
The argument isn't that everyone fouls at the same rate, but that in an individual game, the two teams get roughly the same number of fouls called. Example with random numbers: a game @ the verizon: 50 fouls total; a game @ fedex: 30 fouls called -- but split evenly between the teams, in general. Aggregated to a full season, georgetown will foul more than memphis, but if they faced each other, they'd likely get the same number of fouls called, roughly.
unfortunately, penn appears not to get the journal with this article, so i can't actually read it.
|
|
SFHoya99
Blue & Gray (over 10,000 posts)
Posts: 17,744
|
Post by SFHoya99 on Nov 24, 2009 12:12:52 GMT -5
lol if you think my reply was touchy.

That's how it came across. If that wasn't what was intended, I apologize, but you keep saying things like "criticize" and "passing judgement." I criticized one thing: sample size. The rest of the methodology -- I said I'd wait and see. I said nothing specific about the author; just that academic studies are often flawed by a lack of knowledge of the subject.

The error was pretty simple. They got 3% of the vote in (or whatever) and called it. They did that constantly, and it was based on reported vote totals, ignoring geographic bias. It was a remarkably amateur screwup that occurred, I suspect, because the quickest to call gets to crow.

I'm not claiming anything until I see the methodology. What I am claiming is that the 14,600 fouls cannot be random unless the 365 games you chose were random. Fouls are biased, very simply, by the games they occur in. You're taking 100% of the fouls from a single game and 0% from many, many others. It's not a random sampling of fouls. It's biased by who is playing, who is officiating, possibly by time of year, etc. Especially since some of their research is not on fouls, but the context of fouls within games -- you can't say the sample is 14,600 when you talk about sequences or context.

That's not to say you can't adjust for that bias. But it does mean that you better not have 144 of your 365 in the Big East or in November and December. You better look at games that are between closely matched teams and not use blowouts (where any official would admit they put away their whistles for the leading team).

Come on. We're talking about multiplying your data set by ten or twenty. I've seen someone run regressions on 12 million lines of data in SQL in about three hours, coding time included. The time is in data gathering, not processing. And that's not that bad here.
I'm sorry -- the level of detail is like something I'd write for Hoya Prospectus (okay, it's much better than that, but still), not something I'd issue off to ESPN.

Good point. Which is why I'd love to read the original, but can't seem to find it. I'm not saying the study is junk; I'm saying a lot of these "sports studies" are junk, and I don't know about this one. The sample smells fishy to me, though, unless they adjusted for it. And if I were going to get the press I got on this, and there's PBP available widespread online...I'd use it.
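The worry about taking 100% of the fouls from a handful of games is what survey statisticians call cluster sampling, and its cost is usually summarized by the Kish design effect. A sketch, with the intracluster correlation (how alike fouls within one game are) set to an assumed, purely illustrative 0.1:

```python
def effective_sample_size(n_clusters: int, cluster_size: int, icc: float) -> float:
    """Kish effective sample size: n / DEFF, where DEFF = 1 + (m - 1) * ICC
    for clusters of size m with intracluster correlation ICC."""
    n = n_clusters * cluster_size
    deff = 1 + (cluster_size - 1) * icc
    return n / deff

# 365 games * ~40 fouls each, with an assumed within-game correlation of 0.1:
print(round(effective_sample_size(365, 40, 0.1)))  # far below the nominal 14,600
```

Even a modest within-game correlation cuts the effective sample to a few thousand, so "14,600 observations" overstates the information content unless the games themselves were drawn randomly.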
|
|
|
Post by Coast2CoastHoya on Nov 24, 2009 12:39:54 GMT -5
but this is hoya talk, not political talk...so let's table that.

um ... i know you're new and all, but have you been to the Blue and Gray board? Nary a better political talk board exists. I get a lot of my news and opinions there!
|
|
|
Post by FromTheBeginning on Nov 24, 2009 12:56:23 GMT -5
Also seems strange that in an article claiming a "home court advantage," the lead example is a neutral-court game.
|
|
|
Post by thejerseytornado on Nov 24, 2009 12:58:02 GMT -5
I criticized one thing: sample size. The rest of the methodology -- I said I'd wait and see. I said nothing specific about the author; just that academic studies are often flawed by a lack of knowledge of the subject.

a claim that was pretty much disproven by the espn article, in which it noted he played college ball (albeit div 3). At the very least, you didn't give much of a chance to the study.

Ok, so assume the 365 were chosen as a random representative sample (a reasonable enough assumption; if they weren't, it'd be the first thing noted in any peer review) of major conference games. boom--random. so what's the criticism/concern? these are all easily controlled for and tested theories of possible bias. Things that wouldn't be in an espn press release but would be in the article itself.

is it? And again, 3 hours to run a regression...knowing economists, they like to run really Editeding fancy (unnecessarily so) models with extensive computing time and then simulate and then do it over and over again. For a weekend article/lark, a sample of 14,600 is enough. Would it be better with more? sure. Is that really necessary or a significant concern? nah. I mean, yeah, the data is probably why this is in some random journal of sports science instead of a top journal in ISI. but i'm not sure it's biased because of it. Worth reading their data section more carefully than normal, perhaps.

wait, what? first, the authors didn't do that...it was probably the journal that sent anything to espn. Second, to get it onto espn, it has to be interesting to a random sports fan, not someone who knows anything about how to run stats. cmon. you're throwing darts just to throw them.

I don't judge articles by the press release about them. Actually, the stuff that gets mainstream press often has the worst data and argument. Sexy ideas sell, not impressive data. sad but true. A lot of the studies might be junk, but i'm not going to judge it till I see it.
Though the fact that you and I both can't access it and we both seem to know the field ain't a good sign.
|
|
|
Post by thejerseytornado on Nov 24, 2009 12:58:34 GMT -5
but this is hoya talk, not political talk...so let's table that. um ... i know you're new and all, but have you been to the Blue and Gray board? Nary a better political talk board exists. I get a lot of my news and opinions there!

that it is, but this is supposed to be 'bout basketball in this forum ;D
|
|
SFHoya99
Blue & Gray (over 10,000 posts)
Posts: 17,744
|
Post by SFHoya99 on Nov 24, 2009 13:42:55 GMT -5
And that's exactly why I'd want to read the study, if I could. Like I said, intriguing.
If you find it, let me know.
|
|
hifigator
Platinum Hoya (over 5000 posts)
Posts: 6,387
|
Post by hifigator on Nov 24, 2009 14:01:30 GMT -5
SF wrote:
What I was referring to is the ridiculous +- error bars they put on them, making the assumption that they are random. They aren't, and those polls are awful.
It is 100% the reason why the networks screwed up Florida in 2000 -- not understanding that states don't report vote totals randomly, it is geographically and thus demographically slanted, for instance.
I agree that it is part of the reason, but certainly not 100%. The alphabet networks wanted Gore to win. They knew Florida was going to be very close. By calling it early for Gore, they were hopeful that voters in the panhandle -- traditionally a much more conservative base -- would be less likely to get out and vote, since the result was already decided. I still remember the "corrections" that came out -- I want to say around 9 or 9:30 -- where they then said the state was "too close to call." But the most telling footage came around 1 or 2 am, when they finally called the state for Bush. All of the big three anchors -- I think it was Jennings, Brokaw and Rather at the time -- had a somber expression and all looked like their dog had just died. That certainly wasn't "news."
|
|
SFHoya99
Blue & Gray (over 10,000 posts)
Posts: 17,744
|
Post by SFHoya99 on Nov 24, 2009 14:22:44 GMT -5
Sigh.
You can assign motivations all you like, but the anchors aren't making the calls. The thing is, this use of a non-random sample as a random sample had been going on for years and years, but 2000 was the first time it really bit them.
I'm not getting into the stupid "media is biased" argument because everyone believes the world is against them. Needless to say, you've presented one of many human motivations (maybe the guy in charge wanted Bush to win and hoped calling it for Gore would make Dems say "I don't need to vote"? or maybe the network just wanted to be the first to call it).
You're right my reason is not 100% (sue me for hyperbole), but going down the personal motivation road assumes you know:
- Who made the call
- Who had influence on the call
- How much thought went into it
- How ethical that person or those people are
- What effect on future voters those people think their judgement will have
Eh, I'm just basing it on competence and a rather elementary error that occurs in almost every political poll I see.
|
|
rosslynhoya
Diamond Hoya (over 2500 posts)
Posts: 2,595
|
Post by rosslynhoya on Nov 24, 2009 14:48:30 GMT -5
I'm surprised that ESPN allowed this citation to be included in the article:
"The professors also cited an earlier study that concluded there were more calls against teams ahead in games on national TV versus those ahead in locally televised games. Calling fouls against the leading team tends to keep games closer, the studies said."
Who could possibly be influencing the refs' decision-making in those circumstances? Hmmm?
|
|
whatmaroon
Silver Hoya (over 500 posts)
Posts: 819
|
Post by whatmaroon on Nov 24, 2009 15:58:24 GMT -5
Cite: Anderson, K. & Pierce, D. (2009). Officiating bias: The effect of foul differential on the subsequent fouls in NCAA basketball. Journal of Sports Sciences, 27(7), 687-694. It doesn't seem to be freely available online, but maybe some people have academic subscriptions. Journal of Sports Sciences 27(7) is available at www.informaworld.com/smpp/title~db=all~content=g911454538.
|
|
|
Post by thejerseytornado on Nov 24, 2009 16:08:03 GMT -5
i've requested a copy...i'll report back once i get a pdf of it.
|
|