tag:blogger.com,1999:blog-13265058.post5715534316768828404..comments2014-10-06T15:27:42.602+11:00Comments on Sydney Oracle Lab: Why we learn mathsGary Myersnoreply@blogger.comBlogger8125tag:blogger.com,1999:blog-13265058.post-1140998589662759872011-05-11T15:15:05.579+10:002011-05-11T15:15:05.579+10:00Chen:
I did not say that p-values are calculated o...Chen:<br />I did not say that p-values are calculated on the sample as a whole.<br />Like I said: google the term. It's worth it.Noonshttp://www.blogger.com/profile/04285930853937157148noreply@blogger.comtag:blogger.com,1999:blog-13265058.post-30710753749448188042011-05-08T16:53:04.983+10:002011-05-08T16:53:04.983+10:00@Gwen,
Thanks. I've done a follow up about th...@Gwen,<br /><br />Thanks. I've done a follow up about the death of delete. I'll have to stretch my maths out to come up with a more extensive description of where update concurrency starts to break things.Gary Myershttp://www.blogger.com/profile/10404756950638119562noreply@blogger.comtag:blogger.com,1999:blog-13265058.post-49983416309532048472011-05-08T16:51:25.476+10:002011-05-08T16:51:25.476+10:00@Hemant
Happy for the UI (or a mid-tier layer) to...@Hemant<br /><br />Happy for the UI (or a mid-tier layer) to calc an average as TOTAL/COUNT and maybe show "AVG 6.5 from 12 reviews" or whatever. <br /><br />Agree that a small number of samples has little meaning. More so if people are volunteering opinions rather than a really random sample.<br /><br />DBMS_STATS determining averages/min/max from sampling would make an interesting post. But I don't think my maths is up to it. Perhaps Craig Shallahamer might take it up.Gary Myershttp://www.blogger.com/profile/10404756950638119562noreply@blogger.comtag:blogger.com,1999:blog-13265058.post-8271065415099111932011-05-08T16:45:57.812+10:002011-05-08T16:45:57.812+10:00@Chet. Performance should ALWAYS be a consideratio...@Chet. Performance should ALWAYS be a consideration. General rule of thumb would be where the work involved in maintaining the summary is outweighed by the work saved by not recalculating for each query.<br /><br />But that is pretty vague as the former includes work in development / maintenance rather than just work by the DB engine.<br /><br />Volume of data, rapidity of change, consistency all come into play.Gary Myershttp://www.blogger.com/profile/10404756950638119562noreply@blogger.comtag:blogger.com,1999:blog-13265058.post-24649232317984677672011-05-07T09:12:52.711+10:002011-05-07T09:12:52.711+10:00Excellent post. Especially the point about the que...Excellent post. Especially the point about the queues, I've been trying to get people to do this forever.<br /><br />Since it's a bit of a non-DBA or noob-friendly post, I think more of an explanation of why inserts are not a concurrency issue, but updates can be, would help some readers. 
I know why, but I'm not sure it's completely intuitive.<br /><br />And noons:<br /><br />P-value is the probability that if what you want to say about your data is wrong, you still got the results you did in your experiment by pure chance.<br />The lower the probability of getting your results by chance even if you are wrong, the more significant your results are.<br /><br />and as far as I know - p-values are calculated on analyzed data (averages and variance), not on the sample as a whole.Chen Shapirahttp://www.blogger.com/profile/14535067086703072776noreply@blogger.comtag:blogger.com,1999:blog-13265058.post-40890629865073499282011-05-06T18:57:59.196+10:002011-05-06T18:57:59.196+10:00Good points, folks.
One of the things I had to ...Good points, folks. <br /><br />One of the things I had to get my head around during the Statistics and Probability semester in uni was the notion of "p-values". Google it. <br /><br />In a nutshell: they show us the degree of probability that a given sample is significant before we publish any ratios based on that sample. <br /><br />Rarely calculated nowadays, where simply doing a ratio is considered by newspapers as a "statistic".Noonshttp://www.blogger.com/profile/04285930853937157148noreply@blogger.comtag:blogger.com,1999:blog-13265058.post-82200225909334624942011-05-06T12:02:32.419+10:002011-05-06T12:02:32.419+10:00"I've got 10 reviews with a total rating ..."I've got 10 reviews with a total rating of 70" (for "Thor")<br /><br />But what if "Fast and Furious" has a total rating of 40 but only 5 reviewers ?<br /><br />Would people look at "70 and 10" versus "40 and 5" (i.e. a pair of figures each) when comparing the ratings for two movies ? Or would it be easier to look at "7" and "8" ? 
If you present two figures ("70" and "10") for the same movie, most people would read only 1 figure ("70").<br /><br />Then again, if only 1 person has given a rating of "10" to "Source Code", should we accept that this film has an average rating of "10" ?<br /><br />So, an average makes sense only after a minimum (threshold) number of ratings have been entered.<br /><br />If a movie has not been rated by at least 10 people, I would not present the average (or total) rating score at all.Hemant K Chitalehttp://www.blogger.com/profile/07369112096230549250noreply@blogger.comtag:blogger.com,1999:blog-13265058.post-87954673046719367432011-05-06T01:33:03.305+10:002011-05-06T01:33:03.305+10:00I've done something similar to the movie ratin...I've done something similar to the movie ratings, only it's financial transactions.<br /><br />On the account line, store Current Balance and any other things pertinent.<br /><br />From a purist perspective, what if performance were never a consideration? Maybe a small data set or you have something like Exadata? Would you then go the Parent/Child way and roll it up each time it was looked up?oraclenerdhttp://www.blogger.com/profile/12412013306950057961noreply@blogger.com