
Wednesday, December 14, 2011

Cell Phones, Driving, and Decision Errors

I suspect that the NTSB is right that, on net, the costs from distracted driving while talking on a mobile phone outweigh the benefits. But this is a classic example of how decision errors can creep into decision making. Consider the hypothesis that banning mobile phone use while driving will save lives.


                                            Ban phone use       Don't ban phone use
Phone use while driving causes              0 (no error)        Type I error
more deaths than it saves                                       (false negative)

Phone use while driving saves               Type II error       0 (no error)
more deaths than it causes                  (false positive)

Type I Error Costs: If distracted drivers only killed themselves, why stop them? The hidden cost is that they typically run into someone else or take their passengers with them. Drivers may be less careful with other people's lives. So the Type I error costs are all the innocents killed by distracted drivers. This has been the focus of most of the studies and the policy.

Type II Error Costs: Drivers with mobile phones have relayed information during Amber alerts. They have notified media outlets when traffic was bad due to wrecks, saving thousands of hours in commute time, during which some wreck could have occurred. They have been able to coordinate with the parties they call, getting directions, etc., which saves time that has some value. And of course, they derive utility from the call. But if your job is highway safety, as a board member of the NTSB, how much do you value these? Since no one writes news stories about the wreck that did not occur because of an accurate traffic update, these benefits are largely hidden. Also, if your job is highway safety, you may not value lost time and utility as much as those who must give them up.

Let me repeat that I suspect that banning mobile phone use while driving is likely appropriate. I am just not certain that all the costs went into the decision.

Friday, July 12, 2019

Does decreasing Type II errors lead to more Type I errors?

TYPE I ERROR:  False prosecution
TYPE II ERROR:  False non-prosecution

Until 2011, Title IX (the law that prohibits sex discrimination at federally funded schools) was rarely enforced because of a strict standard of proof. That changed under President Obama.  With lower standards of proof and evidence rules that favored prosecution, we should see Type II errors fall, but with a corresponding increase in Type I errors.
Defamation claims are the new legal tool for men to clear their name and get their accuser to drop sexual assault complaints, according to legal experts. The defamation cases usually end in settlements.

“Over the last three and half years, there’s been far more legal action brought by men charged by the institution with a sexual assault violation,” said Saunie Schuster, a lawyer who advises a range of colleges and co-founded the Association of Title IX Administrators. “The trend was for them to file an action against the institution for due process, but along the way, we started seeing them not just going to file action against the institution, but also civil actions against the victims.”

It is really hard to tell whether Type I or Type II errors changed from court filings alone, as the selection of cases for trial is not random.  In cases like this, the only recourse is to theory, like that taught in statistics class, which shows that reducing one type of error increases the other.
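To see that theory concretely, here is a minimal simulation sketch (the normal "evidence score" model, its means, and the thresholds are all illustrative assumptions, not anything from the post): lowering the standard of proof cuts false non-prosecutions but raises false prosecutions.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Hypothetical model: each case produces a noisy "evidence score";
# a case is prosecuted when the score clears a standard-of-proof threshold.
innocent = rng.normal(0.0, 1.0, n)  # evidence against innocent parties
guilty = rng.normal(2.0, 1.0, n)    # evidence against guilty parties

for threshold in (2.5, 1.5):  # lowering the standard of proof
    type_i = (innocent > threshold).mean()   # false prosecutions
    type_ii = (guilty <= threshold).mean()   # false non-prosecutions
    print(f"threshold {threshold}: Type I = {type_i:.1%}, Type II = {type_ii:.1%}")
```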

Wednesday, January 6, 2021

Are the FDA's incentives aligned with the goals of the people?

The FDA is resisting pressure from academics to vaccinate more people with single doses:
"At this time, suggesting changes to the FDA-authorized dosing or schedules of these vaccines is premature and not rooted solidly in the available evidence," Dr. Stephen Hahn, FDA commissioner, and Dr. Peter Marks, director of the FDA's Center for Biologics Evaluation and Research, said in a statement. "Without appropriate data supporting such changes in vaccine administration, we run a significant risk of placing public health at risk."

Apparently, these FDA bureaucrats want more information before making a decision. 

However, we know how to make decisions under uncertainty: minimize expected error costs or, equivalently, maximize expected value. The expected benefit of a one-dose regimen (vaccinating twice as many people with a single dose each) is millions of lives.
The simplest argument for First Doses First (FDF) is that 2 × 0.8 > 0.95, i.e., two people vaccinated with one dose each (roughly 80% efficacy) confer more immunity than one person vaccinated with two doses (roughly 95% efficacy). But there is more to it than that. Perhaps more important, with FDF we lower R more quickly and reach herd immunity sooner.
Here’s an extreme but telling example. Suppose you have a population of 300 million, need two-thirds of it vaccinated to reach herd immunity, have 100 million doses on hand, and can administer 100 million doses a month. With FDF you vaccinate 100 million people in the first month and a new 100 million in the second month, and then you are "done": you can give second doses more or less at leisure because you are at herd immunity (yes, I know about overshooting; this is a simple example). If instead you give second doses first, you vaccinate 100 million people in the first month and the same 100 million in the second month, which leaves 100 million at risk for another month; you don't reach herd immunity until the third month. Thus FDF saves 100 million infection-months, which is a big deal.
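A back-of-the-envelope sketch of that example (all numbers are the stylized assumptions above, in millions):

```python
POPULATION, HERD_SHARE, CAPACITY = 300, 2 / 3, 100   # people; share; doses/month
HERD_TARGET = POPULATION * HERD_SHARE                # 200M need at least one dose

def months_to_herd_immunity(first_doses_first: bool) -> int:
    """Months until HERD_TARGET people have at least one dose."""
    covered, month = 0, 0
    while covered < HERD_TARGET:
        month += 1
        # Under second-doses-first, even-numbered months are spent
        # re-dosing last month's cohort, so coverage does not grow.
        if first_doses_first or month % 2 == 1:
            covered += CAPACITY
    return month

fdf = months_to_herd_immunity(True)    # 2 months
sdf = months_to_herd_immunity(False)   # 3 months
print(f"Saved: {(sdf - fdf) * CAPACITY}M infection-months")  # 100M
```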

The FDA has a long and sorry history of delaying medical innovation because Type I errors (approving something that turns out to be harmful) are visible, while Type II errors (failing to approve something that turns out to be beneficial) are not. These bureaucrats seem to be putting their own interests ahead of those of the people they are supposed to protect.

This is a well-known incentive problem: unless we evaluate bureaucrats on expected value, not on whether they commit Type I errors, they will respond accordingly.

Thursday, September 23, 2021

Big data and the curse of dimensionality

I just finished a fabulous book, Everybody Lies, written by Seth Stephens-Davidowitz.  From the Amazon description of the book:
Everybody Lies offers fascinating, surprising, and sometimes laugh-out-loud insights into everything from economics to ethics to sports to race to sex, gender and more, all drawn from the world of big data. What percentage of white voters didn’t vote for Barack Obama because he’s black? Does where you go to school affect how successful you are in life? Do parents secretly favor boy children over girls? Do violent films affect the crime rate? Can you beat the stock market? How regularly do we lie about our sex lives and who’s more self-conscious about sex, men or women?

I particularly liked the metaphors that Stephens-Davidowitz uses to describe his results.  For example,  in describing why it is easy to come up with variables that correlate with the stock market, but hard to find ones that can make accurate predictions, he uses the metaphor of coin flipping:

Suppose your strategy for predicting the stock market is to find a lucky coin -- but one that will be found through careful testing. Here's your methodology: You label one thousand coins - 1 to 1,000. Every morning, for two years, you flip each coin, record whether it came up heads or tails, and then note whether the Standard & Poor's Index went up or down that day. You pore through all your data. And voila! You've found something. It turns out that 70.3 percent of the time when Coin 391 came up heads the S&P Index rose. The relationship is statistically significant! Highly so! You have found your lucky coin! 
Just flip Coin 391 every morning and buy stocks whenever it comes up heads. Your days of Target T-shirts and ramen noodle dinners are over. Coin 391 is your ticket to the good life!

Every statistics user should know that when running 1,000 hypothesis tests at the 5% significance level, on average 50 of them will appear statistically significant even when there is no relationship.  That 5% is the size (the probability of a Type I error) of a classical hypothesis test.
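A quick simulation sketch of the coin story (the coin and day counts mirror the quote; modeling the "market" as just another coin flip is an assumption for illustration): every series is pure noise, yet roughly 5% of the coins pass a 5% significance test.

```python
import numpy as np

rng = np.random.default_rng(42)
n_coins, n_days = 1_000, 500  # roughly two years of trading days

coins = rng.integers(0, 2, size=(n_coins, n_days))  # heads = 1
market = rng.integers(0, 2, size=n_days)            # S&P up = 1

# Fraction of days each coin "agreed" with the market's direction.
agreement = (coins == market).mean(axis=1)

# Under the null, agreement is Binomial(n_days, 0.5) / n_days, so a
# two-sided 5% test rejects when agreement is ~1.96 s.e. from 0.5.
se = 0.5 / np.sqrt(n_days)
lucky = np.abs(agreement - 0.5) > 1.96 * se
print(f"'Lucky' coins: {lucky.sum()} of {n_coins} (about 50 expected)")
```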

Instead, split your sample in two: use half the data to "find" (estimate) your lucky coin, and the other half to test it.
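Continuing the sketch above, holding out half the sample shows the luck evaporating:

```python
# Use the first half of the data to "find" the luckiest coin...
half = n_days // 2
train = (coins[:, :half] == market[:half]).mean(axis=1)
best = int(np.argmax(train))

# ...and the held-out second half to test it: agreement falls back to ~50%.
test = (coins[best, half:] == market[half:]).mean()
print(f"Coin {best}: in-sample {train[best]:.1%}, out-of-sample {test:.1%}")
```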

BOTTOM LINE:  the more tests you run, the more likely it is that at least one of them will show a statistically significant relationship, even if there is none.  This is likely behind what has become known as the replication crisis, which has hit the field of psychology particularly hard: only one third of the results from its most cited articles could be replicated.  Academics test lots of hypotheses and publish the few that turn out to be statistically significant.  This is analogous to finding a lucky coin that only appears to be lucky: once you test it out of sample, the luck disappears.

TRUTH IN BLOGGING:  the field of economics has its own replication crisis; only two-thirds of its top results could be replicated.

Thursday, November 1, 2018

Taylor Swift does Revenue Management

When you price to fill a venue with a fixed capacity, there are two mistakes you can make:

  • Type I error: You can price too low, and have excess demand
  • Type II error: You can price too high, and have empty seats

An optimal strategy would choose a price that sets expected demand equal to capacity, but "shaded" high or low depending on the relative sizes of the expected costs of overpricing and underpricing.  In other words, if the expected costs of underpricing are bigger than the expected costs of overpricing, then price a little higher than the target price at which expected demand equals capacity.
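Here is a minimal numerical sketch of that shading logic (the linear demand curve, the normal demand shock, and both unit costs are invented for illustration): search for the price that minimizes the expected costs of the two errors, and compare it to the target price where expected demand equals capacity.

```python
import numpy as np

rng = np.random.default_rng(1)
CAPACITY = 10_000   # seats
C_UNDER = 50.0      # cost per turned-away fan (underpricing, Type I)
C_OVER = 30.0       # cost per empty seat (overpricing, Type II)

shocks = rng.normal(0, 2_000, 100_000)  # demand uncertainty

def demand(price):
    """Illustrative linear demand with an additive shock."""
    return np.maximum(20_000 - 100 * price + shocks, 0)

def expected_cost(price):
    d = demand(price)
    return (C_UNDER * np.maximum(d - CAPACITY, 0)    # excess demand
            + C_OVER * np.maximum(CAPACITY - d, 0)   # empty seats
            ).mean()

prices = np.linspace(80, 120, 401)
best = prices[np.argmin([expected_cost(p) for p in prices])]
target = (20_000 - CAPACITY) / 100  # expected demand = capacity at $100
print(f"target price: ${target:.2f}, optimal shaded price: ${best:.2f}")
```

Because the cost of excess demand exceeds the cost of empty seats in this sketch, the optimal price shades above the $100 target, just as the paragraph above describes.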

Reducing uncertainty means that you can price more accurately to match demand to capacity (you shade less).  Some middling economists have written on how hotel mergers reduce uncertainty and allow the merged hotels to price more accurately.  With fewer overpricing errors, occupancy goes up.
Kalnins, Arturs and Froeb, Luke M. and Tschantz, Steven T., Mergers Increase Output When Firms Compete by Managing Revenue. Vanderbilt Law and Economics Research Paper No. 10-27. Available at SSRN: https://ssrn.com/abstract=1670278 or http://dx.doi.org/10.2139/ssrn.1670278


If you went on Ticketmaster in January and pulled up a third-row seat for Taylor Swift’s June 2nd show at Chicago’s Soldier Field, it would have cost you $995. But if you looked up the same seat three months later, the price would have been $595. That’s because Swift has adopted “dynamic pricing,” where concert tickets – like airline seats – shift prices constantly in adjusting to market demand. It’s a move intended to squeeze out the secondary-ticket market – but it’s also left many fans confused as they’re asked to pay hundreds of dollars more than face value. “Basically, Ticketmaster is operating as StubHub,” says one concert-business source.
The problem, of course, is that with dynamic pricing, concertgoers have an incentive to "game" the system by waiting until the last minute to buy seats, hoping that prices fall, as they did in the example above.