Secret Data Encore

My post "Secret data" on replication provoked a lot of comment and emails, more reflection, and some additional links.

This isn't about rules

Many of my correspondents missed my main point -- I am not advocating more and tighter rules by journals! This is not about what you are "allowed to do," how to "get published" and so forth.

In fact, this extra rumination points me even more strongly to the view that rules and censorship by themselves will not work. How to make research transparent, replicable, extendable, and so forth varies by the kind of work and the kind of data, and is subject like everything else to creativity and technical improvement. Most of all, it will not work if nobody cares; if nobody takes the kind of actions in the bullet points of my last post, and it's just an issue about rules at journals. Already (more below), rules are not that well followed.

This isn't just about "replication."

"Replication" is much too narrow a word. Yes, many papers have not documented transparently what they actually did, so that even armed with the data it's hard to produce the same numbers. Other papers are based on secret data, the problem with which I started.

But in the end, most important results are not simply due to outright errors in data or coding. (I hope!)

The important issue is whether small changes in instruments, controls, data sample, measurement error handling, and so forth produce different results, whether results hold out of sample, or whether collecting or recoding data produces the same conclusions. "Robustness" is a better overall descriptor for the problem that many of us suspect pervades empirical economic research.

You need replicability in order to evaluate robustness -- if you get a different result than the original authors', it's essential to be able to track down how the original authors got their result. But the real issue is that much larger one.

The excellent replication wiki (many good links) quotes Daniel Hamermesh on this difference between "narrow" and "wide" replication:
"Narrow, or pure, replication means first checking the submitted data against the primary sources (when applicable) for consistency and accuracy. Second the tables and charts are replicated using the procedures described in the empirical article. The aim is to confirm the accuracy of published results given the data and analytical procedures that the authors write to have used.
Replication in a wide sense is to consider the empirical finding of the original paper by using either new data from other time periods or regions, or by using new methods, e.g., other specifications. Studies with major extensions, new data or new empirical methods are often called reproductions."
But the more important robustness question is more controversial. The original authors can complain they don't like the replicator's choice of instruments, or procedures. So "replication," which sounds straightforward, quickly turns into controversies.

Michael Clemens writes about the issue in a blog post here, noting
"...Again and again, the original authors have protested that the critique of their work got different results by construction, not because anything was objectively wrong about the original work. (See Berkeley’s Ted Miguel et al. ..."

... secret data post would be read as criticism of people who do large-data work, proprietary-data work, or work with government agencies that cannot currently be shared. The internet is pretty snarky, so it's worth stating explicitly that this is not my intent or my view.

Quite the opposite. I am a huge fan of the pioneering work exploiting new data sets. If these pioneers had not found dramatic results and possibilities with new data, it would not matter whether we can replicate, check or extend those results.

It is only now that the pioneers have shown the way, and that we know how important the work can be, that it becomes vital to rethink how we do this kind of work going forward.

The special problems of confidential government data

The government has a lot of great data -- IRS and census for microeconomics; SEC, CFTC, Fed, and the financial product safety commission in finance. And there are obvious reasons why so far it has not been easily shared.

Journal policies allow exceptions for such data. So only a fundamental demand from the rest of us for transparency can bring about changes. And it has begun to do so.

In addition to the suggestions in the last post, more and more people are going through the vetting to use the data. That leaves open the possibility that a full replication machine could be stored on site, ready for a replicator with proper access to push a button. Commercial data vendors could allow similar "free" replication, controlling directly how replicators use the data.

Technological solutions are on the way too. "Differential privacy" is an example of a technology that allows results to be replicated without compromising the privacy of the data. Leapyear.io is an example of companies selling this kind of technology. We are not alone, as there is a strong commercial demand for this kind of data. (Medical data for example.)
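To make the idea a little more concrete, here is a minimal sketch of one textbook differential-privacy device, the Laplace mechanism, applied to a mean. It is an illustration only, not a description of how Leapyear.io or any government agency actually works. The function names, the clipping bounds, and the toy income numbers are all made up for the example, and the sensitivity formula assumes the number of records is public and that neighbouring data sets differ in one record's value.

```python
import random

def laplace_noise(scale, rng):
    """Draw one Laplace(0, scale) sample as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def private_mean(values, lower, upper, epsilon, rng):
    """Return a noisy mean of `values`, each clipped to [lower, upper].

    Assumption: the record count n is public and neighbouring data sets
    differ in one record's value, so the clipped mean can move by at most
    (upper - lower) / n when one record changes.
    """
    n = len(values)
    clipped = [min(max(v, lower), upper) for v in values]
    sensitivity = (upper - lower) / n
    # Laplace noise with scale sensitivity / epsilon gives epsilon-DP
    # under the assumptions stated above.
    return sum(clipped) / n + laplace_noise(sensitivity / epsilon, rng)

# Made-up incomes, purely for illustration.
incomes = [31_000, 48_500, 52_000, 75_250, 120_000]
rng = random.Random(0)
print(private_mean(incomes, lower=0, upper=150_000, epsilon=1.0, rng=rng))
```

The noise shrinks as the number of records grows and grows as epsilon falls, so a replicator querying a large confidential data set can get close to the true statistic while no single record is revealed.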

Other institutions: Journals, replication journals, websites

There is some debate whether checking "replication" should count as new research, and I argued that if we want replication we need to value it. The larger robustness question certainly is "new" research. "X's result does not hold out of sample, is sensitive to the precise choice of instruments and controls, and so forth" is genuine, publishable, follow-on research.

I originally opined that replications should be published by the original journal to give the best incentives. That way an AER replication "counts" as an AER publication.

But with the idea that robustness is the wider issue, I am less inclined to this view. This broader robustness or reexamination is genuine new research, and there is a continuum between replication and the normal business of examining the basic idea of a model with new data and also some new methods. Each paper on the permanent income hypothesis is not a "replication" of Friedman! We don't want to only value as "new" research that which uses new methods -- then we become dry methodologists, not fact-oriented economists. And once a paper goes beyond pointing out simple mistakes, to questioning specification, a question which itself can be rebutted, it's beyond the responsibility of the original journal.

Ivo Welch argues that a third of each journal should be devoted to replication and critique. The Critical Finance Review, which he edits, asks for replication papers. The Journal of Applied Econometrics has a replication section, and now invites replications of papers in many other journals. Where journals fear to tread, other institutions step in. The replication network is one interesting new resource.

Faculties

A correspondent suggests an important additional bullet point for the "what can we do" list

  • Encourage your faculty to adopt a replicability policy as part of its standards of conduct, and as part of its standards for internal and external promotions.

The precise wording of such standards should be fairly loose. The important thing is to send a message. Faculty are expected to make their research transparent and replicable, to provide data and programs, even when journals do not require it. Faculty up for promotion should expect that the committee reviewing them will look to see if they are behaving reasonably. Failure will likely lead to a little chat from your department chair or dean. And the policy should state that replication and robustness work is valued.

Another correspondent wrote that he/she advises junior faculty not to post programs and data, so that they do not become a "target" for replicators. To say we disagree on this is an understatement. A clear voice on this issue is an excellent result of crafting a written policy.

From Michael Kiley's excellent comment below

  • Assign replication exercises to your students. Assign robustness checks to your more advanced students. Advanced undergraduate and PhD students are a natural reservoir of replicators. Seeing the nuts and bolts of how good, transparent, replicable work is done will benefit them. Seeing that not everything published is replicable or right might benefit them even more.

Two good surveys of replications (as well as journals)

Maren Duvendack, Richard Palmer-Jones, and Bob Reed have an excellent survey article, "Replications in Economics: A Progress Report"
"...a survey of replication policies at all 333 economics journals listed in Web of Science. Further, we analyse a collection of 162 replication studies published in peer-reviewed economics journals."
The latter is especially good, starting at p. 175. You can see here that "replication" goes beyond just can-we-get-the-author's-numbers, and maddeningly often does not even ask that question:
"a little less than two-thirds of all published replication studies attempt to exactly reproduce the original findings....A frequent reason for not attempting to exactly reproduce an original study’s findings is that a replicator attempts to confirm an original study’s findings by using a different data set"
"Robustness" not "replication."
"Original Results?, tells whether the replication study re-reports the original results in a way that facilitates comparison with the original study. A large share of replication studies do not offer easy comparisons, perhaps because of limited journal space. Sometimes the lack of direct comparison is more than a minor inconvenience, as when a replication study refers to results from an original study without identifying the table or regression number from which the results come."
Replicators need to be replicable and transparent too!
"Across all categories of journals and studies, 127 of 162 (78%) replication studies disconfirm a major finding from the original study."
But rather than just the usual alarmist headline, they have a good insight. Replication studies can suffer the same significance bias as original work:
"Interpretation of this number is difficult. One cannot assume that the studies treated to replication are a random sample. Also, researchers who confirm the results of original studies may face difficulty in getting their results published since they have nothing ‘new’ to report. On the other hand, journal editors are loath to offend influential researchers or editors at other journals. The Journal of Economic & Social Measurement and Econ Journal Watch have sometimes allowed replicating authors to report on their (prior) difficulties in getting disconfirming results published. Such firsthand accounts detail the reticence of some journal editors to publish disconfirming replication studies (see, e.g., Davis 2007; Jong-A-Pin and de Haan 2008, 57)."
Summarizing:
"... about 80 percent of replication studies have found major flaws in the original research"
Sven Vlaeminck and Lisa-Kristin Herrmann surveyed journals and report that many journals with data policies are not enforcing them.
"The results we obtained suggest that data availability and replicable research are not among the top priorities of many of the journals surveyed. For instance, we found 10 journals (i.e. 20.4% of all journals with such policies) where not a single article was equipped with the underlying research data. But even beyond these journals, many editorial offices do not really enforce data availability: There was only a single journal (American Economic Journal: Applied Economics) which has data and code available for every article in the four issues."
Again, this observation reinforces my point that rules will not substitute for people caring about it. (They also discuss technological aspects of replication, and the impermanence and obscurity of zip files posted on journal websites.)

Numerical Analysis

Ken Judd wrote to me,
"Your advocacy of authors giving away their code is not the rule in numerical analysis. I point to the “market test”: the numerical analysis community has done an excellent job in advancing computational methods despite the lack of any requirement to share the code....
Would you require Tom Doan to give out the code for RATS? If not, then why do you advocate journals forcing me to freely distribute my code?...
The issue is not replication, which just means that my code gives the same answer on your computer as it does on mine. The issue is verification, which is the use of tests to verify the accuracy of the answers. That I am willing to provide."
Ken is, I think, reading more "rule and censorship" rather than "social norms" in my views. And I think it reinforces my preference for the latter over the former. Among other things, rules designed for one kind of work (extensive statistical analysis of large data sets) are poorly adapted to other situations (extensive numerical analysis).

Rules can be taken to extremes. Nobody is talking about "requiring" package customers to distribute the (proprietary) package source code. We all understand that step is not needed.

For heavy numerical analysis papers, using author-designed software that the author wants to market, the verification suggestion seems a sensible social norm to me. If I'm refereeing a paper with a heavy numerical component, I would be happy to see the extensive verification, and happier still if I could use the program on a few test cases of my own. Seeing the source code would not be necessary or even that useful. Perhaps in extremis, if a verification failed, I would want the right to contact the author and understand why his/her code produces a different result.
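The distinction between replication and verification can be sketched in a few lines. What follows is a hypothetical illustration, not anything Ken Judd proposed in detail: `verify` treats the author's program as a black box and checks it against cases whose answers are known independently, and `black_box_sqrt` is an invented stand-in for proprietary code whose source the referee never reads.

```python
import math

def verify(solver, test_cases, tol=1e-8):
    """Run a black-box solver on cases with known answers; return the failures.

    Each test case is (inputs, expected). The tolerance is relative for
    large answers and absolute for answers smaller than one.
    """
    failures = []
    for inputs, expected in test_cases:
        got = solver(*inputs)
        if abs(got - expected) > tol * max(1.0, abs(expected)):
            failures.append((inputs, expected, got))
    return failures

def black_box_sqrt(a):
    """Hypothetical stand-in for an author's routine (Newton's method)."""
    x = a if a > 1 else 1.0
    for _ in range(60):
        x = 0.5 * (x + a / x)
    return x

# Test cases chosen by the referee, with answers known from another source.
test_cases = [((4.0,), 2.0), ((2.0,), math.sqrt(2.0)), ((1e6,), 1000.0)]
print(verify(black_box_sqrt, test_cases))
```

An empty list of failures is evidence of accuracy on the referee's own cases. A non-empty list is the "in extremis" situation in which one would want to contact the author.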

Some other examples of "replication" (really robustness) controversies:

Andrew Gelman covers a replication controversy, in which Douglas Campbell and Ju Hyun Pyun dissect Enrico Spolaore and Romain Wacziarg's "The Diffusion of Development" in the QJE. There is no charge that the computer programs were wrong, or that one cannot produce the published numbers. The controversy is entirely over specification, that the result is sensitive to specification and controls.

Yakov Amihud and Stoyan Stoyanov, in "Do Staggered Boards Harm Shareholders?", reexamine Alma Cohen and Charles Wang's Journal of Financial Economics paper. They come to the opposite conclusion, but could only reexamine the issue because Cohen and Wang shared their data. Again, the issues, as far as I can tell, are not a charge that programs or data are wrong.

Update: Yakov corrects me:

  1. We do not come to "the opposite conclusion". We just cannot reject the null that staggered board is harmless to firm value, using Cohen-Wang's experiment.
  2. Our result is also obtained using the publicly-available ISS database (formerly RiskMetrics).
  3. Why is the difference between the results? We used CRSP data and did not include a few delisted (penny) stocks that are in Cohen-Wang's sample. Our paper states which stocks were omitted and why. We are re-writing the paper now with more detailed analysis.

I think the point remains clear: replication slides into robustness, which is more important and more contentious.

Asset pricing is especially vulnerable to results that do not hold out of sample, in particular the ability to forecast returns. Campbell Harvey has a number of good papers on this topic. Here, the issue is again not that the numbers are wrong, but that many good in-sample return-forecasting tricks stop working out of sample. To know, you have to have the data.
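A minimal sketch of what "stops working out of sample" means in practice: fit a forecasting regression on the first part of the sample, then compare its forecast errors in the later part with those of a naive benchmark, the historical mean return. This is one common single-split convention, not the procedure of any particular paper. The function names and the toy numbers are invented, and the predictor `x` is assumed to be already lagged so that `x[t]` is known before `y[t]` is realized.

```python
def ols_fit(x, y):
    """Return (intercept, slope) of a one-regressor least-squares fit."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def r_squared(x, y, intercept, slope, benchmark_mean):
    """1 - SSE(model) / SSE(benchmark), the benchmark being a constant forecast."""
    sse_model = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    sse_bench = sum((yi - benchmark_mean) ** 2 for yi in y)
    return 1.0 - sse_model / sse_bench

def in_and_out_of_sample_r2(x, y, split):
    """Estimate on observations before `split`; evaluate on both subsamples."""
    x_in, y_in, x_out, y_out = x[:split], y[:split], x[split:], y[split:]
    intercept, slope = ols_fit(x_in, y_in)
    mean_in = sum(y_in) / len(y_in)  # historical-mean benchmark
    return (r_squared(x_in, y_in, intercept, slope, mean_in),
            r_squared(x_out, y_out, intercept, slope, mean_in))

# Toy numbers, purely illustrative: a trending predictor and returns
# that in truth have nothing to do with it.
predictor = [0, 1, 2, 3, 4, 5, 6, 7]
returns = [0, 1, 0, 1, 0, 1, 0, 1]
print(in_and_out_of_sample_r2(predictor, returns, split=4))
```

In this toy case the in-sample R² is about 0.2 while the out-of-sample R² is about -2.36: the fitted rule does worse than simply forecasting the historical mean. A negative out-of-sample number of this kind can only be discovered by someone who has the data.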
