Alessandra, Eric: You have been chairing the ISSTA 2016 Artifact Evaluation Committee. What exactly do you do?

ISSTA 2016 allows authors of accepted research papers to submit artifacts. An artifact can be any kind of content related to a paper, e.g., detailed experimental data, a complete experimental setup, test suites, or tools. Submitting an artifact is optional but highly encouraged. Our job is to assess whether such artifacts support the results stated in the paper, and whether they help others reproduce and/or extend the work.

What happens with my artifact after I submit it?

Artifacts are reviewed by at least two members of a dedicated Artifact Evaluation Committee with respect to the following criteria (where applicable):

  • How easy is it to use the provided artifact? (Easy to reuse)
  • Does the artifact allow one to reproduce the results stated in the paper? (Consistent)
  • What is the fraction of the results that can be reproduced? (Complete)
  • Does the artifact describe and demonstrate how to apply the presented method to a new input? (Well documented)

If all reviewers agree on the value of an artifact, the respective paper receives the Artifact Evaluation Award.

Does having an artifact help a paper get accepted?

For the first time, ISSTA also allowed authors of any submitted paper to strengthen the case for their paper by submitting an artifact ahead of time, i.e., before the PC meeting takes place. If reviewers did not find the results convincing enough, they could ask for additional support as part of the authors' response. Such support could come as an artifact, which would then receive an extra evaluation from the committee, whose report would provide additional evidence that the results are as stated.

Does it hurt a paper if there is no artifact? Or if the artifact does not work?

At this point, we want to encourage authors to submit artifacts, so submitting one cannot penalize the paper in any way. Authors of course get feedback from the evaluation committee, which should help them make improvements.

For me as an author, what’s the benefit in having my artifact evaluated?

Your paper gets extra recognition. In the printed proceedings, your paper will appear with a special mark noting that your artifact has been evaluated successfully. This tells readers that they can put additional trust in your results, and that your artifact may be useful to them. Both of these increase the impact of your work. Also, at the conference, we plan to set aside some extra time to promote papers with great artifacts.

For me as a reader, what’s the benefit in seeing an artifact evaluated?

If the artifact has been evaluated successfully, you can put additional trust in the authors' results, and the chance that the artifact will be useful to you is higher. Artifacts are usually available as open source and can be built upon, which greatly aids the reproduction of results and their extension by others.

What’s it like to evaluate an artifact, compared to reviewing a paper?

The work is in fact very different and requires significantly more time. When evaluating an artifact, one must not only understand the authors' paper and experiments, but also their tools and data sets, at the very least from a user's point of view. Due to a frequent lack of proper documentation, this can be challenging; one aim of artifact evaluation is therefore to push authors toward documenting their artifacts well. The first goal of an evaluation is to run any provided tools on the provided data sets and reproduce the paper's results. A second goal is to test the tooling on novel, unanticipated inputs to judge its stability, which is important for others who might want to use the tool in their own research.

Any experiences you’d like to share?

One of the things that caused us trouble was the lack of a proper online reviewing system for this task. We required a setup where paper reviewers were doing double-blind reviews but needed to be able to see the results of the artifact evaluation, whereas artifact reviewers were performing single-blind reviews (i.e., authors were known to those reviewers) and required access to the paper reviews. No current conference management system supports such a scenario. We ended up using two separate instances of EasyChair, which, however, resulted in a lot of manual labor for everyone involved. If we want to continue this scheme, this should be improved in the upcoming years.

In the long run, should every paper come with an artifact that allows assessment of results?

It seems like we as a community need to decide how far we wish to take artifact evaluation. In other communities, for instance the natural sciences, a research result only counts as a result once it has been reproduced by multiple parties. We believe that Computer Science should follow this path, but this requires that artifacts be made available. Some of us thus argue that artifact evaluation should be mandatory, and that an insufficiently stable artifact should prevent the respective paper from being published, as its results cannot be relied upon. For ISSTA'16, the PC decided not to go down this road just yet. To encourage authors to continue submitting artifacts in the years to come, it decided to instead have artifact evaluation only positively influence paper acceptance. Both of us think that this should change in upcoming iterations.

Thanks a lot for your work!
