Open Access Response to HEFCE
HEFCE is currently asking for feedback on the role of Open Access in the next REF. While I have a number of technical suggestions, I think that the biggest and best contribution that HEFCE could make to the next REF is to state publicly that all journal/conference/venue metadata be removed from papers before they are sent for review.
It is time that we stopped judging books by their cover. It would be a fantastic contribution if HEFCE could take a lead on this. This is my full response.
Expectations for Open Access
I feel that one key issue is missing from this document. Scientists still have problems in some areas (including mine of computing science) in that the “high-impact” journals or conferences often provide no open access options, or only prohibitively expensive ones. In the past, I have refused to publish in these journals because I wish my work to remain open access, and have instead published elsewhere. However, this works directly against my own interests in the current REF, as the research will be judged less good. The use of journals as a primary indicator of quality also works against my ability to choose cheaper venues. Few people believe statements that research will not be judged on publication venue; indeed, as an individual academic, I have even been told to directly comment on the venue in my return.
One simple and yet enormous contribution that procedures for the next REF could make to Open Access is not to coerce, but to remove this enormous barrier. This could happen simply and straightforwardly by removing all journal and publication venue metadata from papers when presented to reviewers. Of course, reviewers could work around this (the data is a Google search away), but the message sent by such a step would be enormous.
The general expectations for OA publishing seem reasonable. However, I would add a further specific requirement. Currently, it is very hard to find the location of a green OA copy of any article. Making articles available is not enough; they must be discoverable. Therefore, I would suggest a specific requirement that a primary identifier (DOI, ISSN, ISBN or URL) must be present in the institutional repository, and that this must be visible on the web page and present in computational metadata. Finally, making the paper discoverable is also not enough. There must be computational and human-readable metadata making clear that the contents of the paper are Open Access; without this form of explicit statement, the only safe course of action for readers is to assume the copyright default position that you cannot use the material.
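To make the suggested requirement a little more concrete, here is a minimal sketch of the kind of automated check a repository could run over its own records. The record layout, the field names (`identifiers`, `licence_url`) and the function name are all assumptions made for illustration; they do not correspond to any particular repository's schema.

```python
# Hypothetical record layout: every field name here is an illustrative assumption.
PRIMARY_ID_TYPES = ("doi", "issn", "isbn", "url")


def missing_requirements(record):
    """Return a list of problems that make a record hard to find or reuse."""
    problems = []
    identifiers = record.get("identifiers") or {}
    # At least one primary identifier must be present with a non-empty value.
    if not any((identifiers.get(kind) or "").strip() for kind in PRIMARY_ID_TYPES):
        problems.append("no primary identifier")
    # Without an explicit licence statement, readers must assume default copyright.
    if not (record.get("licence_url") or "").strip():
        problems.append("no explicit licence statement")
    return problems
```

An empty list means the record carries both pieces of information; anything else names what is missing. A real check would also need to confirm that the same information is visible on the human-readable web page, which this sketch does not attempt.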
Institutional Repositories
Despite the significant investment, our experience is that few people ever retrieve data from institutional repositories. Partly, this is because it is difficult to link between articles on a journal website and articles in institutional repositories. As a second problem, institutional repositories provide an inconsistent experience, both for computational and human access. For instance, the presentation of identifiers such as DOIs is inconsistent. Even when present, DOIs are often inaccurate, containing syntactic errors that prevent their use.
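Many of the syntactic problems with DOIs could be caught mechanically at deposit time. The sketch below is only an illustration: the pattern is a deliberately loose approximation of DOI syntax rather than the full specification, and the function name and the list of prefixes to strip are my own assumptions.

```python
import re

# Loose approximation for illustration, not the full DOI syntax:
# "10.", a numeric registrant code, "/", then a non-empty suffix with no whitespace.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")

# Common ways a DOI is wrapped when pasted into a repository form (assumed list).
PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def normalise_doi(raw):
    """Return a bare DOI string, or None if the value does not look like a DOI."""
    value = raw.strip()
    lowered = value.lower()
    for prefix in PREFIXES:
        if lowered.startswith(prefix):
            # Drop the wrapper and any whitespace left behind it.
            value = value[len(prefix):].strip()
            break
    return value if DOI_PATTERN.match(value) else None
```

A value that comes back as `None` would be flagged for the depositor to correct, rather than stored and displayed in a form that cannot be resolved. A syntactically valid DOI may still point to the wrong article, so a check like this catches only one class of error.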
Ultimately, institutional repositories would be much better if there were a single infrastructure maintained at a national (or international) level. In fact, a strong exemplar for this already exists in the form of arXiv. The ability to update the repository could be devolved to individual institutions. An authentication framework for this is already in place through JE-S.
Linking between institutional repositories and subject repositories is unfortunately likely to be difficult from a social perspective; there are many subject repositories, and the institutional repositories are not likely to link to them well, because their maintainers are not experts in these repositories. This might be more plausible in a single national repository.
The better solution is to enable authors of papers to perform this linking. Scientists who actually care about the links working and pointing to the correct place are best placed to do this. This could be supported in the REF by making linking to data, software or other subject repositories an explicit criterion; this happens in some disciplines (for example, in bioinformatics a clear statement of if and where software is available, and under what conditions, is often asked for by reviewers).
Approach to Exceptions
If exceptions are to be for a transitional period, then any exceptions given should be marked with a “sell-by” date, after which they should no longer be considered valid.
It is worth reiterating that embargoes really only benefit the publishers; ensuring that the REF framework allows academics to choose their publication venue more freely, rather than effectively requiring them to publish in selected “high-impact” venues, would enable them to choose venues with short or no embargo periods. The most effective mechanism for achieving this would be to remove all publication venue information from future REF returns. The research would be judged on the basis of the research, and not the publication venue.
Open Data
There is more complexity behind the requirement for open data than for open access, particularly where the data needs to remain confidential for reasons of data protection. Having said all of this, there are many disciplines (again, bioinformatics is an obvious example) where the majority of data is open. Making a decision now to rule this out of scope, for a REF which may be a significant distance in the future, seems premature.