Re-interpretation of ‘influence weight’ as a citation-based Index of New Knowledge (INK)

A method proposed in 1976 by F. Narin and collaborators for assessing the ‘influence weight’ of journals is re-interpreted as a potential citation-based indicator of the impact of scientific and other publications which would allow for comparisons between publication types (e.g. regular articles, notes, books) and disciplines, whose practitioners typically exhibit radically different citation behaviors. This Index of New Knowledge (INK) is a dimensionless fraction, with the number of references in the paper to be evaluated + 1 as the denominator, and the number of citations accumulated during a certain period after publication as the numerator. Application examples are provided, covering different disciplines, and the pros and cons of INK are discussed and contrasted with the widely-used journal impact factor.


INTRODUCTION
Several indicators of productivity or quality of scientific publication have been proposed, which correlate with different aspects of productivity and quality. The first of these indicators, overall citation counts ('citedness'), will remain with us as an index of overall scientific impact (Garfield 1977-1993), while second-order indicators can be used to explore finer-grained aspects of this impact. One example of such a second-order indicator is the recently proposed h-index (Hirsch 2005), which is rapidly ascending in the esteem of practitioners (Ball 2007) despite some shortcomings (Bornmann & Daniel 2007). Other examples are listed at www.harzing.com, where the emphasis is on indicators that are computed from the output of Google Scholar (GS), as opposed to those of the Web of Science (see also Pauly & Stergiou 2005).
A well-known second-order indicator is the journal impact factor (JIF), 'a measure of the degree to which papers in a particular journal are cited in the literature' (Garfield 1990, p. 141), or more precisely, for a given journal and year, 'the average number of times a paper published in the previous two years was cited during the year in question. For example, the 2006 JIF is the average number of times a paper published in 2004 or 2005 was cited in 2006' (Rossner et al. 2007, p. 1091). The JIF is convenient for university administrators and science managers because it allows what seems to be objective quantification of research output. The JIF, on the other hand, has been heavily criticized by scientists, especially in universities, because their promotions depend not only on the number (and eventual citedness) of their papers, but also on the JIF of the journals in which they are published (see e.g. Kokko & Sutherland 1999). The following weaknesses of the JIF are often mentioned in this context: (1) There are major differences in citation styles between articles in different fields (e.g. articles in ecology cite a lot of early papers, but articles in mathematics do not). This results in large differences in the mean number of citations that papers (and hence also journals) can get, irrespective of their impact (or standing) in their respective fields (Kokko & Sutherland 1999); (2) Because the JIF is computed as a fraction, with the number of articles in the denominator, what is counted as 'articles' has become an issue. Thus, in a journal, news items and notes are not counted as articles, even though they may be cited and end up boosting the numerator of the JIF. Also, citable items may be miscategorized, and thus raise fairness or bias issues; (3) Counted articles may differ widely in their purpose and hence in their style. They may range from short, specialized articles, which typically get zero or few citations, to comprehensive reviews, which typically get more citations, if only because by including a lot of references they allow subsequent author(s) to dispense with extensive reference lists (Garfield 1989). Thus, journals that publish mainly or only reviews will tend to have a high JIF, even though they may not be 'better' than other journals. Indeed, 'top 10' lists, when established using a number of indicators, generally do not include journals that publish only reviews.
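The JIF definition quoted above, together with weakness (2), can be illustrated with a minimal sketch. The function name, the data layout and the toy journal below are hypothetical and serve only as an illustration; they are not the procedure used by any citation database. Citations to every recent item feed the numerator, while only items classified as articles are counted in the denominator.

```python
def journal_impact_factor(items, citations, year):
    """Toy JIF for one journal and one year (hypothetical data layout).

    items:     dict mapping item id -> (publication year, item type)
    citations: iterable of (cited item id, year in which the citation was made)
    """
    window = (year - 2, year - 1)
    # Everything the journal published in the two preceding years.
    recent = {item_id: kind
              for item_id, (pub_year, kind) in items.items()
              if pub_year in window}
    # Denominator: only items classified as articles are counted.
    n_articles = sum(1 for kind in recent.values() if kind == "article")
    if n_articles == 0:
        raise ValueError("no counted articles in the two preceding years")
    # Numerator: citations made in `year` to any recent item, notes included.
    n_citations = sum(1 for cited_id, citing_year in citations
                      if citing_year == year and cited_id in recent)
    return n_citations / n_articles


# Invented example: two articles and one note published in 2004-2005.
items = {
    "a1": (2004, "article"),
    "a2": (2005, "article"),
    "n1": (2005, "note"),
    "old": (2001, "article"),   # outside the two-year window
}
citations = [
    ("a1", 2006), ("a1", 2006), ("a2", 2006),
    ("n1", 2006), ("n1", 2006),   # citations to the note still count
    ("a1", 2005), ("old", 2006),  # wrong year / outside the window
]
jif_2006 = journal_impact_factor(items, citations, 2006)
```

In this invented example the two citations to the note raise the numerator from 3 to 5 while the denominator stays at 2, which is the asymmetry described under weakness (2).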

INDEX OF NEW KNOWLEDGE (INK)
The 3 criticisms of the JIF presented above are largely due to the unstated assumption that articles are the proper 'units' of scientometrics. There is, however, an alternative: we could instead concentrate on the references that articles contain. This would be analogous to evolutionary biology, which has long concentrated on organisms and their populations, although it is actually the genes they contain which are selected for or against, and which are thus the proper 'units' of evolution (Dawkins 2006). Narin et al. (1976, p. 31) proposed to estimate the 'influence weight' of a journal from the ratio: 'Total number of citations to the journal from other journals/ Total number of references from the journal to other journals'. These authors, however, did not extend the application of similar ratios to individual papers. Similarly, the INK is based entirely on 2 types of knowledge 'units': (1) The ('old') references which are listed in a paper, plus one. This assumes that a paper, by its very existence, cites itself, and more importantly, guarantees that all papers will generate a non-zero denominator for the fraction presented below; (2) The ('new') citations which the paper garners over a certain time period (discussed below). Thus, the new index is defined as:
$$\mathrm{INK} = \frac{\text{citations}}{\text{references} + 1}$$
where 'references' are explicit referential items in the bibliography, or in the footnotes of a document to be assessed. This definition implies that the value of the INK can range from 0 (no new knowledge is generated, and thus the ink spent is more valuable than the final product) to many thousands, e.g. the famous article by Lowry et al. (1951), which has over 300 000 citations for 25 references (i.e. INK = 11 538, INK yr⁻¹ = 202). This definition, it will be noted, allows the computation of INK for any type of publication, whether short note, article, review, book or journal issue, or indeed, non-academic publications and websites. The INK index, when computed for journals, thus addresses the 3 criticisms of the JIF, as follows: (1) The INK index standardizes between fields with different citation styles because in fields in which the articles have fewer references (which lowers the denominator), they will also, as a result, have fewer citations (lower numerators), other things being equal (e.g. self-citation behavior). The INK index, moreover, performs this without prior definition of what the field in question covers (as required for any between-field standardization of the JIF; see Kokko & Sutherland 1999); (2) The INK, like the 'influence weights' from which it is derived, is computed as a ratio whose denominator and numerator are both knowledge 'units', i.e. references and citations. This makes it dimensionless, which is always advantageous for an indicator; (3) The INK index is capable of comparisons between published pieces of various types (e.g. short communications, regular articles or reviews). In principle, this should even allow for comparing across entire domains of scholarship (i.e. the Sciences vs. the Humanities).
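The definition above (citations divided by references + 1) can be written as a short sketch. The function names are hypothetical, and the 57-year span used for the Lowry et al. (1951) example is an assumption (2008 minus 1951); the text gives only the resulting values.

```python
def ink(citations, references):
    """Index of New Knowledge: citations / (references + 1)."""
    if citations < 0 or references < 0:
        raise ValueError("counts must be non-negative")
    # The '+ 1' treats the paper as citing itself and keeps the
    # denominator non-zero even for a paper without references.
    return citations / (references + 1)


def ink_per_year(citations, references, years_since_publication):
    """INK divided by the number of years over which citations accrued."""
    if years_since_publication <= 0:
        raise ValueError("years_since_publication must be positive")
    return ink(citations, references) / years_since_publication


# Lowry et al. (1951): over 300 000 citations for 25 references.
lowry_ink = ink(300_000, 25)                    # about 11 538
lowry_ink_yr = ink_per_year(300_000, 25, 57)    # about 202 (57 yr assumed)

# A paper that is never cited has INK = 0, whatever its reference list.
uncited_ink = ink(0, 40)
```

Because only two counts enter the calculation, the same functions apply unchanged to a note, a review, a book or a whole journal issue.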
It is useful to have an idea of the range of average values that the INK is likely to have. The INK of the average cited paper can be estimated as follows. Between 1900 and 2005, about 20 million papers were cited at least once. Based on Garfield (2006), there were roughly 365 million citations to these papers, as estimated by multiplying the median of the citation frequency by the number of papers. Assuming that the average cited paper uses ca. 30 to 50 references, the corresponding INK would range from 0.61 to 0.37. This amounts to 0.012 to 0.007 INK units per year if the average article is 50 yr old. These values are higher than the average for all papers because they do not account for the large fraction of papers that are never cited (INK = 0).
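The arithmetic behind this estimate can be retraced with a short sketch (the function and parameter names are hypothetical). The quoted range of 0.61 to 0.37 is reproduced when the citations per paper are divided by the reference count itself; applying the '+ 1' of the INK definition gives slightly lower values, about 0.59 to 0.36.

```python
def average_ink(total_citations, cited_papers, references, plus_one=True):
    """Rough INK of the 'average cited paper'.

    plus_one=False divides by the reference count alone, which is what
    reproduces the range quoted in the text.
    """
    citations_per_paper = total_citations / cited_papers
    denominator = references + 1 if plus_one else references
    return citations_per_paper / denominator


TOTAL_CITATIONS = 365e6   # roughly 365 million citations
CITED_PAPERS = 20e6       # about 20 million papers cited at least once
AGE_YEARS = 50            # assumed age of the average article

for refs in (30, 50):
    quoted = average_ink(TOTAL_CITATIONS, CITED_PAPERS, refs, plus_one=False)
    strict = average_ink(TOTAL_CITATIONS, CITED_PAPERS, refs, plus_one=True)
    print(f"{refs} references: INK {quoted:.3f} (no +1), {strict:.3f} (+1); "
          f"per year {quoted / AGE_YEARS:.4f} and {strict / AGE_YEARS:.4f}")
```

Either way, the order of magnitude is the same: well below 1 for the INK, and around 0.01 INK units per year.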
Table 1 gives estimated INK yr⁻¹ values for selected disciplines, for different types of publications, drawn mainly from the dataset upon which the contribution of Pauly & Stergiou (2005) was based. Although the published items used to compile Table 1 are far from being comprehensive, they can be used to illustrate some aspects of the proposed new index.
INK yr⁻¹ values are typically <<1. Also, it is apparent that notes, although short, can convey new knowledge, as is evident for chemistry. Thus, such items are bearers of knowledge ('gnosiphores'), and should be considered in scientometrics.
Pauly & Stergiou: Index of New Knowledge
Reviews generally have lower INK yr⁻¹ values than articles, at least in our sample. This suggests that the INK, by accounting for large numbers of references in reviews, succeeds at compensating for the positive bias that they get when crude citation counts are used to characterize them as 'better' than regular papers.
We have no doubt that if the INK were to be widely used, some authors would change their citation behavior, e.g. include in their papers fewer references than they should, which would boost the INK of their papers by reducing its denominator. Such behavior is unavoidable (see Lawrence 2008, this Theme Section). This, however, can be controlled, at least in part, by referees and by editorial guidelines. There remains the issue of journals controlling the number of references per article. We believe that such an editorial policy does not distort INK. This is because authors are asked to refer to all pertinent knowledge (i.e. cite what needs to be cited) required to produce new knowledge (see also Todd et al. 2007 for misciting practices in ecology).
Finally, we note that the INK could be mathematically combined with other scientometric indicators, for example with the h-index, to yield a combined index which would exhibit the advantages of both.

Table 1. Indicative values of INK yr⁻¹. These were derived from a random set of 52 items in 5 disciplinary fields published in the years from 1977 to 2005 (see Appendix 1) and show a high degree of overlap between fields and publication types, in spite of their widely divergent number of citations received (ranges are given where possible).