lornet.ca

A Design Methodology for a Document Indexing Tool

David R. Cheriton School of Computer Science

The huge increase in volume of online literature has led to a parallel surge in research into methods for retrieving meaningful information from this textual data: “content extraction” has emerged as a prominent field in natural language computing. However, little progress has as yet been made in determining the pragmatic content of a document, ‘hidden’ meaning such as the attitudes of the writer toward her audience, the intentions being communicated, the intra-textual relationships between document objects, and so forth. But pragmatic information carries a great deal of the underlying meaning in a document, and the inability to access this information means that current content extraction methods are very uninformed.

Our goal is to develop natural language systems capable of extracting this pragmatic information in text to provide more meaningful document understanding. To this end, we are developing automated methods, both discourse-based and using Machine Learning techniques, to recognize and interpret pragmatic cues in text. This pragmatic evidence may then be used to provide more-sophisticated document indexing to guide information extraction by providing detailed information on the fine-grained nature of the linking [...]

Documents contain “text objects” that have many information retrieval uses. These text objects include: textual items, such as noun phrases, and metadata items, such as citations to other articles, hyperlinks to other documents or web pages, and XML attributes. Respective uses of these text objects include: keyword indexing to form links between keywords and documents; citation indexes; and XML attributes as an important metadata search item.

While it is a straightforward task to associate keywords with documents or build citation indexes which facilitate searches that ensure a high rate of recall in a search, the presence of a keyword or citation link does not necessarily mean a correspondingly high search precision. To improve search precision, each link should ideally be labelled with a domain-specific descriptive category that indicates a likely reason for the link. We propose to develop automated methods of link classification providing such typed links to enable more-effective literature indexing and analysis tools.

Our initial task is to construct an annotation tool for manually classifying rhetorical and other pragmatic cues in online texts to provide a training corpus for developing our automated document-link classification system.
∗Authors are listed in alphabetical order. An earlier version of this paper was given as a poster at the 2004 Joint Conference on Human Language Technology/North American Association for Computational Linguistics (HLT-NAACL) (BioLink 2004: Workshop on Linking Biological Literature, Ontologies and Databases: Tools for Users), Boston, May 2004.

With the explosion in the amount of online literature, our current techniques for information exploration have been overwhelmed. If we could recognize and use fine-grained relationships among documents to assist navigation through information networks, we could better address this problem.

Suppose that we wish to label a link to the following news article which is cited by a competitor company analysis:

“The U.S. Food and Drug Administration is planning to reverse additional patent protection for Biovail Corp.’s Tiazac, setting the stage for potential generic competition against the Mississauga company’s flagship drug.” (The Globe and Mail, Saturday 5 March 2001, page B2.)

Suppose also that we wish to label this link with either “Favourable development for competitor” or “Unfavourable development for competitor”. If we extract just the positive phrase “additional patent protection for [competitor’s product]” then, without additional information, this article would be labelled as “Favourable”. However, the positive phrase is obviously in the negative context indicated by “reverse”, so it should have been labelled as “Unfavourable”. If the verb had instead been “continue” (a positive context) then the positive sense would again prevail.

It is obvious from this example that an analysis of the text object context is crucial. What is not obvious is that the context could be structurally larger than just the enclosing sentence, even as large as a paragraph, the entire document, or a set of documents. The goal of this project is to develop new methods for discovering contextual information vital to the interpretation of text objects found in documents. This information can then be used to label links to the document that use the textual object. Although deep analysis of text would be required for complete understanding of all the nuanced relationships between documents, it is our contention that surface-cue and stylistic analysis, easier and more tractable than full syntactic and semantic understanding, can provide much of the information that will be [...]

We are bootstrapping the development of a set of methods and software tools for the automated classification of links between documents in online corpora by focusing initially on the problem of automated citation classification in scientific articles. This is a particularly challenging problem as there can be upwards of 35 citation categories used in scholarly writing, with fine-grained distinctions among the category definitions. Determining the purpose of a citation can involve recognizing linguistic features at all levels of the text: lexical cues, syntactic arrangement, and overall discourse structure. We have demonstrated that automated citation classification is feasible, but to improve the performance of our classifier we need more-sophisticated techniques blending discourse understanding with statistical methods for large-scale corpus analysis.

Once we have determined the purpose of a citation, we can then use this knowledge to group together articles and authors into clusters that will allow better navigation of the literature in a subject domain, and mapping to social networks within a scientific community. We are applying knowledge from Computational Linguistics and Machine Learning to develop methods and software tools for automatically determining the function of citations. It is expected that these results will then be applicable to related problems in classifying other types of links and hyperlinks [...]

Our resources include specialized repositories of biomedical articles (10,000) and physics articles (30,000), as well as the entire BioMed Central corpus. Our initial goal is to build a training set of manually classified citations in biomedical articles (using a set of 1000 protein-interaction articles we have curated from the larger biomedical corpus) that we could then use for developing our learning algorithms and for building scientific social networks.

We have developed an initial annotation tool for manually classifying citations in scientific articles and now plan to extend the tool to classify other types of surface pragmatic cues (e.g., hedging cues, indicators of uncertainty). These cues will then provide a training corpus to develop automated methods for classifying the types of links be- [...]

1. Development of Machine Learning algorithms (e.g., using Hidden Markov Models, Conditional Random Fields) for detection of linguistic features in text relevant to citation function (R. Radoulov, Master’s [...]

2. Development of Machine Learning methods and software tools for automated classification of citations (J. Taylor, PhD student, UWO; R. Radoulov, [...]

3. Analysis of discourse and argumentation structure (e.g., using lexical chaining, lexical style, classical argumentation models) as cues to citation function and inter-document relations (T. Maynard, Master’s student, UWO; B. White, PhD student, UWO; C. [...]

4. Using citation network analysis to map the structure of scientific communities (F. Kroon, PhD student, [...]
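The news-article example earlier in this section, where the verb “reverse” flips an otherwise positive phrase, can be illustrated with a small sketch. This is only a toy under stated assumptions: the phrase list, the list of negative-context verbs, and the function name `label_link` are all hypothetical, and the rule looks only at the enclosing sentence, whereas the text notes that the relevant context may be much larger.

```python
# Toy illustration only. The cue lists and the labelling rule are
# assumptions made for this sketch, not the authors' method.
POSITIVE_PHRASES = ("additional patent protection",)
NEGATIVE_CONTEXT_VERBS = ("reverse", "revoke", "withdraw")


def label_link(sentence: str) -> str:
    """Label a link as favourable or unfavourable for the competitor."""
    text = sentence.lower()
    for phrase in POSITIVE_PHRASES:
        idx = text.find(phrase)
        if idx == -1:
            continue
        # Inspect the words that precede the positive phrase in the sentence.
        preceding = text[:idx].split()
        flipped = any(verb in preceding for verb in NEGATIVE_CONTEXT_VERBS)
        return "Unfavourable" if flipped else "Favourable"
    # No known positive phrase was found, so no label is assigned.
    return "Unlabelled"
```

Calling `label_link` on the quoted sentence yields the “Unfavourable” label because “reverse” precedes the positive phrase; replacing the verb with “continue” yields “Favourable”.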
A citation index enables efficient retrieval of documents from a large collection: a citation index consists of source items and their corresponding lists of bibliographic descriptions of citing works. The use of citation indexing of scientific articles was invented by Dr. Eugene Garfield in the 1950s as a result of studies on problems of medical information retrieval and indexing of biomedical literature. Dr. Garfield later founded the Institute for Scientific Information (ISI), whose Science Citation Index [4] is now one of the most popular citation indexes. Recently, with the advent of digital libraries, Web-based indexing systems have begun to appear (e.g., ISI’s ‘Web of Knowledge’, CiteSeer [...]

Authors of scientific papers normally include citations in their papers to indicate works that are connected in an important way to their paper. Thus, a citation connecting the source document and a citing document serves one of many functions. For example, one function is that the citing work gives some form of credit to the work reported in the source article. Another function is to criticize previous work. Other functions include indicating works that are foundational in the field, that provide background for the authors’ own work, or that are representative of complementary or contradictory research. Determining the exact nature of the relationship between a citing and a cited paper often requires some level of understanding of the text in which the citation is embedded.

The near-term goal of our research project is the implementation of an indexing tool for scholarly scientific literature which uses rhetorical and other pragmatic cues in the context surrounding a citation to provide information about the relationship between the two papers connected by the citation. Ultimately, we hope to apply the methods and tools we will develop in classification of more-general kinds of document links to enhance literature indexing schemes, improve document retrieval precision, and advance social net- [...]

A citation may be formally defined as a portion of a sentence in a citing document which references another document or a set of other documents collectively. For example, in sentence 1 below, there are two citations: the first citation is Although the 3-D structure. . . progress, with the set of references (Eger et al., 1994; Kelly, 1994); the second citation is it was shown. . . submasses with the single reference [...]

(1) Although the 3-D structure analysis by x-ray crystallography is still in progress (Eger et al., 1994; Kelly, 1994), it was shown by electron microscopy that XO consists of three submasses (Coughlan et [...]

Indexing tools, such as CiteSeer [3], play an important role in the scientific endeavour by providing researchers with a means to navigate through the network of scholarly scientific papers using the connections provided by citations. Citations relate articles within a research field by linking together works whose methods and results are in some way mutually relevant. Customarily, authors include citations in their papers to indicate works that are foundational in their field, background for their own work, or representative of complementary or contradictory research. Another researcher may then use the presence of citations to locate articles she needs to know about when entering a new field or to read in order to keep track of progress in a field where she is already well-established. But, with the explosion in the amount of scientific literature, a means to provide more information in order to give more intelligent control to the navigation process is warranted. A user normally wants to navigate more purposefully than “Find all articles citing a source article”. Rather, the user may wish to know whether other experiments have used similar techniques to those used in the source article, or whether other works have reported conflicting experimental results. In order to navigate a citation index in this more-sophisticated manner, the citation index must contain not only the citation-link information, but also must indicate the function of the citation in [...]

In the biomedical field, a domain of particular interest to us, we believe that the usefulness of automated citation classification in literature indexing can be found both in the larger context of managing entire databases of scientific articles and in specific information-extraction problems. On the larger scale, database curators need accurate and efficient methods for building new collections by retrieving articles on the same topic from huge general databases. Simple systems (e.g., [1], [13]) consider only keyword frequencies in measuring article similarity. More-sophisticated systems, such as the Neighbors utility [22], may be able to locate articles that appear to be related in some way (e.g., finding related Medline abstracts for a set of protein names [2]), but the lack of specific information about the nature and validity of the relationship between articles may still make the resulting collection a less-than-ideal resource for subsequent analysis. Citation classification to indicate the nature of the relationships between articles in a database would make the task of building collections of related articles both easier and more accurate. And the existence of additional knowledge about the nature of the linkages between articles would greatly enhance navigation among a space of documents to retrieve meaningful information about the related [...]

A specific problem in information extraction that may benefit from the use of citation categorization involves mining the literature for protein-protein interactions (e.g., [2], [13], [21]). Currently, even the most-sophisticated systems are not yet capable of dealing with all the difficult problems of resolving ambiguities and detecting hidden knowledge. For example, Blaschke et al.’s system [2] is able to handle fairly complex problems in detecting protein-protein interactions, including constructing the network of protein interactions in cell-cycle control, but important implicit knowledge is not recognized. In the case of cell-cycle analysis for Drosophila, their system is able to determine that relationships exist between Cak, Cdk7, CycH, and Cdk2: Cak inhibits/phosphorylates Cdk7, Cak activates/phosphorylates Cdk2, Cdk7 phosphorylates Cdk2, CycH phosphorylates Cak and CycH phosphorylates Cdk2. However, the system is not able to detect that Cak is actually a complex formed by Cdk7 and CycH, and that the Cak complex regulates Cdk2. While the earlier literature describes inter-relationships among these proteins, the recognition of the generalization in their structure, i.e., that these proteins are part of a complex, is contained only in more-recent articles: “There is an element of generalization implicit in later publications, embodying previous, more dispersed findings. A clear improvement here would be the generation of associated weights for texts according to their level of generality” [2]. Citation categorization could provide just this kind of ‘ancestral’ relationship between articles (whether an article is foundational in the field or builds directly on closely related work) and, if automated, could be used in forming collections of articles for study that are labelled with explicit semantic and rhetorical links to one another. Such collections of semantically linked articles might then be used as ‘thematic’ document clusters (cf. Wilbur [23]) to elicit much more meaningful information from documents [...]

An added benefit of having citation categories available in text corpora used for studies such as extracting protein-protein interactions is that more, and more-meaningful, in- [...] Blaschke et al. [2] noted that they were able to discover many more protein-protein interactions when including in the corpus those articles found to be related by the Neighbors facility [22] (285 versus only 28 when relevant protein names alone were used in building the corpus). Lastly, very difficult problems in scientific and biomedical information extraction that involve aspects of deep-linguistic meaning may be resolved through the availability of citation categorization in curated texts: synonym detection, for example, may be enhanced if different names for the same entity occur in articles that can be recognized as being closely related [...]

The automated labelling of citations with a specific citation function requires an analysis of the linguistic features in the text surrounding the citation, coupled with a knowledge of the author’s pragmatic intent in placing the citation at that point in the text. The author’s purpose for including citations in a research article reflects the fact that researchers wish to communicate their results to their scientific community in such a way that their results, or knowledge claims, become accepted as part of the body of scientific knowledge. This persuasive nature of the scientific research article, how it contributes to making and justifying a knowledge claim, is recognized as the defining property of scientific writing by rhetoricians of science, e.g., [7], [8], [9], [17]. Style (lexical and syntactic choice), presentation (organization of the text and display of the data), and argumentation structure are noted as the rhetorical means by which authors build a convincing case for their results.

Our approach to automated citation classification is based on the detection of fine-grained linguistic cues in scientific articles that help to communicate these rhetorical stances and thereby map to the pragmatic purpose of citations. As part of our overall research methodology, our goal is to map the various types of pragmatic cues in scientific articles to rhetorical meaning. Our previous work has described the importance of discourse cues in enhancing inter-article cohesion signalled by citation usage [15], [12]. We have also been investigating another class of pragmatic cues, hedging cues [16], that are deeply involved in creating the pragmatic effects that contribute to the author’s knowledge claim by linking together a mutually supportive network of researchers within a scientific community.

In extending our work to more-general types of document links, we are exploring other types of pragmatic connotations, including certainty categorization and how explicitly marked certainty can be predictably and dependably identified from newspaper article data. Certainty identification, in particular, can serve as a foundation for a novel type of text analysis that can enhance question-and-answering, search, and information retrieval capabilities ([18], [19]). Certainty identification is a part of the new and exciting direction in information retrieval, natural language processing, and text-mining, concerned with exploration of subjective, attitudinal, and affective aspects of texts [20].
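The idea of a citation index that records the function of each citation as well as the link itself can be sketched as a small data structure. This is a minimal sketch under stated assumptions: the class names `Citation` and `TypedCitationIndex`, the method names, and the category strings are hypothetical and are not taken from the authors' tool or from any existing index.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Citation:
    citing: str    # identifier of the citing article
    source: str    # identifier of the cited (source) article
    function: str  # citation function label (hypothetical category name)


class TypedCitationIndex:
    """Maps each source article to the citations that reference it."""

    def __init__(self):
        self._by_source = defaultdict(list)

    def add(self, citation):
        self._by_source[citation.source].append(citation)

    def citing(self, source, function=None):
        """Return citing articles, optionally restricted to one function."""
        hits = self._by_source.get(source, [])
        if function is not None:
            hits = [c for c in hits if c.function == function]
        return [c.citing for c in hits]
```

With `function=None` the query behaves like a standard citation index (“find all articles citing a source article”); passing a label such as a hypothetical "conflicting results" category narrows the result to the more purposeful kind of navigation described above.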
In our preliminary study [15], we analyzed the frequency of the cue phrases from [14] in a set of scholarly scientific articles. We reported strong evidence that these cue phrases are used in the citation sentences and the surrounding text with the same frequency as in the article as a whole. In subsequent work [12], we analyzed the same dataset of articles to begin to catalogue the fine-grained discourse cues that exist in citation contexts. This study confirmed that authors do indeed have a rich set of linguistic and non-linguistic methods to establish discourse cues in citation contexts.

Another type of linguistic cue that we are studying is related to hedging effects in scientific writing that are used by an author to modify the affect of a ‘knowledge claim’. Hedging in scientific writing has been extensively studied by Hyland [9], including cataloguing the pragmatic functions of the various types of hedging cues. As Hyland [9] explains, “[Hedging] has subsequently been applied to the linguistic devices used to qualify a speaker’s confidence in the truth of a proposition, the kind of caveats like I think, perhaps, might, and maybe which we routinely add to our statements to avoid commitment to categorical assertions. Hedges therefore express tentativeness and possibility in communication, and their appropriate use in scientific discourse [...]

The following examples illustrate some of the ways in which hedging may be used to deliberately convey an attitude of uncertainty or qualification. In the first example, the use of the verb suggested hints at the author’s hesitancy to declare the absolute certainty of the claim:

(2) The functional significance of this modulation is suggested by the reported inhibition of MeSo-induced differentiation in mouse erythroleukemia cells constitutively expressing c-myb.

In the second example, the syntactic structure of the sentence, a fronted adverbial clause, emphasizes the effect of qualification through the rhetorical cue Although. The subsequent phrase, a certain degree, is a lexical modifier that also serves to limit the scope of the result:

(3) Although many neuroblastoma cell lines show a certain degree of heterogeneity in terms of neurotransmitter expression and differentiative potential, each cell has a prevalent behavior in response to differentiation inducers.

We investigated this hypothesis by doing a frequency analysis of hedging cues in citation contexts in a corpus of 985 biology articles. We obtained statistically significant results (summarized in Table 1) indicating that hedging is used more frequently in citation contexts than in the text as a whole. Given the presumption that writers make stylistic and rhetorical choices purposefully, we propose that we have further evidence that connections between fine-grained linguistic cues and rhetorical relations exist in citation contexts.

Table 1 shows the proportions of the various types of sentences that contain hedging cues, broken down by hedging-cue category (verb or nonverb cues), according to the different sections in the articles (background, methods, results and discussion, conclusions). For all but one combination, citation sentences are more likely to contain hedging cues than would be expected from the overall frequency of hedge sentences (p ≤ .01). Citation ‘window’ sentences (i.e., sentences in the text close to a citation) generally are also significantly (p ≤ .01) more likely to contain hedging cues than expected, though for certain combinations (methods, verbs and nonverbs; res+disc, verbs) the difference was not significant.

Tables 2, 3, and 4 summarize the occurrence of hedging cues in citation ‘contexts’ (a citation sentence and the surrounding citation window). Table 5 shows the proportion of hedge sentences that either contain a citation, or fall within a citation window; Table 5 suggests (last three columns) that the proportion of hedge sentences containing citations or being part of citation windows is at least as great as what would be expected just by the distribution of citation sentences and citation windows.

Table 1 indicates, with statistical significance, that in most cases the proportion of hedge sentences in the citation contexts is greater than what would be expected by the distribution of hedge sentences. Taken together, these conditional probabilities support the conjecture that hedging cues and citation contexts correlate strongly. Hyland [9] has catalogued a variety of pragmatic uses of hedging cues, so it is reasonable to speculate that these uses can be mapped to the rhetorical meaning of the text surrounding a citation, and from thence to the function of the citation.
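The kind of frequency comparison described above (are citation sentences more likely to contain a hedging cue than sentences overall?) might be sketched as follows. The cue list, the function names, and the choice of a one-sided exact binomial test are all assumptions made for illustration; the text does not state which cue lexicon or which statistical test was actually used. As a simplification, the overall rate here includes the citation sentences themselves.

```python
from math import comb

# Hypothetical, very small hedging-cue list for illustration only.
HEDGE_CUES = {"suggest", "suggested", "may", "might", "perhaps", "possibly", "appear"}


def has_hedge(sentence):
    """True if any word of the sentence is in the hedging-cue list."""
    words = {w.strip(".,;:()").lower() for w in sentence.split()}
    return not words.isdisjoint(HEDGE_CUES)


def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def hedge_rates(sentences):
    """sentences: list of (text, is_citation_sentence) pairs.

    Returns (overall hedge rate, hedge rate in citation sentences,
    one-sided p-value for the citation rate exceeding the overall rate).
    """
    overall = sum(has_hedge(t) for t, _ in sentences) / len(sentences)
    cited = [t for t, is_cit in sentences if is_cit]
    k = sum(has_hedge(t) for t in cited)
    return overall, k / len(cited), binomial_upper_tail(k, len(cited), overall)
```

A real analysis would run this per article section and per cue category (verb or nonverb), as in the breakdown reported for Table 1, and would need a far larger sample than a toy list of sentences.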
In [16], we showed that the hedging cues proposed by Hyland occur more frequently in citation contexts than in the text as a whole. With this information we conjecture that hedging cues are an important aspect of the rhetorical relations found in citation contexts and that the pragmatics of hedges may help in determining the purpose of citations.

Table 1. Proportion of sentences containing hedging cues, by type of sentence and hedging cue category.

Table 2. Number and proportion of citation contexts containing a hedging cue, by section and location of hedging cue.

Table 3. Proportion of citation contexts containing a verbal hedging cue, by section and location of hedging cue.

Table 4. Proportion of citation contexts containing a nonverb hedging cue, by section and location of hedging cue.

Table 5. Proportion of hedge sentences that contain citations or are part of a citation window, by section and hedging cue category.

The indexing tool that we are designing is an enhanced citation index. The feature that we are adding to a standard citation index is the function of each citation, that is, given an agreed-upon set of citation functions, we want our tool to be able to automatically categorize a citation into one of these functional categories. To accomplish this automatic categorization we are using a decision tree. Currently, we are building the decision tree by hand, but in future we intend to investigate machine learning techniques to induce a tree. Our aim is to have a working indexing tool whenever we add more knowledge to the categorization process. This goal appears very feasible given our design methodology choice of using a decision tree: adding more knowledge only refines the decision-making procedure of the pre- [...]

Two factors influence the development of the tree as follows:

• The granularity of the citation categories determines how many leaves are in the decision tree; and

• The number of features that can be used to determine the category of a citation determines the po- [...]

In earlier work, Garzone and Mercer ([5], [6]) proposed a citation classification scheme that, with 35 categories, was both more comprehensive than the union of all of the previous schemes and also amenable to implementation in an automated citation classifier. We use this categorization in the citation classifiers, but a finer or coarser granularity is [...]

Concerning the features on which the decision tree makes its decisions, we have started with a simple, yet fully automatic prototype [5] which takes journal articles as input and classifies every citation found therein. Its decision tree is very shallow, using only sets of cue-words and polarity switching words (not, however, etc.), some simple knowledge about the IMRaD structure¹ of the article together with some simple syntactic structure of the citation-containing sentence. The prototype uses 35 citation categories. In addition to having a design which allows for easy incorporation of more-sophisticated knowledge, it also gives flexibility to the tool: categories can be easily coalesced to give users a tool that can be tailored to a variety of uses.

¹ The corpus of biomedical papers all have the standard Introduction, Methods, Results, and Discussion or a slightly modified version in which [...]

Although we anticipate some small changes to the number of categories due to category refinement, the major modifications to the decision tree will be driven by a more-sophisticated set of features associated with each citation. When investigating a finer granularity of the IMRaD structure, we came to realize that the structure of scientific writing at all levels of granularity was founded on rhetoric, which involves both argumentation structure and stylistic choices of words and syntax. This was the motivation for choosing the rhetoric of science as our guiding principle.

We rely on the notion that rhetorical information is realized in linguistic ‘cues’ in the text, some of which, although not all, are evident in surface features (cf. Hyland [9] on surface hedging cues in scientific writing). Since we anticipate that many such cues will map to the same rhetorical features that give evidence of the text’s argumentative and pragmatic meaning, and that the interaction of these cues will likely influence the text’s overall rhetorical effect, the formal rhetorical relation (cf. [11]) appears to be the appropriate feature for the basis of the decision tree. So, our long-term goal is to map between the textual cues and rhetorical relations. Having noted that many of the cue words in the prototype are discourse cues, and with two recent important works linking discourse cues and rhetorical relations ([10, 14]), we began our investigation of this mapping with discourse cues. We have some early results that show that discourse cues are used extensively with citations and that some cues appear much more frequently in the citation context than in the full text [15]. Another textual device is the hedging cue, which we are currently investigating [16].

Although our current efforts focus on cue words which are connected to organizational effects (discourse cues) and writer intent (hedging cues), we are also interested in other types of cues that are associated more closely with the purpose and method of science. For example, the scientific method is, more or less, to establish a link to previous work, set up an experiment to test a hypothesis, perform the experiment, make observations, then finally compile and discuss the importance of the results of the experiment. Scientific writing reflects this scientific method and its purpose: one may find evidence even at the coarsest granularity of the IMRaD structure in scientific articles. At a finer granularity, we have many targeted words to convey the notions of procedure, observation, reporting, supporting, explaining, refining, contradicting, etc. More specifically, science categorizes into taxonomies or creates polarities. Scientific writing then tends to compare and contrast or refine. Not surprisingly, the morphology of scientific terminology exhibits comparison and contrasting features, for example, exo- and endo-. Science needs to measure, so scientific writing contains measurement cues by referring to scales (0–100), or using comparatives (larger, brighter, etc.). Experiments are described as a sequence of steps, so this is an [...]

Finally, as for our prototype system, we will continue to evaluate the classification accuracy of the citation-indexing tool by a combination of statistical testing and validation by human experts. In addition, we would like to assess the tool’s utility in real-world applications such as database curation for studies in biomedical literature analysis. We have suggested earlier that there may be many uses of this tool, so a significant aspect of the value of our tool will be its ability to enhance other research projects.

The pragmatic connotations of citation function and other types of document links are a feature of scientific writing which can be exploited in a variety of ways. We anticipate more-informative citation and document indexes as well as more-intelligent database curation. Additionally, sophisticated information extraction may be enhanced when better selection of the dataset is enabled. For example, synonym detection in a corpus of papers may be made more tractable when the corpus is composed of related papers derived from navigating a space of linked citations.
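The shallow, hand-built decision tree described for the prototype (cue words, polarity-switching words, and the IMRaD section of the citing sentence) can be illustrated with a small sketch. Everything here is hypothetical: the cue lists, the four category names, the order of the tests, and the helper `coalesce` are placeholders chosen for illustration, and the actual prototype uses 35 categories and features not shown here.

```python
from dataclasses import dataclass

# All cue lists and category names below are hypothetical placeholders.
SUPPORT_CUES = {"confirm", "confirms", "consistent", "agree", "agrees", "supports"}
POLARITY_SWITCHERS = {"not", "however", "although", "contrary"}


@dataclass
class CitationContext:
    sentence: str  # the citation-containing sentence
    section: str   # "introduction", "methods", "results", or "discussion"


def classify(ctx):
    """Walk a three-node, hand-built decision tree and return a category."""
    words = {w.strip(".,;:()").lower() for w in ctx.sentence.split()}
    # Node 1: IMRaD section of the citing sentence.
    if ctx.section == "methods":
        return "uses-method"
    # Node 2: does the sentence contain a supporting cue word?
    if words & SUPPORT_CUES:
        # Node 3: a polarity-switching word flips the reading.
        return "contrasting" if words & POLARITY_SWITCHERS else "supporting"
    return "background"


def coalesce(category, mapping):
    """Merge fine-grained categories into coarser ones for a given use."""
    return mapping.get(category, category)
```

Adding a new feature corresponds to adding a test below an existing node, which refines the earlier decisions without discarding them, and `coalesce` shows one way that leaves could be merged when a user wants a coarser set of categories.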
In this paper we have motivated our approach to developing a literature indexing tool that computes the functions of citations. The function of a citation is determined by analyzing the rhetorical intent of the text that surrounds it. This analysis is founded on the guiding principle that the scientific method is reflected in scientific writing.
Our early investigations have determined that linguistic cues and citations are related in important ways. Our future work will be to map these linguistic cues to rhetorical relations and other pragmatic functions so that this information can then be used to determine the purpose of citations and from thence to more-general document links. The results of our research will be a set of algorithms, methods, and software tools that can be applied to the following problems in
• Automated analysis of document content for cues to
• Automated classification of semantic links between
• Mapping from typed document links to social net-
[13] E. M. Marcotte, I. Xenarios, and D. Eisenberg.
literature for protein-protein interactions. Bioinformatics, [1] M. A. Andrade and A. Valencia. Automatic extraction of keywords from scientific text: Application to the knowledge [14] D. Marcu. The rhetorical parsing, summarization, and gen- domain of protein families. Bioinformatics, 14(7):600–607, eration of natural language texts. Ph.D. thesis, University of [2] C. Blaschke, M. A. Andrade, C. Ouzounis, and A. Valencia.
[15] R. Mercer and C. DiMarco. The importance of fine-grained Automatic extraction of biological information from scien- cue phrases in scientific citations. In Proceedings of the tific text: Protein-protein interactions. In International Con- 16th Conference of the CSCSI/SCEIO (AI’2003), Halifax, ference on Intelligent Systems for Molecular Biology (ISMB [16] R. Mercer and C. DiMarco. The frequency of hedging cues [3] B. Bollacker, S. Lawrence, and C. Giles. A system for au- in citation contexts in scientific writing. In Proceedings of tomatic personalized tracking of scientific literature on the the 17th Conference of the CSCSI/SCEIO (AI’2004), Lon- web. In Digital Libraries 99—The Fourth ACM Conference on Digital Libraries, pages 105–113, New York, 1999. ACM [17] G. Myers. Writing Biology. University of Wisconsin Press, [18] V. Rubin, N. Kando, and E. Liddy. Certainty categorization model. In AAAI Spring Symposium: Exploring Attitude andAffect in Text: Theories and Applications, Stanford, USA,2004.
[19] V. Rubin, E. Liddy, and N. Kando. Certainty identification in texts: Categorization model and manual tagging results.
In In: J.G. Shanahan, Y. Qu and J. Wiebe (Eds.), ComputingAttitude and Affect in Text: Theory and Applications (theInformation Retrieval Series): Springer-Verlag, New York,2005.
[20] J. Shanahan, Y. Qu, and J. W. (Eds.). Computing Attitude and Affect in Text: Theory and Applications (the InformationRetrieval Series). Springer-Verlag, New York, 2005.
[21] J. Thomas, D. Milward, C. Ouzounis, S. Pulman, and Automatic extraction of protein interactions from scientific abstracts. In Proceedings of the 5th PacificSymposium on Biocomputing (PSB 2000), pages 538–549,2000.
[22] W. Wilbur and L. Coffee. The effectiveness of document neighboring in search enhancement. Information ProcessingManagement, 30:253–266, 1994.
[23] W. J. Wilbur. A thematic analysis of the aids literature. In Proceedings of the 7th Pacific Symposium on Biocomputing(PSB 2004), pages 386–397, 2002.

Source: http://www.lornet.ca/Portals/10/I2LOR06/6_A%20Design%20Methodology%20for%20a%20document%20Indexing%20Tool%20Using%20Pragmatic%20Evidence%20In%20Text.pdf
