Information retrieval algorithms based on natural language processing (NLP) of the free text of medical records have been used to find documents of interest from databases. Homelessness is a high-priority non-medical diagnosis that is noted in the electronic medical records of Veterans in Veterans Affairs (VA) facilities. Using a human-reviewed reference standard corpus of clinical documents of Veterans with and without evidence of homelessness, an open-source NLP tool (Automated Retrieval Console v2.0, ARC) was trained to classify documents. The best-performing model, based on a document-level workflow, performed well on a test set (precision 94%, recall 97%, F-measure 96%). Processing a naïve set of 10,000 randomly selected documents from the VA with this model yielded 463 documents flagged as positive, indicating a 4.7% prevalence of homelessness. Human review noted a precision of 70% for these flags, resulting in an adjusted prevalence of homelessness of 3.3%, which matches current VA estimates. Further refinements are underway to improve the performance. We demonstrate an effective and rapid lifecycle of using an off-the-shelf NLP tool for screening targets of interest from medical records.

The free text of electronic medical notes is considered to be a rich source for health care operations and research. Information extraction and information retrieval methods using natural language processing (NLP) and machine learning have been successfully applied to electronic notes. The value of text data has been shown in several clinical and biomedical domains including bio-surveillance, adverse event detection and quality improvement 1–4. With ever-expanding medical databases, there is a need to bring information retrieval tools into the hands of all researchers. Be it for clinical operations, quality improvement or research, it is important to develop methods and tools that allow for rapid results using NLP to identify targets of interest. Often, this entails engaging the services of trained NLP scientists and programmers and working closely with them. Another paradigm is to deploy off-the-shelf NLP tools that can be used with minimal training and expertise.

Automated Retrieval Console v2.0 (ARC), an open-source clinical information retrieval tool developed by the Massachusetts Veterans Epidemiology Research and Information Center (MAVERIC) 5, is an NLP tool that essentially retrieves 'documents like this one' based on a training set that contains sufficient numbers of positive and negative classifications 6, 7. ARC has been used by NLP and clinical researchers in several medical domains such as cancer and surgery 8, 9, thus providing a mechanism for all researchers to access NLP tools in their domains.

Homelessness is a high-priority issue for all societies. Those returning from active combat (Veterans of wars) are known to be at higher risk for homelessness 10, and so this non-medical 'diagnosis' is important to the US Department of Veterans Affairs (VA). Based on US Government estimates from data outside the VA, the prevalence of homelessness among Veterans is nearly 1% 11.
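The performance and prevalence figures reported in the abstract can be checked with a few lines of arithmetic. The following is a minimal sketch, not the authors' code: the function names `f_measure` and `adjusted_prevalence` are hypothetical, the F-measure is assumed to be the balanced harmonic mean of precision and recall, and the adjustment is assumed to be the raw flag rate multiplied by the precision found on human review (it does not correct for missed cases).

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure: the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def adjusted_prevalence(flagged: int, total: int, precision: float) -> float:
    """Raw flag rate scaled by the share of flags confirmed on human review.

    Assumption: false negatives are ignored, so this only removes
    false positives from the raw rate.
    """
    return (flagged / total) * precision


# Values taken from the abstract.
f1 = f_measure(0.94, 0.97)
raw_rate = 463 / 10_000
adjusted = adjusted_prevalence(463, 10_000, 0.70)

print(f"F-measure: {f1:.3f}")
print(f"Raw flag rate: {raw_rate:.2%}")
print(f"Adjusted prevalence: {adjusted:.2%}")
```

Computed from the rounded inputs, these come out slightly below the reported values (about 95.5% against 96%, 4.63% against 4.7%, and 3.24% against 3.3%). The differences are small and presumably reflect rounding in the published figures.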