Mining Demographic Data from Clinical Narratives Using Data Mining Methods, Especially NLP
My research examines how well natural language processing (NLP) performs as a data mining method for extracting demographic data from clinical narratives.
Tejal A. Patel et al. (2017) studied the use of NLP and other data mining methods to correlate mammographic and pathologic findings for clinical decision support. They concluded that the mammographic imaging characteristics obtained with NLP techniques correlated with the pathologic breast cancer subtype. They further concluded that NLP provides an automated means of scaling up the extraction and analysis of data, which can then be used to support clinical decisions (Patel et al., 2017). In this way, NLP helps a computer interpret what a doctor has recorded about a patient.
Dorothy A. Sippo and colleagues studied the automated extraction of BI-RADS final assessment categories from radiology reports with NLP (Sippo et al., 2013). Their study covered reports from mammography, breast ultrasound, a combination of both, and breast magnetic resonance imaging. The NLP recorded high recall on these breast imaging reports. They also found that NLP can provide accurate data at scale. In this respect, NLP can support strategies for the management of breast cancer.
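To make the idea of extracting BI-RADS categories and measuring recall concrete, the following is a minimal sketch in Python. It is not the method used in the study cited above: it assumes a simple regular expression in place of a full NLP pipeline, and the pattern, the function names and the example reports are all invented for illustration.

```python
import re

# Hypothetical pattern: matches phrases such as "BI-RADS 4", "BIRADS: 0",
# or "BI-RADS category 2". Real reports vary far more than this.
BIRADS_PATTERN = re.compile(
    r"\bBI-?RADS\b\W*(?:category\W*)?([0-6])\b",
    re.IGNORECASE,
)


def extract_birads(report):
    """Return the last BI-RADS category mentioned in the report, or None."""
    matches = BIRADS_PATTERN.findall(report)
    return int(matches[-1]) if matches else None


def recall(predicted, expected):
    """Share of reports with a known category that were extracted correctly."""
    labelled = [(p, e) for p, e in zip(predicted, expected) if e is not None]
    if not labelled:
        return 0.0
    hits = sum(1 for p, e in labelled if p == e)
    return hits / len(labelled)


# Invented example reports, not taken from any real dataset.
reports = [
    "Impression: benign findings. BI-RADS category 2.",
    "FINAL ASSESSMENT: BIRADS 4 - suspicious abnormality.",
    "Assessment is incomplete; additional imaging needed.",
]
expected = [2, 4, 0]  # manually assigned reference categories

predicted = [extract_birads(r) for r in reports]
print(predicted)
print(recall(predicted, expected))
```

In this toy example the third report states its assessment in words only, so the pattern misses it and two of the three categories are recovered. This is the kind of gap that a fuller NLP approach is meant to close.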
From these previous studies, I have learned that NLP can be an accurate method of data extraction. However, some institutions do not use it because they question the authenticity of its results. I intend to strengthen confidence in NLP as a method for extracting data from clinical narratives.
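As an illustration of what extracting demographic data from a clinical narrative can look like, here is a minimal sketch in Python. It assumes simple regular expressions in place of a trained NLP model, covers only age and sex, and uses hypothetical names (`extract_demographics`, `AGE_PATTERN`, `SEX_PATTERN`) and an invented example note.

```python
import re

# Hypothetical patterns for two demographic fields; real narratives use many
# more phrasings and abbreviations than these cover.
AGE_PATTERN = re.compile(
    r"\b(\d{1,3})\s*-?\s*(?:year|yr)s?\s*-?\s*old\b|\b(\d{1,3})\s*y/?o\b",
    re.IGNORECASE,
)
SEX_PATTERN = re.compile(r"\b(female|male|woman|man)\b", re.IGNORECASE)
SEX_LABELS = {"female": "female", "woman": "female", "male": "male", "man": "male"}


def extract_demographics(note):
    """Return a dict with the first age and sex mentioned, or None for each."""
    age_match = AGE_PATTERN.search(note)
    sex_match = SEX_PATTERN.search(note)
    age = None
    if age_match:
        # Exactly one of the two capture groups is filled, depending on phrasing.
        age = int(age_match.group(1) or age_match.group(2))
    sex = SEX_LABELS[sex_match.group(1).lower()] if sex_match else None
    return {"age": age, "sex": sex}


# Invented example narrative, not real patient data.
note = "Patient is a 57-year-old woman presenting with a palpable left breast mass."
print(extract_demographics(note))  # {'age': 57, 'sex': 'female'}
```

A rule-based sketch like this is easy to inspect, which may help with questions about whether the output can be trusted, but it will miss any phrasing the patterns do not anticipate.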
Aims of the Research
This research has two aims. The first is to examine opinion mining and sentiment analysis with NLP: how are opinions mined and sentiments analyzed? The second is to examine question answering: how does it work, and how effective is it?
I intend to use four methods. The first is instantiation, where I will construct specific instances in which I apply NLP and observe how it performs. The second is laboratory experiments, where I will run experiments on the use of NLP for data mining. The third is secondary data, where I will use statistics from other researchers and medical organizations to support my research. The fourth is library research, where I will review articles that have already been published. I will therefore be using both qualitative and quantitative research methods.
These methods will help address my research aims because some of them, such as laboratory experiments, have been used by other researchers to collect their data. Instantiation will help me work out what might happen in a given case without having that case at hand, including issues that have not yet occurred but are likely to.
One challenge I expect to face is understanding Electronic Medical Records (EMRs). This is where NLP comes in, but some records are too complex for it to process and it fails. I therefore need to be able to understand EMRs myself in case the NLP fails. The milestones of my research will be working through sentiment analysis and opinion mining using NLP, and finding out how question answering works and how effective it is when NLP is involved.
What I expect from this research is a clear explanation of how question answering with NLP works and how effective it is. This will help increase the authenticity of NLP, which is my main purpose. I also expect to establish how opinions can be mined and how sentiments can be analyzed using NLP, which is of key importance here.
Sippo, Dorothy A. et al., 2013. Automated extraction of BI-RADS final assessment categories from radiology reports with natural language processing. Journal of Digital Imaging, 26(5), pp. 989-994.
Patel, Tejal A. et al., 2017. Correlating mammographic and pathologic findings in clinical decision support using natural language processing and data mining methods. Cancer, 123(1), pp. 114-121.