Mining Demographic Data from Clinical Narratives Using Data Mining Methods
My research examines how well NLP performs as a data mining method for extracting
demographic data from clinical narratives.
Patel et al. (2017) studied the use of natural language processing (NLP) and other data
mining methods to correlate mammographic and pathologic findings for clinical decision
support. They concluded that the mammographic imaging characteristics obtained with NLP
techniques correlated with the pathologic breast cancer subtype. They further concluded that
NLP provides an automated means of scaling up the extraction and analysis of data, which
can then be used to support clinical decisions (Patel et al., 2017). In this setting, NLP helps
the computer interpret what the doctor has recorded about a patient.
Sippo et al. (2013) studied the automated extraction of BI-RADS (Breast Imaging Reporting
and Data System) final assessment categories from radiology reports with NLP. They
examined reports from mammography, breast ultrasound, and a combination of both, as well
as breast magnetic resonance imaging studies. The NLP system recorded a high recall for the
breast imaging reports. They also found that NLP
RESEARCH PROPOSAL 2
can provide accurate, scalable data. In this way, NLP can support strategies for the
management of breast cancer.
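To make the idea of extracting BI-RADS categories and measuring recall concrete, the following is a minimal rule-based sketch in Python. It is not the method used by Sippo et al.; the pattern, the function names (extract_birads, recall), and the three sample reports are assumptions made up for illustration, and real radiology reports vary far more than this pattern allows.

```python
import re
from typing import Optional

# Hypothetical pattern: matches phrases such as "BI-RADS 2" or
# "BIRADS category: 4". It is an illustrative assumption, not a validated rule.
BIRADS_PATTERN = re.compile(
    r"\bBI-?RADS\b\s*(?:category\s*)?[:\-]?\s*([0-6])\b",
    re.IGNORECASE,
)


def extract_birads(report: str) -> Optional[int]:
    """Return the last BI-RADS category mentioned in the report, or None."""
    matches = BIRADS_PATTERN.findall(report)
    return int(matches[-1]) if matches else None


def recall(predicted, gold):
    """Fraction of gold-labelled reports whose category was extracted correctly."""
    labelled = [(p, g) for p, g in zip(predicted, gold) if g is not None]
    if not labelled:
        return 0.0
    return sum(p == g for p, g in labelled) / len(labelled)


# Invented example reports with hand-assigned gold labels.
reports = [
    "IMPRESSION: Benign findings. BI-RADS 2.",
    "Suspicious mass at 10 o'clock. BIRADS category: 4",
    "Final assessment is probably benign.",  # category only implied, so missed
]
gold = [2, 4, 3]

predicted = [extract_birads(r) for r in reports]
print(predicted)                # [2, 4, None]
print(recall(predicted, gold))  # 2 of 3 gold categories recovered
```

The third report shows why recall matters here: the category is stated in words rather than as a number, so a purely pattern-based extractor misses it.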
From these previous studies, I have learned that NLP can be an accurate method of data
extraction. However, some institutions do not use it because they question its authenticity.
I intend to strengthen the authenticity of NLP as a method for extracting data from clinical
narratives.
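As a rough illustration of what extracting demographic data from a clinical narrative can look like, the sketch below pulls age and sex out of a free-text note with regular expressions. The patterns, the function name extract_demographics, and the sample note are all hypothetical; a real system would need to handle negation, abbreviations, and misleading matches (for example, "3 years ago" would be read as an age by this pattern).

```python
import re

# Hypothetical patterns for illustration only.
# Age: "54-year-old", "67 yo", "45 y/o", "30 years".
AGE_PATTERN = re.compile(r"\b(\d{1,3})[\s-]*(?:years?|yrs?|y/?o)\b", re.IGNORECASE)

# Sex: a few explicit terms mapped to a normalized value.
SEX_TERMS = {"female": "female", "woman": "female", "male": "male", "man": "male"}
SEX_PATTERN = re.compile(r"\b(female|woman|male|man)\b", re.IGNORECASE)


def extract_demographics(note: str) -> dict:
    """Return the first age and sex mentions found in the note (None if absent)."""
    age_match = AGE_PATTERN.search(note)
    sex_match = SEX_PATTERN.search(note)
    return {
        "age": int(age_match.group(1)) if age_match else None,
        "sex": SEX_TERMS[sex_match.group(1).lower()] if sex_match else None,
    }


# Invented example note.
note = "Patient is a 54-year-old female presenting with a palpable left breast mass."
print(extract_demographics(note))  # {'age': 54, 'sex': 'female'}
```

Comparing such extracted fields against manually abstracted records is one way the accuracy of NLP for demographic data could be checked.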
Aims of the Research
My first aim is to investigate sentiment analysis and opinion mining with NLP: how are
opinions mined and sentiments analyzed? My second aim is question answering: how does it
work, and how effective is it?
I intend to use four methods. The first is instantiation: I will construct specific instances
in which NLP is applied and observe how it performs. The second is laboratory experiments
that apply NLP to data mining. The third is secondary data: I will use statistics from other
researchers and medical organizations to support my research. Lastly, I will do library
research, reviewing articles that have already been published. My research will therefore
combine qualitative and quantitative methods. These methods will help address my research
aims because some of them, such as laboratory experiments, have been used by other
researchers to collect data. Instantiation will help in working out what might happen in
particular situations without having the real situation at hand, including issues that have not
yet occurred but are likely to.
The first challenge that I might face in my research is the availability of
clinical records for my research. Getting these clinical records will not be easy and I am fully
the electronic health records of the institutions can be allowed. To overcome this challenge,
I will approach institutions that have lowered the barriers to clinical NLP research by
distributing de-identified electronic health records (EHRs) to a broad research community.
This is usually done under data use agreements set by the institution. As a result, the data
is more readily available to researchers who wish to investigate issues that remain unclear.
The other challenge that I might face is understanding electronic medical records (EMRs).
That is where NLP comes in, but at times the records are too complex for it to process and it
fails. I therefore need to be able to interpret the EMRs myself in case the NLP fails. The
milestones of my research will be working through sentiment analysis and opinion mining
with NLP, and finding out how question answering works and how effective the process is
when NLP is involved.
What I expect from this research is a clear explanation of how the question-answering
process in NLP works and how effective it is. This will help increase the authenticity of
NLP, which is my main purpose. I also need to examine the mining of
opinions and the analysis of sentiments. Finding out how NLP can be used to mine opinions
and analyze sentiments is of key importance here.
Sippo, D. A. et al., 2013. Automated extraction of BI-RADS final assessment categories
from radiology reports with natural language processing. Journal of Digital Imaging,
26(5), pp. 989-994.
Patel, T. A. et al., 2017. Correlating mammographic and pathologic findings in clinical
decision support using natural language processing and data mining methods.
Cancer, 123(1), pp. 114-121.