Ankur Agrawal

Assistant Professor & Assistant Chair of Computer Science at Manhattan College

Bronx, New York
Higher Education
  1. Manhattan College
  1. New Jersey Institute of Technology
  2. Kantipur City College
  1. New Jersey Institute of Technology


Ankur's research work focuses on developing automated and semi-automated techniques using structural and lexical methodologies for effective quality assurance of medical terminologies such as SNOMED CT. He also teaches various computer science courses and is actively involved in service.


Assistant Chair of Computer Science

Manhattan College
– Present (3 months), Greater New York City Area

Assistant Professor

Manhattan College
– Present (1 year 3 months), Greater New York City Area

Database Systems
Data Mining
Java Programming
C++ Programming
Python Programming

Independent research/studies

Conceptualized and formulated CS Award
Faculty Research Committee
Assessment Committee
Accreditation Committee
Strategic Planning Committee

Instructor/ Teaching Assistant/ Research Assistant

New Jersey Institute of Technology
(5 years 8 months), Greater New York City Area


Kantipur City College
(1 year 1 month), Nepal


New Jersey Institute of Technology

PhD, Computer Science

Research on designing efficient automated and semi-automated techniques to audit medical terminologies such as SNOMED CT

Activities and Societies: President of the Graduate Student Association at NJIT, 2010-2011; President of DeepCS (Graduate Student Body of the College of Computing Sciences at NJIT), 2009-2010

Kantipur City College (KCC)

Bachelor of Engineering, Computer Engineering

Computer Engineering

Activities and Societies: Student Council

Honors & Awards

Summer Research Grant

Manhattan College

AMIA 2013 Distinguished Paper Award Recipient

American Medical Informatics Association

Finalist in MedInfo 2013 Student Paper Competition

World Congress on Medical and Health Informatics

Nominated as one of the eight finalists in the student paper competition

AIIM's Top 25 Hottest Articles in Artificial Intelligence in Medicine


NJIT Presidential Leadership Award 2012


Demonstrate the highest level of excellence over a sustained period
Demonstrate high ideals of a student scholar
Demonstrate extra-curricular involvement
Demonstrate a positive impact on the NJIT student community

NJIT Outstanding Graduate Class Award 2011


Exemplifies the high ideals of a student scholar
Demonstrates extra-curricular involvement

Additional Honors & Awards

NJIT Innovation Award for the Masquerade Ball 2011
Best GSA Club Award for DeepCS in 2009
NJIT Research/Teaching Assistantship, 2008 - Present



Founder & President

NJIT Graduate Student Association


Call and preside over all official meetings
Serve as the liaison between graduate students and NJIT
Manage a budget of $200,000 in planning and organizing events
Appoint committees

DeepCS - NJIT College of Computing Sciences Graduate Student Body


Create a forum of intellectual interactivity
Improve the student-faculty interaction
Invite speakers to share their experiences
Have graduate students' concerns heard in the school


Contrasting Lexical Similarity and Formal Definitions in SNOMED CT: Consistency and Implications

Journal of Biomedical Informatics
February 2014

The objective of this study is to quantify the presence of and evaluate an approach for detection of inconsistencies in the formal definitions of SNOMED CT (SCT) concepts utilizing a lexical method. Utilizing SCT’s Procedure hierarchy, we algorithmically formulated similarity sets: groups of concepts with similar lexical structure of their fully specified name. We formulated five random samples, each with 50 similarity sets, based on the same parameter: number of parents, attributes, groups, all the former as well as a randomly selected control sample. All samples’ sets were reviewed for types of formal definition inconsistencies: hierarchical, attribute assignment, attribute target values, groups, and definitional. The evaluation revealed that 38 (Control) to 70 percent (Different relationships) of similarity sets within the samples exhibited significant inconsistencies. The rate of inconsistencies for the sample with different relationships was highly significant compared to Control, as well as the number of attribute assignment and hierarchical inconsistencies within their respective samples. While, at this time of the HITECH initiative, the formal definitions of SCT are only a minor consideration, in the grand scheme of sophisticated, meaningful use of captured clinical data, they are essential. However, a significant portion of the concepts in the most semantically complex hierarchy of SCT, the Procedure hierarchy, are modeled inconsistently in a manner that affects their computability. Lexical methods can efficiently identify such inconsistencies and possibly allow for their algorithmic resolution.


The Readiness of SNOMED Problem List Concepts for Meaningful Use of Electronic Health Records

Artificial Intelligence in Medicine
April 2013

By 2015, SNOMED CT (SCT) will become the USA’s standard for encoding diagnoses and problem lists in electronic health records (EHRs). To facilitate this effort, the National Library of Medicine has published the SCT CORE and VA/KP problem lists (collectively, the PL). The PL is studied in regard to its readiness to support meaningful use of EHRs. In particular, we wish to determine if inconsistencies appearing in SCT, in general, occur as frequently in the PL, and whether further quality-assurance (QA) efforts on the PL are required. A study is conducted where two random samples of SCT concepts are compared. The first consists of concepts strictly from the PL and the second contains general SCT concepts distributed proportionally to the PL’s in terms of their hierarchies. Each sample is analyzed for its percentage of primitive concepts and for frequency of modeling errors of various severity levels as quality measures. A simple structural indicator, namely, the number of parents, is suggested to locate high likelihood inconsistencies in hierarchical relationships. The effectiveness of this indicator is evaluated. PL concepts are found to be slightly better than other concepts in the respective SCT hierarchies with regards to the quality measure of the percentage of primitive concepts and the frequency of modeling errors. The structural indicator of number of parents is shown to be statistically significant in its ability to identify concepts having a higher likelihood of inconsistencies in their hierarchical relationships. The absolute number of errors in the group of concepts having 1–3 parents was shown to be significantly lower than that for concepts with 4–6 parents and those with 7 or more parents based on Chi-squared analyses.
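The number-of-parents indicator described in the abstract above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names `parent_bucket`, `tally`, and `chi_squared` are hypothetical, the bucket boundaries (1–3, 4–6, 7 or more) are taken from the abstract, and the audited input pairs are made-up toy values rather than study data.

```python
def parent_bucket(n_parents):
    """Map a concept's parent count to the ranges named in the abstract."""
    if n_parents <= 3:
        return "1-3"
    if n_parents <= 6:
        return "4-6"
    return "7+"


def tally(audited):
    """Count [with_error, without_error] per bucket.

    `audited` is an iterable of (n_parents, has_error) pairs, i.e. the
    outcome of a manual audit (hypothetical input format).
    """
    table = {"1-3": [0, 0], "4-6": [0, 0], "7+": [0, 0]}
    for n_parents, has_error in audited:
        table[parent_bucket(n_parents)][0 if has_error else 1] += 1
    return table


def chi_squared(table):
    """Pearson chi-squared statistic for a buckets-by-2 contingency table.

    Assumes every row total and column total is non-zero.
    """
    rows = list(table.values())
    col_totals = [sum(row[j] for row in rows) for j in range(2)]
    total = sum(col_totals)
    stat = 0.0
    for row in rows:
        row_total = sum(row)
        for j in range(2):
            expected = row_total * col_totals[j] / total
            stat += (row[j] - expected) ** 2 / expected
    return stat


# Toy audit results (invented for illustration only).
toy_audit = [(1, False), (2, False), (3, True), (4, True),
             (5, False), (8, True), (9, True)]
toy_table = tally(toy_audit)
toy_stat = chi_squared(toy_table)
```

The statistic would then be compared against a chi-squared distribution with (number of buckets − 1) degrees of freedom to judge significance.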


Identifying Problematic Concepts in SNOMED CT using a Lexical Approach

Studies in Health Technology and Informatics (MedInfo 2013)
August 2013

SNOMED CT (SCT) has been endorsed as a premier clinical terminology by many organizations with a perceived use within electronic health records and clinical information systems. However, there are indications that, at the moment, SCT is not optimally structured for its intended use by healthcare practitioners. A study is conducted to investigate the extent of inconsistencies among the concepts in SCT. A group auditing technique to improve the quality of SCT is introduced that can help identify problematic concepts with a high probability. Positional similarity sets are defined, which are groups of concepts that are lexically similar and in which the positions of the differing word in the fully specified names of the concepts of a set correspond to each other. A manual auditing of a sample of such sets found 38% of the sets exhibiting one or more inconsistent concepts. Group auditing techniques such as this can thus be very helpful to assure the quality of SCT, which will help expedite its adoption as a reference terminology for clinical purposes.
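One possible reading of the positional similarity set definition above is: names with the same number of words that differ in exactly one word, at the same position. The sketch below groups names that way. It is an assumption about the definition, not the published algorithm; the function name `positional_similarity_sets` is hypothetical, and the example names are illustrative strings rather than verified SNOMED CT fully specified names.

```python
from collections import defaultdict


def positional_similarity_sets(names):
    """Group names that are identical except for the word at one position."""
    groups = defaultdict(set)
    for name in names:
        words = name.lower().split()
        for i in range(len(words)):
            # Mask the word at position i; names sharing this key agree
            # on every other word and on the position of the difference.
            key = (i, tuple(words[:i]) + ("*",) + tuple(words[i + 1:]))
            groups[key].add(name)
    # Only groups with at least two distinct names form a similarity set.
    return [sorted(members) for members in groups.values() if len(members) > 1]


# Illustrative names only.
example_names = [
    "Biopsy of left kidney (procedure)",
    "Biopsy of right kidney (procedure)",
    "Excision of left kidney (procedure)",
    "Biopsy of liver (procedure)",
]
example_sets = positional_similarity_sets(example_names)
```

Each resulting set would then be handed to a human auditor, who compares the formal definitions of its members.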


Identifying Inconsistencies in SNOMED CT Problem Lists using Structural Indicators

American Medical Informatics Association
November 2013

The National Library of Medicine has published the CORE and the VA/KP problem lists (PL) to facilitate the usage of SNOMED CT for encoding diagnoses and clinical data of patients in electronic health records. Therefore, it is essential for the content of the PL to be as accurate and consistent as possible. This study assesses the effectiveness of using a concept’s word length and number of parents as structural indicators for measuring concept complexity and to identify inconsistencies with high probability. The method is able to isolate concepts with over 40% of them being erroneous. A structural indicator for concepts having erroneous synonyms is also presented which is able to identify 52% of the examined concepts as having errors in synonyms. The results demonstrate that the concepts in PL are not free of inconsistencies and further quality assurance is needed to improve the quality of these concepts.


A Family-based Framework for Supporting Quality Assurance of Biomedical Ontologies in BioPortal

American Medical Informatics Association
November 2013

BioPortal contains over 300 ontologies, for which quality assurance (QA) is critical. Abstraction networks (ANs), compact summarizations of ontology structure and content, have been used in such QA efforts, typically in a “one-off” manner for a single ontology. Ontologies can be characterized, independently of knowledge-content focus, from a structural standpoint, leading to the formulation of ontology families. A family is defined as a set of ontologies satisfying some overarching condition regarding their structural features. Seven such families, comprising 186 ontologies, are identified. To increase efficiency, a new family-based QA framework is introduced in which an automated, uniform AN derivation technique and an accompanying semi-automated, uniform QA regimen are applicable to the ontologies of a given family. Specifically, across an entire family, the QA efforts exploit family-wide AN features in the characterization of sets of classes that are more likely to harbor errors. The approach is demonstrated on the Cancer Chemoprevention BioPortal ontology.


Deriving an Abstraction Network to Support Quality Assurance in OCRe

American Medical Informatics Association

An abstraction network is an auxiliary network of nodes and links that provides a compact, high-level view of an ontology, which is typically large and complex in nature. Such a view lends support to ontology orientation, comprehension, and quality-assurance efforts. A methodology is presented for deriving a kind of abstraction network, called a partial-area taxonomy, for the Ontology of Clinical Research (OCRe), covering the topic of studies with human subjects. OCRe was selected as a representative of ontologies implemented using the Web Ontology Language (OWL). The derivation of the partial-area taxonomy for one specific hierarchy of OCRe, namely, the Entity hierarchy, is described. Utilizing the enhanced visualization of the content and structure of the hierarchy provided by the taxonomy, the Entity hierarchy is audited, and several errors and inconsistencies in OCRe’s modeling of its domain are exposed. After appropriate corrections are made to OCRe, a new partial-area taxonomy is derived. The generalizability of the paradigm of the derivation methodology to various families of biomedical ontologies is discussed.


Dissimilarities in the Logical Modeling of Apparently Similar Concepts in SNOMED CT

American Medical Informatics Association

Concepts whose terms are of a similar word structure are expected to have similar logical representations. Anecdotal examples from SNOMED CT indicate that this may not always be the case. An investigation into the extent of inconsistent modeling in SNOMED CT hierarchies is carried out. A lexical methodology is used to identify sets of similar concepts. It is applied to one of the most attribute-rich hierarchies, Procedure, from which a random sample of 60 sets is derived. These sets are examined in regard to hierarchical, definitional, attribute, attribute/value, and role-group aspects. Thirty percent of the sample sets were found to have at least one type of modeling inconsistency. Their presence may interfere with the performance of terminology-driven applications. With the use of SNOMED expanding, such inconsistencies may eventually affect clinical care. Due to this, external auditing should be encouraged to identify such issues and complement IHTSDO’s efforts.



Impact of Climate Change on Coastal Water Quality - A Remote Sensing Approach

Ocean color data acquired by current satellite sensors (SeaWiFS and MODIS), with their synoptic coverage, can give information about the oceans and coastal zones, which play an important role in the exchange of carbon dioxide between ocean and atmosphere. Increases in phytoplankton composition and abundance due to global climate change can contribute to important changes in water quality conditions. This is considered to be a major pathway of carbon cycling in the ocean and thus essential to global change studies. Since phytoplankton depend upon specific conditions for growth, they frequently become the first indicator of a change in their environment. Phytoplankton, as revealed by ocean color, frequently show scientists where ocean currents provide nutrients for plant growth and where subtle changes in the climate (warmer or colder, more saline or less saline) affect phytoplankton growth. With increasingly sophisticated sensors, better data, and improved algorithms, water quality parameters such as phytoplankton can be accurately determined using ocean color data.
The intent is to test the utility of multi-sensor/multi-temporal data in visualization and future integration with bio-optical/statistical methods to enhance our understanding of ecosystem responses and estimation of water quality conditions to manage eutrophication. The long-term goal is to ultimately link global-scale processes with local environmental and resource problems to assess the impacts of climate change on freshwater/estuarine systems in NY/NJ metropolitan areas.

Remote Sensing Application in Optically Complex Estuary -- Hudson/Raritan Estuary

The presence of several non-correlated constituents makes estuarine (Case 2) waters optically more complex than most oceanic (Case 1) waters. The color of Case 1 waters, quantified by the subsurface reflectance, is a function of absorption and scattering by algal pigments, algal detritus, and water itself. In Case 2 waters, the combined effects of particulate backscattering and high absorption introduce complex interacting relations between the water constituents and subsurface reflectance. Therefore, retrieval of water constituent concentrations from the remote sensing data requires an analytical approach.
A bio-optical model was used to predict the subsurface reflectance (Gordon et al. 1975). This model was validated with the measured reflectance spectra (OL754) and the atmospherically corrected AVIRIS data for retrieval of water constituent (i.e., CHL-a) concentrations. Modeling of reflectance is a prerequisite for processing remote sensing data into the desired thematic maps depicting spatial distributions of algal blooms and suspended solids. These maps are key input parameters into the Geographic Information System (GIS) for monitoring and management of water quality conditions, and they provide a baseline on the characteristics of algal blooms important in global change studies.
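As a rough illustration of the kind of model cited above: the Gordon et al. (1975) relationship is commonly summarised as a power series in X = b_b / (a + b_b), where a is the total absorption coefficient and b_b the total backscattering coefficient. The sketch below assumes that form. The function name `subsurface_reflectance` is hypothetical, the series coefficients are left to the caller, and the numbers in the usage lines are placeholders, not values from the paper or from this project.

```python
def subsurface_reflectance(absorption, backscatter, coeffs):
    """Evaluate R = sum(coeffs[n] * X**n) with X = b_b / (a + b_b).

    absorption  -- total absorption coefficient a (1/m)
    backscatter -- total backscattering coefficient b_b (1/m)
    coeffs      -- series coefficients, lowest order first (caller-supplied)
    """
    x = backscatter / (absorption + backscatter)
    return sum(c * x ** n for n, c in enumerate(coeffs))


# Placeholder inputs: a first-order series with an assumed coefficient.
placeholder_coeffs = (0.0, 0.33)
r_example = subsurface_reflectance(0.5, 0.01, placeholder_coeffs)
```

In practice the function would be evaluated per wavelength, with a and b_b each built up from the contributions of water, pigments, detritus, and suspended solids.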


Finds you a date based on your interests; programmed in Java with an Oracle database.


MatchFind is a software tool which uses the BlockMatch tool to align and compare an RNA block with those in the database. It then displays the most closely matching RNA block names along with the match scores.

The user input to MatchFind can be given in two ways:
1. Select one of the input query blocks from the given drop-down menu.
2. Input an RNA block strictly in the Stockholm format in the given textbox.
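For the second input option, a minimal check that pasted text follows the basic Stockholm layout might look like the sketch below. It is not MatchFind's actual input handling, which the description does not show; the function name `parse_stockholm` is hypothetical, and only the core rules of the format are covered (a `# STOCKHOLM` header line, `name sequence` lines, `#=` markup lines, and a closing `//`).

```python
def parse_stockholm(text):
    """Return {sequence_name: aligned_sequence} from a Stockholm block.

    Raises ValueError if the header or terminator is missing, or if a
    sequence line does not contain both a name and a sequence.
    """
    lines = [ln.rstrip() for ln in text.strip().splitlines()]
    if not lines or not lines[0].startswith("# STOCKHOLM"):
        raise ValueError("missing '# STOCKHOLM' header line")
    if lines[-1] != "//":
        raise ValueError("missing '//' terminator")
    sequences = {}
    for ln in lines[1:-1]:
        if not ln or ln.startswith("#"):
            continue  # skip blank lines and #=GF / #=GS / #=GR / #=GC markup
        name, seq = ln.split(None, 1)
        # Interleaved alignments repeat names; concatenate their pieces.
        sequences[name] = sequences.get(name, "") + seq.replace(" ", "")
    return sequences


# Made-up example block.
example_block = """# STOCKHOLM 1.0
seq1  ACGU..GG
seq2  AC-UGGGG
#=GC SS_cons  <<....>>
//
"""
example_sequences = parse_stockholm(example_block)
```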


ABook was an address book application created using Java Swing with Oracle as the back end.


  • Data Mining
  • Programming
  • Java
  • C++
  • Python
  • Matlab
  • Databases
  • Ontologies
  • Data Analysis
  • Machine Learning
  • Teaching
  • Research
  • SQL
  • MySQL
  • Text Mining
  • Leadership
  • Software Engineering
  • Perl
  • Oracle
  • Linux
  • Algorithms
  • Simulations
  • Conceptual Modeling
  • Auditing
  • Lexical Semantics
  • Medical Terminology
  • Knowledge Representation
  • OOP
  • C
  • Computer Science

Volunteer Experience & Causes

Opportunities Ankur is looking for:

  • Joining a nonprofit board
  • Skills-based volunteering (pro bono consulting)

Causes Ankur cares about:

  • Education
  • Environment
  • Health
  • Human Rights
  • Poverty Alleviation
  • Science and Technology
  • Social Services
