Senior Research Associate - Data Manager at University College London, Clinical Epidemiology Group
London, United Kingdom
text mining, behavioral and usage metrics, natural language processing, data warehousing and ETL, data clustering, information retrieval and indexing technology
(Educational Institution; Research industry)
April 2009 — Present (9 months)
* Data manager / data curator for the Cardiovascular disease research Linking Bespoke studies and Electronic Records (CALIBER) consortium.
* Data manager for the Optimising Management of Angina (OMA) study (http://omastudy.org.uk/)
* Design and implementation of a cluster randomized controlled trial in open access chest pain clinics across the UK.
* Clinical decision support systems.
(Privately Held; Internet industry)
May 2006 — April 2009 (3 years)
Lokku builds vertical search engines for classifieds. Our first consumer brand is Nestoria, a property vertical search engine with a presence in the United Kingdom, Germany, Spain and Italy.
• Worked directly with data providers around the world, coordinating data acquisition, validation and integration.
• Coordinated data collection, cleansing and integration with third-party technical teams.
• Developed core parts of an aggregated usage and behavior metrics analysis platform including a dedicated reporting system.
• Designed and implemented an Information Extraction (IE) system for processing raw textual data.
• Developed core parts of an Extract Transform Load (ETL) system handling over 7M items per day.
• Worked on Geographical Information Systems (GIS), geocoding, information clustering and nearness.
• Designed a click fraud detection platform.
(International Trade and Development industry)
January 2005 — May 2006 (1 year 5 months)
Consulting on several topics including but not limited to new hardware acquisition, software licensing and security.
PhD, Computer Science / Bioinformatics, 2004 — 2008
Dissertation: A novel framework for integrating a priori domain knowledge into traditional data analysis in the context of bioinformatics.
Focal interests: data clustering, cluster validation metrics, vector space model, ontologies, knowledge-driven data analysis approaches, information retrieval, biomedical databases, statistical natural language processing and the challenges posed by raw biological text.
MSc, Information Systems Engineering, 2003 — 2004
Focal interests: Web mining, intelligent link analysis.
BSc, Computer Science, 2000 — 2003
swimming pool engineering
IEEE student member