
Data Scientist @ LinkedIn
San Francisco Bay Area

LinkedIn Analytics - Building data-driven products (recommender systems, job matching, fraud detection, etc.)
Large-scale applied text mining and data mining, including domain-specific applications, multi-lingual information access and organization, evaluation design, statistical methods for text processing, text classification, medical applications, applied machine learning, statistical natural language processing, information retrieval, event detection and tracking, machine translation, personal finance, tomatoes
(Privately Held; Internet industry)
May 2008 — Present (1 year 7 months)
LinkedIn Analytics - Building data-driven products (recommender systems, job matching, fraud detection, etc.)
(Educational Institution; Research industry)
September 2000 — May 2008 (7 years 9 months)
Advisors: Yiming Yang and Jaime Carbonell. Examined feature selection for text classification and cross-lingual information retrieval using parallel corpora, including full system implementation. Participated in CLEF 2001 and 2003. Supervised two undergraduates working on web mining for CLIR resources. ♦ Supervised an MS student working on Named Entities for Topic-Conditioned Event Tracking. ♦ Thesis work on domain adaptation of translation models, as applied to Cross-Language Information Retrieval, Topic-Conditioned Event Tracking and Machine Translation. ♦ Crucial role in the CMU GALE Distillation group: proposal writing with advisors, end-to-end system design and specification, multi-site coordination (CMU, UPitt, IBM), user study coordination and design with UPitt, evaluation dataset design.
(Information Technology and Services industry)
2006 — 2007 (1 year)
Sub-contracted projects for CMU GALE Distillation group
(Public Company; Information Technology and Services industry)
May 2005 — August 2005 (4 months)
(Public Company; Information Technology and Services industry)
May 2003 — August 2003 (4 months)
Supervisor: Scott McCarley. Unsupervised learning for Arabic stemming. Work published: ACL'03.
(Public Company; Information Technology and Services industry)
May 2000 — August 2000 (4 months)
Supervisor: Marilyn Walker. Worked towards improving the DARPA Communicator by treating sentence planning as a stochastic learning problem. Work published: NAACL'01, ACL'01.
(Higher Education industry)
August 1997 — December 1999 (2 years 5 months)
Teaching Assistant, UNM CS Department (1998)
Supervisor: Barak Pearlmutter. Responsible for holding a weekly recitation section for a Non-Imperative programming (Scheme) class. Duties included grading, lecturing, holding office hours and assisting with exam questions. Developed an automatic grader program to evaluate student projects and e-mail students a progress report.
Software Developer, UNM CS Department (1998)
Led a two-person team that developed a web-based grade database (SQL, C++) for several courses in the Computer Science Department; it was later used by the three largest classes in the department.
Recitation Instructor, UNM CS Department (1997)
Held a weekly recitation section for an introductory C++-based Computer Science class.
(Information Technology and Services industry)
1996 — 1999 (3 years)
Developed small logic design projects, including a software simulation of a packetizing chip and a multiplexer usage optimization for multiple-output Boolean functions; disassembled and redesigned a microcontroller test bench.
Responsible for evaluating student assignments for a logic design class.
Ph.D., Computer Science, 2003 — 2008
Thesis:
Domain Adaptation of Translation Models for Multilingual Applications
Supervisors: Yiming Yang and Jaime Carbonell.
Examined feature selection for text classification and cross-lingual information retrieval using parallel corpora, including full system implementation. Participated in CLEF 2001 and 2003. Supervised two undergraduates working on web mining for CLIR resources. ♦ Supervised an MS student working on Named Entities for Topic-Conditioned Event Tracking. ♦ Thesis work on domain adaptation of translation models, as applied to Cross-Language Information Retrieval, Topic-Conditioned Event Tracking and Machine Translation. ♦ Crucial role in the CMU GALE Distillation group: proposal writing with advisors, end-to-end system design and specification, multi-site coordination (CMU, UPitt, IBM), user study coordination and design with UPitt, evaluation dataset design.
MS, Computer Science, 2000 — 2003
Teaching Assistant, CMU CS Department (2002)
Supervisors: Daniel Sleator and William Scherlis. Responsible for holding a weekly recitation section for a sophomore-level class in Data Structures and Algorithms (using Java). Duties included grading, lecturing (class of 150+ students), designing exam questions, and designing and testing assignments.
BS, Computer Science, 1996 — 2000
First in my graduating class.
Research Assistant, UNM CS Department (1999-2000)
Supervisor: George Luger. Designed a disambiguation program for natural language, using a context-sensitive deformable semantic network (in retrospect, a very naïve rediscovery of language modeling).
Supervisor: Barak Pearlmutter. Examined hidden unit modulation by attentional focus in neural networks.
Teaching Assistant, UNM CS Department (1998)
Supervisor: Barak Pearlmutter. Responsible for holding a weekly recitation section for a Non-Imperative programming (Scheme) class. Duties included grading, lecturing, holding office hours and assisting with exam questions. Developed an automatic grader program to evaluate student projects and e-mail students a progress report.
Computer Science
Large-scale applied text mining and data mining, including domain-specific applications, multi-lingual information access and organization, evaluation design, statistical methods for text processing, text classification, finance and medical applications, applied machine learning, statistical natural language processing, information retrieval, event detection and tracking, machine translation.
CMU, SIGIR
Outstanding Junior of the Year 97-98
Outstanding Senior of the Year 98-99
CRA Outstanding Undergraduate 2000 Honorable Mention
UNM university-wide commencement speaker, Dec. '99
Regional ACM programming contest: 1st place in 1997, 2nd place in 1998
Microelectronics Research Center scholarship (all expenses, 4 years)