Independent Technology Consultant
- Greater Detroit Area
- Computer Software
Fernando Farfan's Overview
Fernando Farfan's Summary
I am currently a Software Engineer at Compendia Bioscience, a company devoted to helping researchers find a cure for cancer. At Compendia, we follow an agile methodology to develop software tools that help internal and external customers in this mission.
I was a postdoctoral Research Associate at the University of Michigan, where I applied my data management knowledge and experience to the integration of known genetic information, as found in databases, to perform a more effective analysis of gene expression data from microarray experiments. As a result of this research, we developed THINK-Back, a set of tools for knowledge-based interpretation of high-throughput biological data.
I received my PhD degree in Computer Science from Florida International University. My areas of focus for my doctoral dissertation were Databases, Information Retrieval and Storage Systems, with particular focus on XML and other semistructured data formats. I have published several papers in the areas of Databases and Information Retrieval, in top-ranked conferences and journals such as IEEE ICDE, DEXA and Elsevier's Information Systems.
Before joining FIU, I worked at Banrural, a financial institution in Guatemala. I led a team of software developers in charge of designing, implementing and maintaining desktop and web applications for the bank's internal users and for its customers. One of my greatest achievements was leading the team that developed the online banking portal, along with a robust tax collection solution that used XML messaging and web services to work online with the national revenue service. Our bank was the first in the country to offer this service, even though our development started several months after that of other banks with larger IT resources (mainly human resources).
Fernando Farfan's Languages
English (Native or bilingual proficiency)
Spanish (Native or bilingual proficiency)
Fernando Farfan's Skills & Expertise
Fernando Farfan's Publications
- BMC Bioinformatics 2012, 13(Suppl 2):S4
- March 13, 2012
Results of high throughput experiments can be challenging to interpret. Current approaches have relied on bulk processing the set of expression levels, in conjunction with easily obtained external evidence, such as co-occurrence. While such techniques can be used to reason probabilistically, they are not designed to shed light on what any individual gene, or a network of genes acting together, may be doing. Our belief is that today we have the information extraction ability and the computational power to perform more sophisticated analyses that consider the individual situation of each gene. The use of such techniques should lead to qualitatively superior results.
The specific aim of this project is to develop computational techniques to generate a small number of biologically meaningful hypotheses based on observed results from high throughput microarray experiments, gene sequences, and next-generation sequences. Through the use of relevant known biomedical knowledge, as represented in published literature and public databases, we can generate meaningful hypotheses that will aid biologists in interpreting their experimental data.
We are currently developing novel approaches that exploit the rich information encapsulated in biological pathway graphs. Our methods perform a thorough and rigorous analysis of biological pathways, using complex factors such as the topology of the pathway graph and the frequency with which genes appear on different pathways. Compared to existing methods that consider only partial information captured by biological pathways, they provide more meaningful hypotheses to describe the biological phenomena captured by high throughput experiments.
Authors: Fernando Farfan, Vagelis Hristidis (aka Evangelos Christidis), Eduardo Ruiz, Alejandro Hernandez, Ramakrishna Varadarajan
- 3rd International Workshop on Patent Information Retrieval PaIR 2010, at ACM CIKM 2010.
There is an abundance of systems today to search for relevant patents, ranging from free ones like Google Patents (google.com/patents) to subscription ones like Delphion (delphion.com). After studying many existing systems, we found that they all apply general-purpose Information Retrieval (IR) techniques to rank patents. We argue that the quality of search can be significantly improved by exploiting the domain semantics: e.g., patents are organized into classes and subclasses, and have links to external publications and to other patents. Also, the text of a patent is organized into various sections and uses specific legal wording.
We present the PatentsSearcher system, available at PatentsSearcher.com, whose key contribution is to leverage the domain semantics to improve the quality of discovery and ranking. PatentsSearcher also offers other novel functionalities to help users locate and navigate relevant and important patents or applications.
Authors: Fernando Farfan, Vagelis Hristidis (aka Evangelos Christidis), Anand Ranganathan, Michael Weiner
- Proceedings of the IEEE International Conference on Data Engineering, ICDE 2009.
As the use of Electronic Medical Records (EMRs) becomes more widespread, so does the need for effective information discovery within them. Recently proposed EMR standards are XML-based. A key characteristic in these standards is the frequent use of ontological references, i.e., ontological concept codes appear as XML elements and are used to associate portions of the EMR document with concepts defined in a domain ontology.
A rich corpus of work addresses searching XML documents. Unfortunately, these works do not make use of ontological references to enhance search. In this paper we present the XOntoRank system which addresses the problem of ontology-aware keyword search of XML documents with a particular focus on EMR XML documents. Our current prototypes and experiments use the Health Level Seven (HL7) Clinical Document Architecture (CDA) Release 2.0 standard of EMR representation and the Systematized Nomenclature of Human and Veterinary Medicine (SNOMED) ontology, although the presented techniques and results are applicable to any EMR hierarchical format and any ontology that defines concepts and relationships.
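To illustrate what an ontological reference can add to keyword search, here is a minimal Python sketch. It is not the XOntoRank algorithm and performs no ranking: the XML fragment, the element names, the concept codes (C1, C2, C3) and the toy ontology are all invented for illustration and are not HL7 CDA or SNOMED content. It only shows how a keyword that never appears in a section's text can still match through the concept code attached to that section.

```python
import xml.etree.ElementTree as ET

# Hypothetical EMR-like fragment; element names and codes are invented.
DOC = """
<record>
  <section>
    <code code="C2" displayName="asthma"/>
    <text>Patient reports wheezing at night.</text>
  </section>
  <section>
    <code code="C3" displayName="fracture"/>
    <text>Cast applied to left wrist.</text>
  </section>
</record>
"""

# Toy ontology (an assumption, not SNOMED): code -> (label, parent code).
ONTOLOGY = {
    "C1": ("respiratory disease", None),
    "C2": ("asthma", "C1"),
    "C3": ("fracture", None),
}


def concept_labels(code):
    """Return the labels of a concept and of all its ancestors."""
    labels = []
    while code is not None:
        label, code = ONTOLOGY[code]
        labels.append(label)
    return labels


def search(root, keyword):
    """Return sections that match the keyword in their text or via the ontology."""
    keyword = keyword.lower()
    hits = []
    for section in root.iter("section"):
        text = (section.findtext("text") or "").lower()
        codes = [c.get("code") for c in section.iter("code")]
        labels = [label for code in codes for label in concept_labels(code)]
        if keyword in text or any(keyword in label for label in labels):
            hits.append(section)
    return hits


root = ET.fromstring(DOC)
# "respiratory" appears nowhere in the document text, but the first
# section references C2 (asthma), whose parent is "respiratory disease".
respiratory_hits = search(root, "respiratory")
```

A plain text search for "respiratory" would return nothing here; following the concept code into the ontology is what surfaces the asthma section.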
- Information Systems
- January 16, 2013
Systems that produce ranked lists of results are abundant. For instance, Web search engines return ranked lists of Web pages. There has been work on distance measures for list permutations, like Kendall tau and Spearman's Footrule, as well as extensions to handle top-k lists, which are more common in practice. In addition to ranking whole objects, there is an increasing number of systems that provide keyword search on XML or other semistructured data, and produce ranked lists of XML sub-trees. Unfortunately, previous distance measures are not suitable for ranked lists of sub-trees since they do not account for the possible overlap between the returned sub-trees. That is, two sub-trees differing by a single node would be considered separate objects. In this paper, we present the first distance measures for ranked lists of sub-trees, and show under what conditions these measures are metrics. Furthermore, we present algorithms to efficiently compute these distance measures. Finally, we evaluate and compare the proposed measures on real data using three popular XML keyword proximity search systems.
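For readers unfamiliar with the two classical measures named above, the following Python sketch computes the Kendall tau distance and Spearman's Footrule for two rankings of the same items. These are the standard permutation measures, not the sub-tree measures proposed in the paper; the item names, node identifiers and function names are assumptions chosen for illustration. The last lines show the overlap problem: two sub-trees that differ by one node compare as unrelated objects even though they share most of their nodes.

```python
from itertools import combinations


def kendall_tau_distance(a, b):
    """Count the pairs of items that the two rankings order differently."""
    pos_a = {item: i for i, item in enumerate(a)}
    pos_b = {item: i for i, item in enumerate(b)}
    return sum(
        1
        for x, y in combinations(a, 2)
        if (pos_a[x] - pos_a[y]) * (pos_b[x] - pos_b[y]) < 0
    )


def spearman_footrule(a, b):
    """Sum the absolute differences between each item's two positions."""
    pos_b = {item: i for i, item in enumerate(b)}
    return sum(abs(i - pos_b[item]) for i, item in enumerate(a))


# Two rankings of the same four (hypothetical) result identifiers.
list_1 = ["p1", "p2", "p3", "p4"]
list_2 = ["p2", "p1", "p3", "p4"]

tau = kendall_tau_distance(list_1, list_2)    # one swapped pair
footrule = spearman_footrule(list_1, list_2)  # p1 and p2 each move one place

# Sub-trees modelled as sets of hypothetical node identifiers.
subtree_a = frozenset({"n1", "n2", "n3"})
subtree_b = frozenset({"n1", "n2", "n3", "n4"})
same_object = subtree_a == subtree_b  # treated as two separate objects
shared = len(subtree_a & subtree_b) / len(subtree_a | subtree_b)
```

Both functions assume the two lists are permutations of the same items; top-k lists with different members need the extensions mentioned in the abstract.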
Contact Fernando for:
- career opportunities
- consulting offers
- new ventures
- expertise requests
- reference requests
- getting back in touch