Data mining, machine learning, anti-spam, bioinformatics
San Francisco Bay Area
Over ten years of experience in data mining and analysis, informatics, and software development, working for public, private, and non-profit organizations. Technical application lead of a multimillion-dollar project. Experience in machine learning, anti-spam, and distributed computing, working directly with customers and users to make their jobs easier.
Machine learning, data mining, data analysis, anti-spam, e-mail, dataflow, bioinformatics, Linux, Ruby on Rails, Perl, distributed computing, startups, workflow design
(Privately Held; 11-50 employees; Internet industry)
September 2008 — Present (11 months)
I'm building data mining and content analysis tools to help create the world's most advanced and usable local search.
(Privately Held; 11-50 employees; Internet industry)
November 2006 — September 2008 (1 year 11 months)
Developed and automated tools for content insertion, analysis, and publication.
(Privately Held; Computer Software industry)
January 2005 — November 2006 (1 year 11 months)
Designed and implemented new classifier features and improvements to data architecture.
Combined machine learning with rapid response to emerging spam campaigns to increase classifier effectiveness and accuracy.
(Privately Held; Computer Software industry)
January 2004 — January 2005 (1 year 1 month)
Performed data mining for, and updated, Proofpoint's anti-spam classifier models.
Generated new features based on emerging spam attacks.
Analyzed and captured spam messages for update process.
Worked with customers to stop emerging spam campaigns.
(Public Company; INCY; Biotechnology industry)
July 2001 — November 2002 (1 year 5 months)
Managed Unix and Linux accounts for over 100 users in a production bioinformatics, dataflow, and data analysis environment.
Administered production Unix and Linux machines.
Configured Platform LSF distributed computing cluster of over 1,000 machines. Automated cluster status reporting.
(Public Company; INCY; Biotechnology industry)
June 1999 — July 2001 (2 years 2 months)
Technical application lead on Linux farm project, which was both used in-house and sold to outside customers. Co-designed, wrote, documented, and maintained job distribution software, as well as a parallel application launcher, for several clusters of Pentium computers, controlling over 2,000 CPUs in total.
The project saved Incyte several million dollars in hardware costs by moving the majority of data processing from expensive servers to much cheaper PCs.
(Public Company; INCY; Biotechnology industry)
November 1997 — June 1999 (1 year 8 months)
Ran dataflow for Incyte's plant database, defining schedules between sequencing, product science, and legal departments. Screened and annotated several hundred thousand sequences, analyzing and preparing them for release, meeting an aggressive schedule.
Wrote wrapper scripts to automate dataflow.
Benchmarked different hardware platforms to evaluate cost/performance. This work led to the Linux farm project.
(Non-Profit; Biotechnology industry)
June 1994 — August 1997 (3 years 3 months)
Annotated and curated Genome Sequence DataBase (GSDB), a public DNA sequence database. GSDB was a major project of the non-profit National Center for Genome Resources (NCGR).
Served as primary contact for four large genome centers, processing large and multi-sequence submissions into the relational database.
Tested and evaluated GSDB Annotator software.
(Government Agency; Research industry)
August 1993 — June 1994 (11 months)
Trained offsite users in DNA sequence file format, answering detailed questions about data insertion.
Assured data quality by checking submissions for correct syntax and accurate biological information.
B.A., Liberal Arts, 1991 — 1993
Liberal Arts, 1989 — 1991
programming, scripting, Ruby on Rails, Ruby, Perl, Linux, distributed computing, events, search, startups, bioinformatics, genome, genomics, machine learning, anti-spam, GTD, environment, sustainability, ecological footprint, renewable energy, bicycling, ethics, free speech, free culture