Siddharth Anand

Chief Architect at ClipMine, Inc.

San Francisco Bay Area
Current: ClipMine, Inc., Niara Inc., BrightFunnel
Previous: LinkedIn, Netflix, Etsy
Education: Cornell University
Recommendations: 16 people have recommended Siddharth


I'm a hands-on software architect with deep experience building and scaling data infrastructure at high-traffic web sites.

I'm currently the Chief Architect at ClipMine, a video mining and search company. Prior to joining ClipMine, I held technical and leadership positions at LinkedIn, Netflix, Etsy, eBay, and Siebel Systems.

Outside of work, I help venture-backed startups build scalable systems.

I also enjoy speaking at a few conferences every year, including OSCON, GigaOM Structure, Hadoop Summit, and QCon (SF/London/NYC).


Chief Architect

ClipMine, Inc.
– Present (2 months), San Francisco Bay Area

Building a video mining, search, and watch-experience startup!

Technical Advisor

Niara Inc.
– Present (9 months), San Francisco Bay Area

Technical Advisor

– Present (1 year 7 months), San Francisco Bay Area


– Present (1 year 10 months), San Francisco Bay Area

Member of the Program Committee for QCon SF

Technical Lead (Search Infrastructure)

(1 year 7 months)

I build search systems at scale at LinkedIn
• Helped create LinkedIn's new search infrastructure (Galene)
• Led the search Indexing team, responsible for building offline (Hadoop) and near-realtime (Databus, Samza, Kafka) indexes
• Developed search typeahead (autocomplete) services. Typeahead comes in 2 flavors: Graph typeahead (e.g. members) and regular typeahead (e.g. companies, groups, schools, skills)

Technical Lead (Analytics Platform)

(1 year 4 months), Mountain View, CA

The Analytics Platform (AP) team at LinkedIn is responsible for making online and offline data available for analysis. The data typically originates from OLTP databases (e.g. RDBMS, NoSQL), application event streams (e.g. click tracking), and machine learning algorithms. AP makes this data available for Hive queries, Pig jobs, and plain MapReduce jobs, as well as for ETL engineers to load into a traditional data warehouse.
• Led LISTT (LinkedIn Segmentation & Targeting Tool), a self-service tool that marketing operations can use to target LinkedIn members for online marketing needs. It leverages both Hive and Pig to materialize a large table in Hadoop. This table is then converted into multiple formats: a Teradata load-ready format and Lucene indexes for a custom search application.
• Presented LISTT at Hadoop Summit 2013

Cloud Database Architect

(3 years), San Francisco Bay Area

In January 2009, the Cloud Systems team was formed to pioneer Netflix’s migration to a new cloud-based architecture, thereby extending availability and scalability. By EOY 2010, 90% of our traffic was served out of AWS.
• Part of a 5-person team to design and build Netflix’s cloud-based architecture
• Responsible for Netflix’s Cloud-based Database Architecture (e.g. Cassandra, SimpleDB, S3)
• Worked closely with AWS to align their product roadmap with our needs
• Evangelized NoSQL, Data Replication, & Cloud Best Practices internally and externally
• Authored a white paper titled “Netflix’s Transition to High-Availability Storage Systems”
• Acted as a Netflix Crisis Manager, periodically leading the resolution of critical company-wide production emergencies
• Created a multi-master, hybrid Oracle-NoSQL system to manage our largest data sets. My replication framework handled billions of records and copied incremental changes with latencies averaging 5 seconds (patent: United States 2011/103537)

Software Architect

(1 year 6 months), San Francisco Bay Area

As a member of the Software Infrastructure team, I helped define, design, and implement new architectures – all Netflix systems run on our architecture. I also solved critical performance problems, resulting in cost reduction and service improvement.

• Led the identification and resolution of various Denial-of-Service exploits. Evangelized DoS prevention.
• Found and eliminated a critical performance bug in the streaming PC player. The fix cut DB traffic to 2 key tables by 50%, reducing our need to vertically scale the database
• Increased farm-wide memory headroom by 10% by eliminating the use of a Java Finalizer
• Created Netflix’s Session Manager to deliver consistently fast user response times under high traffic
• Created Netflix’s web request processing framework to improve developer productivity and code robustness
• Created a deadlock recovery system to detect production deadlocks early and take preemptive action before end users could be affected
• Invented 2 internal performance optimization tools (i.e. Tracer Central & Tracer Regression Central) – Netflix Engineering relies on these tools to understand traffic growth and code/site performance

VP of Engineering

(4 months), San Francisco Bay Area

• Managed a direct budget of $1.5M and built an engineering team of 16 engineers to create the site
• Designed and wrote a real-time application logging and analytic application

Senior Software Engineer (Search Engine)

(9 months), San Francisco Bay Area

As a member of eBay’s Search Engine team, I was tasked with building new search services.
• With help from a co-worker, I built eBay’s first search engine for buyer behavior, central to eBay’s default search sort (a.k.a. the Best Match sorting algorithm)
• Implemented a search service to find (fuzzy) near matches by user id, first name, last name, or full name

Senior Software Engineer (Research Labs)

(8 months), San Francisco Bay Area

As a member of eBay Research Labs, I had the opportunity to work on various early-stage prototypes:
• Designed and implemented a novel P2P version of eBay (Confidential)
• Worked with a colleague to build a WYSIWYG eBay Store builder leveraging browser-side technologies

Senior Software Engineer (eBay Stores)

(2 years 2 months), San Francisco Bay Area

My career at eBay started as a founding member of the Stores & Merchandising team. During my tenure with this team, I led projects touching functionality across all areas of the site.
• Led projects that touched Search, MyeBay, Selling, Buying, Sign-on, API, etc.
• Proposed & led a project to re-architect the eBay subscriptions framework
• Submitted 5 innovation ideas centered around uses of AJAX on the site – all 5 ideas adopted

Software Engineer (Platform Scalability)

Siebel Systems
(1 year 7 months), San Francisco Bay Area

• Analyzed system scalability and performance bottlenecks
• Solved critical customer performance issues during a 2-month loan to Siebel Expert Services



United States 2011/103537
Filed February 21, 2011

Methods, systems, and articles for simultaneously maintaining copies of data in a data center and a cloud computing environment providing network based services. Synchronizing applications monitor modifications to data records made in the data center and the cloud computing environment. The synchronizing applications are also configured to convert modified records from the data center into a format compatible with databases in the cloud computing environment prior to updating the databases in the cloud computing environment, and vice versa.


System and method for building a point-in-time snapshot of an eventually-consistent data store

United States US20130218840 A1
Filed February 17, 2012

A method and system for building a point-in-time snapshot of an eventually-consistent data store. The data store includes key-value pairs stored on a plurality of storage nodes. In one embodiment, the data store is implemented as an Apache® Cassandra database running in the “cloud.” The data store includes a journaling mechanism that stores journals (i.e., inconsistent snapshots) of the data store on each node at various intervals. In Cassandra, these snapshots are sorted string tables that may be copied to a back-up storage location. A cluster of processing nodes may retrieve and resolve the inconsistent snapshots to generate a point-in-time snapshot of the data store corresponding to a lagging consistency point. In addition, the point-in-time snapshot may be updated as any new inconsistent snapshots are generated by the data store such that the lagging consistency point associated with the updated point-in-time snapshot is more recent.



  • Distributed Systems
  • Scalability
  • High Performance...
  • Cloud Computing
  • Hadoop
  • Analytics
  • Apache Pig
  • Cassandra
  • Lucene
  • AWS
  • Java
  • Web Development
  • Hive
  • NoSQL
  • Algorithms
  • Data Warehousing
  • Public Speaking
  • Machine Learning
  • Python
  • Tomcat
  • Amazon Web Services...
  • Architecture
  • REST
  • System Architecture
  • Amazon EC2
  • Software Engineering
  • Agile Methodologies
  • Databases
  • Integration
  • Big Data
  • Git
  • Maven
  • Perl
  • Ruby
  • AJAX
  • MapReduce
  • JUnit
  • Spring
  • Open Source
  • Software Development


Cornell University

M.S., Computer Science

GPA = 3.8

Cornell University

B.S., Materials Science & Engineering

CS/EE GPA = 3.8

The American School of Paris

I.B., International Baccalaureate (H.S. degree)

Activities and Societies: Math Team, Varsity Soccer Team


  1. French

  2. English


Cornell University

  • Artificial Intelligence
  • Operating Systems
  • Intermediate Systems
  • Information Retrieval
  • Systems Security
  • Computer Networks
