Edward Capriolo

Data Architect at The Huffington Post

Location
Greater New York City Area
Industry
Computer Software

Edward Capriolo's Overview

Past
  • Programmer at Savvy Networks
  • Intern at FDTA CITE
Connections

320 connections

Edward Capriolo's Summary

Early adopter of several open source big data systems. In-depth knowledge of the Hadoop MapReduce API. One of the longest-active members of the Apache Hive project, as both a committer and a member of the Project Management Committee (PMC). Expertise in designing scalable data stores using the Apache Cassandra NoSQL database.

Two-time author on big data topics: wrote Cassandra High Performance Cookbook and co-authored Programming Hive. Has spoken on a variety of big data, NoSQL, and ad tech topics across the country (Hadoop World, Cassandra NYC, etc.).

Edward Capriolo's Experience

Data Architect

The Huffington Post

Public Company; 201-500 employees; Online Media industry

February 2014 - Present (8 months), New York, New York

All that data just can't real-time analyse itself, now can it?

Also, as always: Apache Hive Project Management Committee member, author, big data aficionado.

Software Developer

Media6Degrees

Privately Held; 51-200 employees; Internet industry

March 2010 - February 2014 (4 years)

Designed and implemented:
  • System for replication of Hadoop and Hive data between separate Hadoop clusters.
  • System for automated retention policies of Hive data.
  • Numerous batch processing and ETL processes using Hadoop, Hive, and MySQL.
  • An open source input format to query raw protobuf data from Hive.
  • An open source tool (Hadoop-filecrush) to optimize Hadoop data on disk for better query performance and NameNode memory savings.
  • Schemas for a NoSQL database (Apache Cassandra).
  • Performance tuning and ongoing management of Apache Cassandra datastores (terabytes of data, billions of requests).
  • Puppet across the company to manage all system installations.
  • Customized Graphite implementation to publish application-specific counters.
  • Countless features for the core ad-serving applications.
  • System for backup, change management, and differential-based emails for data stored in Apache ZooKeeper.

System Operations

About.com

Public Company; 51-200 employees; NYT; Internet industry

February 2008 - February 2010 (2 years 1 month)

I was the guy who sent out the emails on Bagel Thursdays to let everyone know.

Senior Network Manager

InfoDesk Inc

2005 - 2007 (2 years)

Installed Solaris 10 on machines with 400 MHz processors; had to sit servers on top of each other because rails were not in the budget. Sad, true story.

Programmer

Savvy Networks

2002 - 2004 (2 years)

Sat near a very noisy server with a broken, loud fan for over a year. The server finally died, but by then I was already quite mad, you see.

Intern

FDTA CITE

2001 - 2003 (2 years)

Worked on the web site. This was back in the day when you were not supposed to use JavaScript because it never worked well on different browsers. (JavaScript still does not work well, but now you're supposed to use it to be cool.)

Edward Capriolo's Courses

  • AS, Computer Science

    Westchester Community College

    • Computer Science

Edward Capriolo's Skills & Expertise

  1. Hadoop
  2. Awk
  3. Vim
  4. Java
  5. Hive
  6. Linux
  7. Cassandra
  8. MapReduce
  9. Apache
  10. Perl
  11. Unix
  12. JavaScript
  13. Python
  14. Big Data
  15. MySQL
  16. Distributed Systems
  17. NoSQL
  18. HTML
  19. Nagios
  20. JSP
  21. Servlets
  22. HBase
  23. Shell Scripting
  24. Solaris
  25. Puppet
  26. SQL

Edward Capriolo's Publications

  • Cassandra High Performance Cookbook

    • Packt Publishing
    • July 15, 2011
    Authors: Edward Capriolo

    This book is designed for administrators, developers, and data architects who are interested in Apache Cassandra for redundant, highly performing, and scalable data storage.

  • Programming Hive

    • O'Reilly publishing
    • August 1, 2012
    Authors: Edward Capriolo

    Hive data warehouse and query language for Hadoop. Don't struggle reading through source code, random blogs, mailing list posts, or the often incomplete Apache wiki. Just get this book and get "Big Data". (Will be published Septemberish.)

Edward Capriolo's Education

Westchester Community College

AS, Computer Science

1999 - 2001

Activities and Societies: Omega Computer Club President

Pace University

BS, Computer Science

2002

Just putting it out there that I am ~15 credits away from a BS, so in case you want to make me a CEO, you won't later fire me because LinkedIn is unclear about this.

Edward Capriolo's Additional Information

Interests:

Java, Linux, Computer Networking, Hadoop, Apache Cassandra, clojure

Honors and Awards:

CCNA

Contact Edward for:

  • getting back in touch
