Christian Engelmann

Research and Development Staff Member at Oak Ridge National Laboratory

Knoxville, Tennessee Area

Education
  • The University of Reading
  • The University of Reading
  • Fachhochschule für Technik und Wirtschaft Berlin
Connections
105 connections
Industry
Research

Christian Engelmann’s Summary

Dr. Christian Engelmann has 8+ years of experience in software research and development for next-generation extreme-scale high-performance computing (HPC) systems, with a strong research funding and publication record. In collaboration with other U.S. Department of Energy laboratories and universities worldwide, his research addresses computer science challenges in HPC system software, such as dependability, scalability, and portability.

Christian Engelmann’s Specialties:

principal investigator for federally funded software research and development; MSc thesis advisor; small team lead; project- and milestone-oriented teamwork; publishing research and development results


Christian Engelmann’s Experience

  • Research and Development Staff

    Oak Ridge National Laboratory

    (Government Agency; Research industry)

    September 2009 - Present (3 months)

    2009-...: Checkpoint storage virtualization to improve efficiency by aggregating a variety of resources, such as memory and flash, and software dual-modular redundancy (DMR) to eliminate rollback/recovery in HPC

    2008-...: Light-weight simulation of future HPC architectures (~10,000,000 cores) to evaluate scalability/fault tolerance of key science algorithms [U.S. DOE Institute for Advanced Architecture and Algorithms]

    2008-...: HPC system software resiliency solutions, including health monitoring, reliability analysis, fault prediction, proactive fault tolerance, reactive fault tolerance enhancements, and holistic fault tolerance

  • Research and Development Associate

    Oak Ridge National Laboratory

    (Government Agency; Research industry)

    May 2004 - August 2009 (5 years 4 months)

    2008-...: Light-weight simulation of future HPC architectures (~10,000,000 cores) to evaluate scalability/fault tolerance of key science algorithms [U.S. DOE Institute for Advanced Architecture and Algorithms]

    2008-...: HPC system software resiliency solutions, including health monitoring, reliability analysis, fault prediction, proactive fault tolerance, reactive fault tolerance enhancements, and holistic fault tolerance

    2006-09: Enhancing productivity for scientific application development, deployment and execution with the Harness Workbench Toolkit offering a common view across diverse HPC hardware and software platforms

    2006-08: Virtual system environments for "plug-and-play" supercomputing through desktop-to-cluster-to-petaflop computer system-level virtualization based on recent advances in hypervisor technologies

    2004-07: HPC reliability, availability and serviceability solutions, such as scalable membership management for MPI high availability and asymmetric active/standby (n+m) replication for head and service nodes

    2004-06: High availability for services running on HPC head and service nodes, such as Torque and PVFS MDS, using symmetric active/active (state-machine) replication with 99.9997% service uptime

    2000-05: Pluggable lightweight heterogeneous distributed virtual machine (PVM successor) with an adaptive reconfigurable runtime environment, parallel plug-in paradigms, high availability, and fault-tolerant MPI

  • Post Master's Research Associate

    Oak Ridge National Laboratory

    (Government Agency; Research industry)

    June 2001 - April 2004 (2 years 11 months)

    2002-04: Light-weight simulation of future HPC architectures (~1,000,000 processors) to evaluate scalability/fault tolerance of a new generation of super-scalable, naturally fault-tolerant scientific algorithms [IBM CRADA]

    2000-05: Pluggable lightweight heterogeneous distributed virtual machine (PVM successor) with an adaptive reconfigurable runtime environment, parallel plug-in paradigms, high availability, and fault-tolerant MPI

  • Software Developer

    Oak Ridge National Laboratory

    (Government Agency; Research industry)

    August 2000 - January 2001 (6 months)

    2000-05: Pluggable lightweight heterogeneous distributed virtual machine (PVM successor) with an adaptive reconfigurable runtime environment, parallel plug-in paradigms, high availability, and fault-tolerant MPI

  • Software Developer

    Hewlett-Packard Germany

    (Public Company; HPQ; Computer Hardware industry)

    October 1998 - September 1999 (1 year)

    1998-99: Object-oriented graphical user interface prototype system service using the model-view-controller software architecture for an embedded mobile patient monitoring system


Christian Engelmann’s Education

  • The University of Reading

    PhD, Computer Science, 2004 - 2008

    Thesis title: "Symmetric Active/Active High Availability for High-Performance Computing System Services". Thesis research performed at Oak Ridge National Laboratory. Advisor: Prof. Vassil N. Alexandrov (University of Reading)

  • The University of Reading

    MSc, Computer Science, 2000 - 2001

    Thesis title: "Distributed Peer-to-Peer Control for Harness". Thesis research performed at Oak Ridge National Laboratory. Double diploma in conjunction with the Department of Engineering I, Technical College for Engineering and Economics (FHTW) Berlin, Germany. Advisors: Prof. V. N. Alexandrov (University of Reading); George A. (Al) Geist (Oak Ridge National Laboratory).

  • Fachhochschule für Technik und Wirtschaft Berlin

    Dipl.-Ing. (FH), Computer Systems Engineering, 1996 - 2001

    Thesis title: "Distributed Peer-to-Peer Control for Harness". Thesis research performed at Oak Ridge National Laboratory. Double diploma in conjunction with the Department of Computer Science, University of Reading, UK. Advisors: Prof. U. Metzler (Technical College for Engineering and Economics (FHTW) Berlin); George A. (Al) Geist (Oak Ridge National Laboratory).


Additional Information

Christian Engelmann’s Interests:

skiing, travel

Christian Engelmann’s Groups:

ACM, ACM SIGOPS, IEEE, IEEE CS, IEEE CS TCSC/TCPP/TCDP/TCFT

  • The Official IEEE Group
  • The Official Association for Computing Machinery (ACM) Group
  • ACM Members
  • IEEE Computer Society Members
  • High Performance Computing (HPC).
  • High Performance & Super Computing
  • National Laboratory and R&D Institutes Professionals
  • Oak Ridge National Laboratory
  • HPC Professionals
  • University of Reading Alumni
  • HTW Berlin Alumni
  • SC09: SuperComputing 2009

Christian Engelmann’s Contact Settings

Interested In:

  • career opportunities
  • job inquiries
  • expertise requests
  • reference requests
  • getting back in touch
