Software Engineering Research Intern (Summer 2017)


Our engineering team

The Flatiron Health engineering team builds and runs data processing pipelines, algorithmic and human-operated data curation tools, and customer-facing data analytics and visualization tools. Our systems clean, structure, and understand clinical-level patient data as we assemble the largest, most comprehensive set of oncology data intelligence. We scale pipelines to handle the world's oncology data, applying creative engineering solutions to open-ended oncology data problems. Read about our engineering culture!

Our data

Our data pipeline takes in full clinical patient records and produces normalized, high-fidelity, queryable data that is ready to be mined for medical and operational insights. We’ve built a unified data set that is unparalleled in its depth and accuracy and that lets our partners ask questions of the data that could not be answered before.

Our culture

We work in a fast-paced, high-information-volume environment with complex domain challenges, which means context is key and there's always more to learn and soak up. We want to know oncology data cold — better than anyone else in the industry — and we believe doing that requires building a culture where we all like coming to work each day. In our culture, decisions are transparent and data-driven, and people are empowered to make waves.

Are you interested in changing the way the oncology world thinks about data?


As a Flatiron Software Engineering Research Intern, you will:

  • Explore algorithmic and statistical techniques using our data to improve products and to scale more quickly
  • Run experiments and implement prototypes that will inform key product development decisions
  • Work cross-functionally to incorporate ideas and expertise from our engineering, statistics, and clinical teams


About you:

  • You are in a computer science, medical informatics, statistics, or related degree program
  • You have advanced coursework, research, or industry experience in machine learning, natural language processing, or medical informatics
  • You understand experimental design and can build experiments that support the collection, measurement, and interpretation of results
  • You value impact, are results-oriented, and care about the details
  • You are comfortable building and working with data pipelines in languages such as Python, R, Java, C/C++, or Scala
  • You are enthusiastic about working on a multi-disciplinary team
  • You are passionate about our mission to improve healthcare through data and software


Bonus points:

  • You are in a Ph.D. program focusing on machine learning, natural language processing, or medical informatics
  • You have experience working with clinical data
  • You have experience building production machine learning systems
  • You have experience with Python, NumPy, scikit-learn, and R