ApacheCon NA 2011

Dr. Mahout: Analyzing clinical data using scalable and distributed computing

5:00 - 5:50pm on Thursday, November 10 in Salon B

Of the few realms in which cloud computing has not solidly taken root, one with great potential is medicine. Clinicians generate massive amounts of data during the diagnostic process and need an efficient way to analyze it.

For example, the rare genetic disease primary ciliary dyskinesia (PCD) affects the cilia on cells, causing them to behave erratically. This leads to breathing problems at best and necessitates lung transplants at worst. Cutting-edge diagnostic tools capture the ciliary motions with high-speed video and use automated methods to quantitatively describe the motion patterns. These methods, however, are compute-intensive and would benefit from parallelization.

Here we propose using the Mahout framework to efficiently learn models that capture the motion patterns observed in the videos and aid in objective diagnoses. Through this framework, clinicians need only take biopsies, gather data as images or videos, upload them to a Mahout/Hadoop cluster, and wait for the results. Patient privacy is maintained by retaining only the results of the analysis; computational time is reduced by parallelizing the model learning and comparison process; and models are available to clinicians everywhere through the cloud.
