Dr. Mahout: Analyzing clinical data using scalable and distributed computing
Of the few realms in which cloud computing has not yet solidly taken root, medicine is one where it has great potential. Clinicians generate massive amounts of data during the diagnostic process and need an efficient way to analyze it.
For example, the rare genetic disease primary ciliary dyskinesia (PCD) affects the cilia on cells, causing them to behave erratically; this leads to breathing problems at best and necessitates lung transplants at worst. Cutting-edge diagnostic tools capture the ciliary motion with high-speed video and use automated methods to describe the motion patterns quantitatively. These methods, however, are compute-intensive and would benefit from parallelization.
Here we propose using the Mahout framework to efficiently learn models that capture the motion patterns observed in the videos and aid in objective diagnoses. Through this framework, clinicians need only take biopsies, gather data as images or videos, upload them to a Mahout/Hadoop cluster, and wait for the results. Patient privacy is maintained by retaining only the results of the analysis; computational time is reduced by parallelizing the model learning and comparison process; and models are available to clinicians everywhere through the cloud.