I have a good friend, let's call him Charlie, who is currently doing his PhD in some sort of computational science. Like many researchers, Charlie uses IPython Notebooks every day to create scripts, write up findings, iterate on ideas, run simulations, etc.
Charlie told me today about how his computer was letting him down. Apparently, he was running 6 IPython simulations simultaneously and that was draining his battery... while the laptop was plugged into the wall. Whether this is a hardware failure, or his code really is burning energy faster than the charger can supply it, is not really my concern. What sparked my interest was that Charlie, having a strong computer science background, living in Silicon Valley, and so on, was using his personal laptop for what sounds to me like heavy simulations.
I told him that what he needs is to run his Notebooks in a hosted service, and then proceeded to explain what I think could be a very valuable one. Imagine a web service that hosts IPython Notebooks for you. Say you pay $X/month and you get Y notebooks with a predetermined amount of computing power. That got his attention. He would no longer be constrained by how fast his charger can supply power, he could probably run more notebooks at the same time, heck, he could even leave long-running notebooks going overnight. It seemed like a no-brainer to him. And then he mentioned something that sparked my interest. Apparently, his advisor gave him free rein, and a budget bounded only by common sense, to use AWS resources. Why hasn't he? Well, he thinks AWS is too complicated, and that setting up a good working environment on it is not exactly easy. I proceeded to explain what I thought was the natural next step for my imaginary service.
Let's say that a user of Ramon's Notebooks (the name of the hosted IPython service) is already paying his monthly $X fee. He then realizes that he needs more power. Luckily, his recent acquisition of a juicy NSF grant gives him enough purchasing power to look at serious IaaS providers, but he has never used any, nor does he have the time to learn how to use them correctly. More importantly, setting up a scalable environment on an IaaS provider is not exactly a valued exercise in today's academic environment. As he ponders his options, he notices Ramon's Notebooks' new service: boot up computing clusters at will from your own hosted Notebook, and get charged for what you use. He looks at the documentation, and it is not only easy, it also requires nothing more than learning a tiny new Python library that he would run from his existing notebooks. It is a no-brainer.
But why stop there? There is already a widespread feeling that research should be reproducible, so why not be able to share these notebooks? If Charlie has a notebook that I like, I should be able to run it from my own account and at my own expense. Maybe even comment on it, and add insights that Charlie might have overlooked. Often, however, these Notebooks would need access to specific datasets. Well, the service could also provide an "upload to DB" sort of functionality that gives you a URL to access your data from Pandas.
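To make the dataset idea a bit more concrete, here is a minimal sketch of what the notebook side might look like. It assumes the service hands back a plain URL pointing at a CSV export of the uploaded data. The `load_dataset` helper and the URL in the comment are made up for illustration; the only real API used is `pandas.read_csv`, which accepts a URL as well as a local path or file-like object.

```python
import io

import pandas as pd


def load_dataset(source):
    """Load a shared dataset into a DataFrame.

    `source` can be anything pandas.read_csv accepts: a URL string,
    a local path, or a file-like object. This helper is hypothetical
    and only wraps the call for illustration.
    """
    return pd.read_csv(source)


# In a hosted notebook the source would be the URL returned by the
# (imaginary) "upload to DB" feature, for example:
#     df = load_dataset("https://example.invalid/datasets/charlie/runs.csv")
# Here an in-memory CSV stands in for it so the sketch runs offline.
sample = io.StringIO("run,energy\n1,0.5\n2,1.5\n")
df = load_dataset(sample)
print(df.describe())
```

Anyone I share the notebook with would run the same cell against the same URL, which is most of what reproducibility needs on the data side.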
A service like this is not only doable, it is already the direction in which Jupyter (the new name for the Notebooks) is going. The community has already built JupyterHub, which enables multi-user hosted Jupyter notebooks. MIT created StarCluster, a Python library for handling EC2 instances from IPython. Payments can be handled by Stripe, and login by GitHub or another third-party auth system. The hardest part would be charging each user based on the resources they have consumed, which would be a matter of tagging each AWS resource with a user. I imagine some work has already been done on that.
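The billing piece can be sketched without touching AWS at all. The snippet below is only an illustration of the tagging idea, under a few assumptions of mine: every resource carries a tag naming the user who launched it (the tag key `notebook-user` is invented), and usage arrives as simple records of hours and an hourly rate. The `UsageRecord` type and `charges_by_user` function are hypothetical, not part of any AWS or Jupyter API.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class UsageRecord:
    """One line of metered usage for a single cloud resource (hypothetical)."""
    resource_id: str                 # e.g. an instance identifier
    hours: float                     # how long the resource ran
    hourly_rate: float               # price per hour, in dollars
    tags: dict = field(default_factory=dict)


def charges_by_user(records, tag_key="notebook-user"):
    """Sum the cost of all records, grouped by the user named in the tag.

    Resources missing the tag are grouped under "untagged" so they can
    be spotted instead of silently dropped.
    """
    totals = defaultdict(float)
    for record in records:
        owner = record.tags.get(tag_key, "untagged")
        totals[owner] += record.hours * record.hourly_rate
    return {user: round(total, 2) for user, total in totals.items()}


records = [
    UsageRecord("i-001", hours=10, hourly_rate=0.10, tags={"notebook-user": "charlie"}),
    UsageRecord("i-002", hours=2, hourly_rate=0.50, tags={"notebook-user": "charlie"}),
    UsageRecord("i-003", hours=4, hourly_rate=0.25, tags={"notebook-user": "ramon"}),
    UsageRecord("i-004", hours=1, hourly_rate=0.10),  # launched without a tag
]
print(charges_by_user(records))
```

In a real service the records would come from the provider's own billing data rather than a hand-written list, and the per-user totals would then be handed to the payment system.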
Charlie, the future is here.