A few weeks ago we had a little Twitter banter going with @N3Roaster about data and data ownership. It’s a very good question and one that’s come up before, so I thought I’d spell out the stance we’re planning to take. This is a big fat disclaimer: nothing I write here is legally binding. We’re in the process of getting our user agreement formalized, which will be the legally binding document. But what we’re telling him is this:

We are not in the business of owning your data. When you use RoastLog, we will ensure your data is backed up and secured, and give you the tools to analyze it. We do not own your data…you own your data. If at any time you want a raw dump of all your roasting data, we’ll do that for you. RoastLog is going to be so powerful that taking a raw dump and loading it into Excel would be a giant leap backwards, but if you wanted to do that for some odd reason, you could.

Everyone we’ve talked to so far has actually loved the idea of having their data in a central place because it makes sharing easy…or even possible. But, of course, there are some real concerns about shooting years’ worth of roasting history into “the cloud”. It’s very important for us to clearly state that just because we’re making your data available and easily accessible doesn’t mean it’s all of a sudden ours. The software tools we’re building are just that: tools which allow you to analyze your data in ways which are hard or impossible now. We’re not in the business of building a data warehouse that we’re going to sell off to some 3rd party.

Categories: Questions

4 Comments

Neal Wilson · March 3, 2010 at 11:36 am

This is @N3Roaster from Twitter. Have you ever had a conversation where it doesn’t seem like the participants are exactly on the same wavelength? Perhaps this is due to the limitations of Twitter as a medium for meaningful discussion, but the above-mentioned banter was not actually about data ownership. Now, that’s an important issue and I’m very glad to see the RoastLog position, but it masks an even more important issue. The issue is complex and I apologize in advance for the length of this reply.

Spreadsheets basically suck for this sort of data. Sure, it works, and so do paper records, and a lot of roasters are getting along just fine that way, but those of us who have already moved on to software specialized for the needs of roasting professionals understand that this solves a number of problems that come up when trying to actually use the data. If the data cannot be conveniently recalled, aggregated, processed, and manipulated, the utility of that data is severely limited.

The main advantage of software such as this is that it allows for the creation of a data-aware workflow. By making it easy to record and recall process data, certain types of roasting errors can be eliminated and there are remarkable productivity gains to be had. For example, the database at my shop (and I think I recall reading that RoastLog does this as well) tracks coffee inventory among other things. By recording purchase and use information as part of the normal workflow, it becomes possible to pull up useful information very quickly when it is needed. Determining the value of green coffee as of a given date, tracking trends in coffee sales, projecting how long a given lot of coffee will last, or informing decisions with regard to how much of a given coffee to purchase become quick and easy compared to hunting through paper records or keeping the equivalent spreadsheet current. Workflow-integrated data systems are likely to present substantial benefits for all but the tiniest of coffee roasting firms.

However, all of this hinges on one key point, and this is not a matter of data ownership, but of data availability. A raw data dump that cannot be used with tools for making sense of that data is nearly worthless. In the words of the above entry, “a giant leap backwards.” So, what happens when you build your workflow around something in the cloud? Suddenly, you are exposed to risks beyond your control. If your ISP is having an area outage, you can’t get at your data. If (heaven forbid) RoastLog fails as a business and can no longer afford to keep the servers running, you can’t get at your data. A business considering this must manage this risk, either by not taking full advantage of the benefits of a system such as this, or by accepting that operational continuity will be in jeopardy when that data is no longer available.

There is one action that the RoastLog people could take to assuage these concerns. That is, to allow a company to maintain their own server which the RoastLog software could use in a manner functionally identical to the central server. It is only in this way that a company’s data is its own in a fully meaningful sense. There is no technical reason why keeping this data with a third party organization is required for easy information sharing or powerful data analysis.

I hope that I have made clear the concerns that a business must consider when deciding what sort of data systems to invest in and I hope the RoastLog people consider these serious concerns as well. As I mentioned on Twitter, I wish RoastLog the best of luck and am always glad to see new options for roasting companies considering this type of software.

brianz · March 3, 2010 at 6:20 pm

Neal, thanks for the comment and encouraging words….we hope we’re successful too and that RoastLog, at the end of the day, makes life easier for roasters around the world. And yes, trying to have a conversation 140 characters at a time is challenging, to say the least.

I think I understand where you’re coming from now. On this topic, I think the important thing to keep in mind is that the vast majority of people don’t have the skills, experience or desire to keep their own servers online, with replicated databases distributed across different geographical locations, systems monitoring, feature upgrades and the hundreds of other things it takes to keep a web service running. Even if someone did do all that, is your data really “safe”? What happens if your shop gets robbed, there’s a fire or your disk goes south and you haven’t been backing up?

Nowadays, who would opt to run their own mail server when they can simply sign up for one of the millions of free email services out there? Keeping data stored online may not be for everyone, and if that’s the case then they probably shouldn’t use RoastLog…or Gmail, or online banking, or…on and on. Nowadays, we delegate all of the hard work of data protection and integrity to the folks who know how to do it. Sure, Gmail has gone down from time to time, but it’s way more reliable and worry-free than if I ran my own mail server.

We just so happen to do this stuff for a living, so we’re going to do the best possible job of making sure your data is backed up, secured and available. If we don’t, then we’d go out of business…and nobody would win.

Neal Wilson · March 4, 2010 at 10:06 am

I am a bit disappointed with this obviously satirical response; however, lest some mistake it for a serious one, I will respond. While I would certainly agree that the vast majority of people don’t have the skills, experience, or desire to keep running replicated databases distributed across different geographical locations, systems monitoring, and the hundreds of other things it takes to keep a web service running, it is important to keep in mind that the vast majority of organizations do not need anything nearly that complex architecturally. You were at the SCAA conference last year. Did you attend the session in which the data system in place at Green Mountain Coffee was described? Now, the overlap between this custom job and RoastLog is not perfect, and at the time they hadn’t gone very far in integrating that data system into workflows; however, they had put in rather a lot of data and were using that for interesting analysis. It is certainly one of the larger data sets maintained by any one roasting firm in the specialty coffee industry. Do you recall what they were using to store all of this data? It was Microsoft Access. Now, I’m sure you’re aware of the limitations of such an approach, and it certainly wouldn’t be what I would recommend, but they’ve clearly gotten considerable use from it. Something like Access or, better yet, PostgreSQL is, for the vast majority of use cases, very simple to install and keep running.

I don’t think anybody will argue against the importance of backups. Computer hardware does, after all, eventually fail. Fortunately, the relevant records are very small. To take my own data set as an example, I have inventory data going back to 2000, cupping data going back to 2003, and roasting data going back to 2005, along with other information. A dump of the database easily fits on a single DVD-R even without compression, and this is data that compresses very well. The difficulty in keeping this sort of data backed up is no greater than the difficulty of keeping your accounting data backed up.

The comparison to email is apt. With email, a small number of well-documented and well-understood protocols allow all of the systems involved to communicate. These protocols are implemented by several competing programs with good interoperability. There are countless companies that can manage those servers, and if you don’t like the one you have, the barrier to switching to another one is very low, particularly for a business that has its own domain.

The comparison to online banking is absurd. How many instances of QuickBooks, PeachTree, or MYOB have been displaced by online banking? This just isn’t happening. The trend is rather to integrate with online services while still maintaining local storage of that financial data.

I understand that you have an interest in making this all seem very complex and beyond the ability of coffee firms to manage, but this is just spreading fear, uncertainty, and doubt.

brianz · March 4, 2010 at 10:27 am

Hi Neal,

I can promise you that my response was not meant to be satirical in any way, shape, or form.

I think we just have a difference of opinion and we can agree to disagree. So far, the folks we’ve talked to all really love the notion of having data stored somewhere else since it provides the ability to share data. In a closed system, that isn’t easy or possible.

It sounds like you’ve built a pretty amazing system which works for you…kudos. Our approach to this problem is different, and the goal is to make it easy, useful and powerful for folks who have no idea what PostgreSQL is, let alone how to install and configure it.

We never spread fear or anything else negative. We simply tell people what we have and advocate on its behalf since there *are* real benefits. Our interest is in making a great product which is useful for the coffee community.
