Blog

All Aboard! The R Service Bus 6.2

As R has continued its growth in popularity, it’s made some exotic friends. Friends who speak other (programming) languages. Friends who live on servers and virtual machines. Friends who sometimes need to set aside their differences and work towards a common goal. In the absence of a protocol droid or Babel fish, we have the enterprise service bus. For the uninitiated, an enterprise service bus is a software architecture model designed to act as an interface between different software applications.

Continue reading

Need for Processing Speed: data.table

The first time I discovered data.table, it felt like magic. I was waiting on a process that was projected to take the better part of an afternoon. In the meantime, I followed the data.table tutorial, rewrote my code using the data.table structure, and fully executed said code, all while the data.frame equivalent was wheezing along. In the last year, data.table has gotten even faster.

data.table’s Automatic Indexing

For the uninitiated, data.

Continue reading

Hypothesis Testing: Fishing for Trouble

Introduction

“Can you check if this is significant?” It was a seemingly innocuous question from a dangerous source: a semi-data-literate scientist. The kind who believed, deep in his heart, that small p-values were “good” and large p-values were “erroneous”. On this day, the man in question had come forth with a large, complex multivariate dataset. He’d manually combed the data, visually inspected it, and hand-picked a hypothesis. “Can you check if this is significant?

Continue reading