Interesting Docker Containers (and some tips on running them)

I’ve been learning how to use Docker for the last couple of months. Part of that experience has been downloading and working with various freely available containers from the Docker Hub. Since an ever-increasing number of applications are web based (i.e., they use a website as the UI), porting many open source projects to Docker is becoming less complex. While not all open source applications can benefit from containerization yet, I think it’s only a matter of time before this technology really starts to gain in popularity.
My area of interest tends to fall more towards Business Intelligence and related technologies, so what I’ve been experimenting with are containers in that realm. This time around, I’ll discuss these containers and my experience getting them running on a native Docker installation on Ubuntu:
  • RStudio
  • Cloudera Hadoop
  • Odoo with PostgreSQL


Install the Saiku Analytics plugin in Pentaho BIServer CE


I’ve been working with Mondrian and Pentaho’s Schema Workbench lately, and I attempted to add Meteorite Consulting’s Saiku Analytics plugin to my installation of Pentaho BI Server Community Edition to process some MDX queries. MDX is a query language, similar to SQL, that is used for querying OLAP cubes. Mondrian is an OLAP engine that implements the MDX language and is incorporated into the Saiku Analytics software. It differs from other OLAP engines in that the cubes are built on the fly as the query runs, rather than having the cube data stored on a server. For simpler cubes, the trade-off of a slightly slower build time for reduced disk space is negligible.
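To make the “built on the fly” idea concrete, here is a minimal sketch in Python. It is not Mondrian’s code or API, and it uses plain SQL rather than MDX. It only illustrates the general approach of computing aggregates from a relational table at query time instead of reading them from a stored cube. The `sales` table, its columns, its figures, and the `cube_slice` helper are all made up for the example.

```python
import sqlite3

# Hypothetical fact table; the names and numbers are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, year INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("East", 2014, 100.0),
        ("East", 2015, 150.0),
        ("West", 2014, 80.0),
        ("West", 2015, 120.0),
    ],
)


def cube_slice(connection, dimension):
    """Aggregate the 'amount' measure along one dimension at query time."""
    # Only allow known column names, since the name is placed into the SQL text.
    if dimension not in {"region", "year"}:
        raise ValueError(f"unknown dimension: {dimension}")
    rows = connection.execute(
        f"SELECT {dimension}, SUM(amount) FROM sales "
        f"GROUP BY {dimension} ORDER BY {dimension}"
    )
    return dict(rows.fetchall())


# Nothing is precomputed: each call scans the fact table and sums on demand.
by_region = cube_slice(conn, "region")
by_year = cube_slice(conn, "year")
print(by_region)
print(by_year)
```

Each call pays the cost of a scan, which is the “slower build time” side of the trade-off. The benefit is that no aggregated copy of the data has to be kept on disk.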

Here is the process I followed to get Saiku enabled in my BI Server:


Cassandra Installation on Linux Mint

NoSQL databases are multiplying faster than anyone could have imagined when Google released its Google File System and MapReduce papers in 2003 and 2004, which became the inspiration for a host of NoSQL database developers.

The NoSQL project that became Apache Cassandra was originally developed at Facebook as the back end for their Inbox Search feature. It was released as open source in July 2008 and later moved to the Apache Software Foundation. It has gained in popularity in the intervening time, and the website now ranks Cassandra as the eighth most popular data management system.

Setting up a development version of Cassandra on Linux Mint is pretty straightforward, and this tutorial will walk through the process. At the time of this writing, Cassandra 3.0.1 had just been released. It requires Java 8 or later and at least 8 GB of RAM. Version 2.2.4 is available as the most current stable version, which will run with Java 6 or later and 4 GB of RAM. If you intend to use DataStax’s OpsCenter management tool, be aware that it does not currently support either version.
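As a quick pre-install check, the Java requirement can be tested with a few lines of Python. This is only a sketch: the minimum versions in `MIN_JAVA` are copied from the paragraph above, and the helper names `java_major_version` and `java_ok` are made up for this example.

```python
import re

# Minimum Java major version per Cassandra release line, as stated in the text above.
MIN_JAVA = {"3.0": 8, "2.2": 6}


def java_major_version(version_output):
    """Extract the major Java version from the text printed by `java -version`."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        raise ValueError("no Java version string found")
    first, second = match.group(1), match.group(2)
    # Legacy scheme: "1.8.0_66" means Java 8. Newer scheme: "11.0.2" means Java 11.
    if first == "1" and second is not None:
        return int(second)
    return int(first)


def java_ok(cassandra_line, version_output):
    """Return True if the installed Java meets the minimum for that Cassandra line."""
    return java_major_version(version_output) >= MIN_JAVA[cassandra_line]


print(java_ok("3.0", 'java version "1.8.0_66"'))
```

To check a real machine, pass in the output of `java -version`. Note that Java prints it to standard error, so with `subprocess.run(["java", "-version"], capture_output=True, text=True)` the text is in the `stderr` attribute of the result.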

To illustrate the installation of OpsCenter, I will be covering the older Cassandra version 2.1, which IS supported by OpsCenter. DataStax notes on their website that the next version of OpsCenter will support Cassandra 2.2 and 3.0. The difference between installing version 2.2 or 3.0 is minimal, and I’ll note it as we proceed. Also, be aware that the Linux Mint update center may try to upgrade your system to 3.0 once you are finished.


Interesting stuff from around the Internet

It’s been a crazy busy month personally: lost one family member, added a new one, celebrated some stuff, and Halloween was here. I’ve been looking into a variety of information on the web, so here is a roundup of some interesting topics until I have some time to devote to normal articles and tutorials.

RethinkDB: this newer Big Data database platform seems to be getting some traction. It’s aimed at realtime web applications and uses a JSON-style data structure (similar to MongoDB), but it also provides support for JOINs (which MongoDB doesn’t). PacktPub’s blog posted an article on Learning RethinkDB. The documentation on the platform’s website is very good compared to what is normally associated with open source projects.