A new version of macOS and a new version of Pentaho Data Integration (aka Kettle), but the same old problem: getting Kettle to run. Apple tries to keep its operating system locked down and secure, so if you download applications from the Internet that aren’t from the Apple App Store, the files are quarantined.
With the update to Sierra, the quarantine process has been “improved”. Keep reading to see how to get Kettle running anyway!
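As a rough illustration of what clearing the quarantine involves, here is a minimal Python sketch that builds and runs the `xattr` command commonly used to remove the `com.apple.quarantine` attribute from a folder. This is a general approach and not necessarily the exact steps from the full post; the install path is a hypothetical placeholder.

```python
import subprocess
import sys

# Extended attribute macOS attaches to files downloaded from the Internet.
QUARANTINE_ATTR = "com.apple.quarantine"


def build_unquarantine_command(app_path):
    """Return the xattr command that recursively deletes the quarantine flag."""
    return ["xattr", "-d", "-r", QUARANTINE_ATTR, str(app_path)]


def unquarantine(app_path):
    """Run the command and return its exit code (macOS only).

    xattr may report files that never had the attribute, so the exit
    code is returned instead of being treated as a hard failure.
    """
    if sys.platform != "darwin":
        raise RuntimeError("quarantine attributes only exist on macOS")
    return subprocess.run(build_unquarantine_command(app_path), check=False).returncode


if __name__ == "__main__":
    # Hypothetical location; use the folder where you extracted Kettle.
    print(" ".join(build_unquarantine_command("/Applications/data-integration")))
```

Printing the command first lets you check exactly what would run before changing any file attributes.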
Continue reading “Running Kettle (Pentaho Data Integration) on Mac OSX 10.12 Sierra”
As I have stated previously, when creating ETL workflows it’s useful to store the information in a database repository rather than as individual files on your workstation. This gives multiple users access to the information (why reinvent the wheel?), lets you pull it into your jobs quickly and easily, and lets you back it up and restore it if necessary. With the community version 7.0 of Pentaho® Data Integration (PDI), I am happy to report that you can finally create a repository for your ETL code on Microsoft SQL Server. Previously, you could set up a repository on MySQL or PostgreSQL with the community edition, but the code Kettle used to create the repository was not compatible with SQL Server. After downloading the latest version, I was making a connection to SQL Server and decided to test setting up a repository again. I am happy to say it works, so the remainder of this article will walk through the process of setting up a Pentaho repository on SQL Server 2016 from a Windows 10 machine.
- Download the jTDS open source SQL Server JDBC driver. Extract the ZIP file and copy the jtds-1.3.1.jar file into the data-integration\lib folder of your Pentaho application. Although Microsoft provides a JDBC driver, it did not work for me.
- Create an empty database on your Microsoft SQL Server. I created one called “PentahoRepository”.
- Set up a SQL Server user account (not an Active Directory account) on your database server and give the account DBO (owner) permissions on the database. The DDLADMIN level is not sufficient. I called my account “repository” and set its default database to the new database.
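The database and account steps above can be sketched as a small Python helper that generates the T-SQL statements and the jTDS connection URL. This is an illustrative sketch, not the procedure from the full post: the host name and password are hypothetical placeholders, and the statements assume you run them as a SQL Server administrator.

```python
def quote_ident(name):
    """Bracket-quote a SQL Server identifier, escaping any closing bracket."""
    return "[" + name.replace("]", "]]") + "]"


def repository_setup_sql(database, login, password):
    """Return T-SQL that creates the database, the SQL login, and the owner mapping."""
    db = quote_ident(database)
    user = quote_ident(login)
    pw = password.replace("'", "''")  # escape single quotes inside the literal
    return [
        f"CREATE DATABASE {db};",
        f"CREATE LOGIN {user} WITH PASSWORD = N'{pw}', DEFAULT_DATABASE = {db};",
        f"USE {db};",
        f"CREATE USER {user} FOR LOGIN {user};",
        f"ALTER ROLE [db_owner] ADD MEMBER {user};",  # owner-level rights on this database
    ]


def jtds_url(host, database, port=1433):
    """Build a jTDS JDBC URL; 1433 is the default SQL Server port."""
    return f"jdbc:jtds:sqlserver://{host}:{port}/{database}"


if __name__ == "__main__":
    # "ChangeMe!" and "localhost" are placeholders, not values from the article.
    for statement in repository_setup_sql("PentahoRepository", "repository", "ChangeMe!"):
        print(statement)
    print(jtds_url("localhost", "PentahoRepository"))
```

Generating the statements as text keeps the sketch free of any database driver dependency; you can paste the output into SQL Server Management Studio and review it before running.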
Now that we have our prerequisites set up, we can start the PDI client.
Continue reading “Create a Pentaho Kettle Repository on SQL Server”
Recently I switched PCs to a newer Windows 10 based laptop for some of my work, and I wanted to get Pentaho Data Integration up and running on it. I downloaded the pdi-ce-188.8.131.52.25.zip file from the Community website and extracted the contents to a folder in my Program Files directory. I tried running SPOON.BAT to start it up, but a window flashed on screen and disappeared, and nothing else happened. I then opened a command prompt and executed SPOON.BAT from there, which gave me a message that the JAVAW.EXE file could not be found. So I needed to do a few other things to get it working.
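A quick way to confirm this kind of failure is to check whether javaw.exe can be located at all. The diagnostic sketch below assumes the launcher looks for Java through the `PENTAHO_JAVA_HOME` or `JAVA_HOME` environment variables or the `PATH`; that lookup order is an assumption for illustration and may not match what SPOON.BAT actually does on your version.

```python
import os
import shutil
from pathlib import Path


def find_javaw(env=None):
    """Return the path to javaw if it can be located, otherwise None.

    Checks the Java home variables first (assumed lookup order),
    then falls back to searching the PATH.
    """
    env = os.environ if env is None else env
    for var in ("PENTAHO_JAVA_HOME", "JAVA_HOME"):
        home = env.get(var)
        if home:
            candidate = Path(home) / "bin" / "javaw.exe"
            if candidate.is_file():
                return str(candidate)
    # shutil.which returns None when nothing on the given PATH matches.
    return shutil.which("javaw", path=env.get("PATH"))


if __name__ == "__main__":
    location = find_javaw()
    if location:
        print(f"javaw found at: {location}")
    else:
        print("javaw not found; check your Java install, JAVA_HOME, and PATH")
```

If the script reports that javaw is missing, the problem lies with the Java installation or environment variables and not with the Kettle download itself.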
A quick search engine query showed me that many people had the same issue, but there didn’t seem to be a consensus on how to resolve it. Below is how I managed to get it running.
Continue reading “Running Pentaho Kettle 7 on Windows 10”
I’ve been working with Mondrian and Pentaho’s Schema Workbench lately, and I attempted to add Meteorite Consulting’s Saiku Analytics plugin to my installation of Pentaho BI Server community edition to process some MDX queries. MDX is a query language, similar to SQL, that is used for querying database cubes. Mondrian is an OLAP engine that implements the MDX language and is incorporated into the Saiku Analytics software. It differs from other OLAP engines in that the cubes are built on the fly as the query runs, rather than having the cube data stored on a server. For simpler cubes, the cost of that trade-off (a slightly slower build time in exchange for less disk space) is negligible.
Here is the process I followed to get Saiku enabled in my BI Server:
Continue reading “Install the Saiku Analytics plugin in Pentaho BIServer CE”