August 11th, 2008
Data integration in business environments can be a painful task. I mean REAL painful. The volume of data is huge, it does not cross-validate, it is dispersed across many heterogeneous formats, yada yada. You know the song. One day, I stumbled on Pentaho Data Integration (PDI). This was a real breakthrough.
First things first, it’s not subject to “vendor lock-in”. It can read most data formats out there and can write them back to pretty much anything. This is a huge plus because it lets all sorts of users work with it in all sorts of environments. Being written in Java also gives it an edge as an enterprise tool, for it is platform agnostic.
But the real advantages are not those trivial specifications. My love for PDI has much deeper roots. Simply put: it’s powerful. Creating an integration process is a trivial matter. Drag and drop. Link. Execute. Those three simple steps will cover most of your business needs. Really, I mean it. Never again will I write a snippet of code to read a CSV file and write its content to a database. Mark my words: NEVER! This is a waste of time, and a developer who keeps up with the times should know that.
What about the real juicy stuff?
As you suspected, there is much more to PDI than meets the eye. It can be clustered, it can use a database-backed repository for all processes, it comes with automatic documentation generation tools, and it is supported by a huge community. Many tutorials exist to address most business needs and challenges. It’s well made, very stable and easily extended with plugins for power users.
I strongly recommend giving it a try. The next version should be released soon and it will include many great new features. I met Matt Casters last June and had the chance to see for myself the new features that will make it into the next release. We’re talking about visual performance bottleneck exploration and some more neat stuff you won’t find anywhere else.
Cheers, and have a good time integrating!!
Tags: bi, business intelligence, kettle, matt casters, PDI, pentaho, pentaho data integration
Posted in Technology, bi
July 29th, 2008
I successfully added the iPhone extension to my Pentaho platform today and I was more than impressed with how easy it is to make the whole platform work seamlessly on those nifty little phones.
Oh yeah, I bought an iPhone too…
I’m slowly discovering the fun of having a cellular phone in my pocket. This is something I had never experienced before; I never had a cell phone. I have to say that I’m glad it’s a good phone, and sexy too.
The bottom line is: get one.
For those interested, here’s the wiki page that says it all. Thanks to Will Gorman, senior developer at Pentaho, who put it all together.
Tags: bi, business intelligence, iphone, pentaho
Posted in Technology, Uncategorized, bi
July 21st, 2008
There is this old saying that goes like this:
“Linux is safe enough to keep it vanilla. Anything you add weakens its security.”
Okay, this is not an actual popular saying, but since most Linux servers I have seen in my career were configured in conformity with this piece of wisdom (sic), I decided to share some experience with the basic, mandatory security measures to add to a Linux server… I’m just sooo tired of fixing broken servers that have been hacked.
There is a simple suite of programs to install, and you’ll at the very least be secured against script kiddies and the like. Here it goes.
Securing the OS
Most of the time, the piece of software that was hacked was the OS itself. Not because there are awful flaws in Linux (or in any OS, as a matter of fact), but because simple rules were not respected. How many of you who have configured servers can certify that they are protected against brute force attacks? How many are protected against DoS attacks? Neither Linux nor any other OS I’ve seen so far (correct me as you wish…) comes with DoS or brute force detection out of the box. Having secured SSH access is mandatory these days, but what’s the point of setting passwords when a simple brute force attack will break them?
Here are some solutions. The Advanced Policy Firewall (APF) is a simple Linux firewall that uses the iptables utility to create firewall rules on your system. Why APF and not iptables alone? Because it integrates with a DoS detection tool and with Brute Force Detection (BFD). The DoS tool will detect Denial of Service attacks, while BFD will monitor incoming connections and ban any IP that breaks easy-to-set-up access throttling rules. All these tools are free and compatible with most Linux flavors. Try ’em out! There are many more available from R-fx Networks, the company that maintains them.
As for the setup instructions, google for them, as always. There are many nifty tutorials out there and I won’t copy them here.
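To make the idea of an access throttling rule a bit more concrete, here is a minimal Java sketch of a sliding-window counter for failed logins. It is only an illustration of the principle and not how BFD itself is implemented; the class name, the method name and the threshold values are all made up for the example.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of an access throttling rule; not BFD's actual code.
class LoginThrottle {
    private final int maxFailures;    // failures tolerated inside the window
    private final long windowMillis;  // length of the sliding window
    private final Map<String, Deque<Long>> failuresByIp = new HashMap<>();

    LoginThrottle(int maxFailures, long windowMillis) {
        this.maxFailures = maxFailures;
        this.windowMillis = windowMillis;
    }

    // Records one failed login and reports whether the IP should now be banned.
    synchronized boolean recordFailure(String ip, long nowMillis) {
        Deque<Long> failures = failuresByIp.computeIfAbsent(ip, k -> new ArrayDeque<>());
        failures.addLast(nowMillis);
        // Forget failures that have fallen out of the sliding window.
        while (!failures.isEmpty() && nowMillis - failures.peekFirst() > windowMillis) {
            failures.removeFirst();
        }
        return failures.size() > maxFailures;
    }
}
```

With new LoginThrottle(3, 60000L), the fourth failure from the same IP inside one minute returns true, which is the point where a real tool would add a firewall rule for that address.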
Web Applications Security
What if I told you that there is a generic way of applying a minimum security level to all your web applications at the OS level, thus simplifying the life of anyone who administers web servers? You might get frustrated that you didn’t know this at the time you got hacked. You might even wonder how wonderful this would be for your web hosting server.
Well, I’m doing it.
I’ll say it.
Ready? Here it is.
ModSecurity
Okay, that was the hard part. Now it will be much easier. ModSecurity is a simple Apache HTTPD module that you add to your web server configuration, and it validates all requests against a set of nifty threat detectors. It uses regular expressions to protect your applications against overflows, injections and whatever else might be dangerous for them.
There is even a console available to monitor many installations and keep an eye out for alerts.
Easy to understand, easy to install. As always, google has all your answers.
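For the curious, the general idea of screening requests with regular expressions can be sketched in a few lines of Java. This is not ModSecurity’s engine or rule set; the three patterns and the length cap below are deliberately crude assumptions picked for illustration, and real rules are far more thorough.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical, deliberately tiny rule set; not taken from ModSecurity.
class RequestScreen {
    private static final List<Pattern> THREATS = Arrays.asList(
        Pattern.compile("(?i)\\bunion\\s+select\\b"), // crude SQL injection probe
        Pattern.compile("(?i)<\\s*script\\b"),         // crude cross-site scripting probe
        Pattern.compile("\\.\\./")                      // directory traversal attempt
    );
    private static final int MAX_LENGTH = 2048; // assumed cap against oversized input

    // Returns true when the value is oversized or matches any threat pattern.
    static boolean isSuspicious(String value) {
        if (value.length() > MAX_LENGTH) {
            return true;
        }
        for (Pattern threat : THREATS) {
            if (threat.matcher(value).find()) {
                return true;
            }
        }
        return false;
    }
}
```

The appeal of doing this in the web server rather than in each application is that one set of rules covers everything hosted behind it.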
The bottom line
The lesson to remember is that these tools take half a day of work to set up and will save you sooo much trouble in the future that it is pointless to debate whether they are worth using. The tools are out there, for free. You’d be a fool not to use them.
QED
Tags: apache, apf, bfd, httpd, linux, modsecurity, security, Servers, ssh, web
Posted in Servers, Technology
July 15th, 2008
I experimented a few months ago with Comic Life, a nifty little application which lets you create comic strips or whole comic books in a snap. It’s bundled with Apple Mac OS X and works like a charm. Take a look for yourself! Better yet, try it out. It came free with my OS X install, but now they ask 30 bucks for it, which is pretty cheap for something that works so well. It runs on both Mac and Windows.

Tags: apple, comic, experiment, os x
Posted in Experiments
July 15th, 2008
I’ve been working for a month now on some enhancements to the Olap4j project to make it more powerful and compatible. The good news is, I succeeded. The previous version, 0.9.5, lacked some basic features you would expect from a production-ready XML/A driver.
For one, its HTTP proxy didn’t support cookies. This was a big problem, since each of the myriad requests required to populate Olap4j’s metadata objects created a new user session on the web service back-end. This was a no-no, but now it’s fixed and kicking ass.
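To illustrate what cookie support boils down to, here is a rough Java sketch: remember what the server sends in Set-Cookie and send it back on the next request, so the back-end keeps reusing one session. This is not the code I wrote for Olap4j; SessionCookieJar and its methods are hypothetical names, and the parsing ignores cookie attributes such as Path and expiry.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, for illustration only; not part of olap4j.
class SessionCookieJar {
    private final Map<String, String> cookies = new LinkedHashMap<>();

    // Stores the name=value part of each Set-Cookie header, ignoring attributes.
    synchronized void remember(List<String> setCookieHeaders) {
        for (String header : setCookieHeaders) {
            String pair = header.split(";", 2)[0].trim();
            int eq = pair.indexOf('=');
            if (eq > 0) {
                cookies.put(pair.substring(0, eq), pair.substring(eq + 1));
            }
        }
    }

    // Builds the value of the Cookie request header, or null if nothing is stored.
    synchronized String cookieHeader() {
        if (cookies.isEmpty()) {
            return null;
        }
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : cookies.entrySet()) {
            if (sb.length() > 0) {
                sb.append("; ");
            }
            sb.append(e.getKey()).append('=').append(e.getValue());
        }
        return sb.toString();
    }
}
```

A client would call remember() with the Set-Cookie headers of every response and add cookieHeader() to every outgoing request, so a session cookie issued on the first metadata call is replayed on all the following ones.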
I also worked on a SOAP query cache. This was a big piece of software engineering, since I’m not used to thread-safe coding. Thread-safe thingies usually live in the lower levels of BI application servers, where those issues are tackled from the start. Thanks to Java’s java.util.concurrent package, this was a breeze.
Those changes are not part of any release, nor are they in SVN yet. I’m still waiting for peer review before the whole commit, but for people eager to see what it looks like, I’ve created a neat little package for y’all.
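To give a feel for why java.util.concurrent helps here, this is a minimal sketch of a thread-safe response cache keyed by the request text. It is not the actual Olap4j cache: SoapResponseCache is a made-up name, the back-end is passed in as a plain function, and it leans on ConcurrentHashMap.computeIfAbsent, which needs a recent Java. There is also no eviction or expiry, which any real cache would want.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical sketch of a thread-safe query cache; not the olap4j implementation.
class SoapResponseCache {
    private final ConcurrentHashMap<String, String> responses = new ConcurrentHashMap<>();
    private final AtomicInteger backendCalls = new AtomicInteger();
    private final Function<String, String> backend; // stands in for the real SOAP call

    SoapResponseCache(Function<String, String> backend) {
        this.backend = backend;
    }

    // Returns the cached response, calling the back-end only on a cache miss.
    // computeIfAbsent is atomic per key, so concurrent callers sending the
    // same request trigger a single back-end call.
    String execute(String soapRequest) {
        return responses.computeIfAbsent(soapRequest, request -> {
            backendCalls.incrementAndGet();
            return backend.apply(request);
        });
    }

    // Number of requests that actually reached the back-end.
    int backendCalls() {
        return backendCalls.get();
    }
}
```

The nice part is that there is no explicit locking anywhere in the class; the concurrent map does the heavy lifting.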
Now I can move back to my next release of the University of Montreal’s Pentaho platform… all work and no play makes Luc a dull boy.
Cheers!
Tags: api, bi, business intelligence, java, olap, olap4j
Posted in Technology, bi