Archive for the ‘bi’ Category

Pentaho’s Road To Profitability - Take 2

Tuesday, June 29th, 2010

Tom Barber, a collaborator of mine and one of the most active community members of the Pentaho project, posted a very interesting blog entry this week. He looks back at the path of Pentaho Corporation, a commercial open source business intelligence company, and its latest strategies for growth. He notes that in the past months, Pentaho has put a lot of effort into marketing initiatives and hired many bigwigs for its marketing staff. In his opinion, this is somewhat against the ideals of commercial OSS development.

The commercial open source business model is still very young and has not encountered many great challenges up to now. Nor has it sailed through very troubled waters. Yet some signs are already warning people to be very careful about the years to come. Sun Microsystems (now Oracle) was surviving thanks to donations. Compiere recently made the news for all the wrong reasons. Red Hat is doing pretty well though. In a nutshell, anything is still possible, good or bad.

Tom’s wish was that Pentaho would focus on paying skilled engineers rather than salespeople. Now, as much as I somewhat agree with the general idea, one must keep in mind that there is no correlation between the number of talented people getting paid to be on a project and its success. Some notoriously successful projects depend almost entirely on their community base, Mozilla Corporation to name only one. Others employ thousands of people, yet achieve mediocre results. One could also argue that an effective marketing campaign will in fact boost people’s awareness, thus getting more talented people to join the community base. There is no tested and fail-proof recipe so far. As I said earlier, anything goes.

I have worked for the past months for a commercial open source company, and the same questions and uncertainties were part of everyday discussions. How can a software company making no revenue from licensing be profitable? SQL Power Group, my employer, sponsors the only open source data modeling tool that works cross-platform and offers, for free, the majority of the functionality included in widely known proprietary tools like ERwin. SQL Power Architect gets between five hundred and a thousand downloads every week. This is a lot for an OSS project in such a narrow niche. Yet only a very small proportion of those users actually pay for support or consultancy, or make donations.

There are a lot of factors in play. First off, people willingly using OSS have overall better technical skills than others. To be fair, not every OSS project is of acceptable quality, but you get to try as many as you want. Proprietary software is not better in any way, but you won’t get to try much of it, short of having very good contacts and/or deep pockets. This is one reason why reaching the tipping point of adoption is very hard for an OSS company. The percentage of OSS users ready to pay instead of figuring things out by themselves is low. Way low. Waaayyy loooww. If you paid for software, you want support (and vice versa). You feel obliged to use the software because you already invested much in it. (Yes, money is time, time is money.) If you didn’t pay for the software and you hit a snag, you are far more likely to stop using it altogether. By abandoning it, you feel like you are exercising damage control, since you save time (thus money). You didn’t pay, so why invest time? I’m fully aware that once you do the math, you realize that it’s the exact opposite. People usually don’t.

What does all that have to do with Tom? Well, first things first. Starting next week, I’ll be an employee of Pentaho Corporation. Yup. Am I a bigwig? Nope. Am I a marketing guy? Well, I’m good looking, but not good enough with oxymorons. Am I a marketer? I’m having trouble making this blog post interesting, so nope, not a marketer either. I’m just some skilled dude who got his hands dirty for years and has now decided to devote some serious time to making things better.

Look at it this way. In Tom’s vision of Pentaho, my ass is on the line. I deeply respect his insights on the market and software development in general. Yet here I am, proud as a peacock at a Sunday brunch, about to be part of this fantastic project. Am I worried? Hell no. I’m very enthusiastic about Pentaho having enough budget for marketing initiatives. This is exactly what OSS companies need: market awareness. People need to know you exist and that you are just as good as all those Fortune 100 companies. Even better than them. It has nothing to do with being a sellout. Marketing is not bad for your grassroots values, unless you make it so. Then again, you have only yourself to blame.

OSS companies need to reach the critical mass of contributors and adoption needed to survive. There is simply no other way. OSS users are picky, grumpy, bitchy and unforgiving. I know that for a fact. You want to thrive in that market? You need two things: proper marketing and skilled engineers. I’ll be doing my part in the latter, and I’m very confident that the good people at Pentaho have picked skilled people to cover the former.

We are competing against giants, so let’s give them a run for their money.

Monkey Business

Wednesday, May 5th, 2010

Last week we worked on a guerrilla marketing video for Wabit. I’ll let you be the judge of that.

mdx4j - MDX query language parser

Monday, April 12th, 2010

Last week I launched a spin-off of the olap4j parser. Mdx4j wraps olap4j’s MDX parser and makes it available to your code, without the need for an OLAP connection.

Why?

Although olap4j contains a SPI parser, we don’t want to promote any particular MDX syntax. I therefore packaged it as a separate project so that everyone can have a piece of the pie!

final String query =
    "SELECT {} ON COLUMNS FROM CUBE";

final MdxParser parser =
    Mdx4jParserFactory.createMdxParser();

final ParseTreeNode tree =
    parser.parseSelect(query);

http://code.google.com/p/mdx4j/

All in one BI tool for the non-geeks

Wednesday, September 30th, 2009

A colleague of mine once asked me if I knew a program that can connect to almost all relational databases and offers MS Access-like features to build queries. Sure thing, says I. Wabit.

So he downloads it and installs it in 5 minutes. It’s free and open source. No hassle. He then creates his connections and manages to do everything he needs to fulfil his duties as a business analyst. Pretty kewl story, heh? Short too. But that’s a good sign because as a developer on this project, I can confirm first hand that this is exactly what we aimed for. Making business intelligence easy and painless.

The Wabit is more than that. It’s also an OLAP data warehouse browsing and reporting tool. It creates charts in 10 seconds and features a template engine for easy corporate branding. Version 1.0 will feature a server repository for multi-user collaboration and incremental saves, scheduling and fine-grained security. The enterprise server is not open source, but the Wabit client is a fully featured platform. You can still save all your queries and reports as an XML file for easy import and export and share them with your co-workers.

The Wabit is approaching 1.0 now. We need to reinforce the community around it and we need more feedback. The Wabit works on all platforms with a JVM, so whatever your background is, I’m sure that we can make good use of your comments or contributions. Whether you are a GUI designer, a BI consultant or just a regular Java developer, we need your help.

Wabit on Google Code
Wabit homepage


Olap4j vs. Oracle and Ruby

Wednesday, September 9th, 2009

During my monthly checkup of this blog’s analytics data (thank you Google Analytics), I discovered a new trend. More and more people are searching for information on olap4j’s compatibility. Here are the interesting keywords used and the number of occurrences for the last month.

  • “olap4j ruby” - 28 occurrences
  • “olap4j oracle” - 3 occurrences

Oracle; I can understand. Olap4j is picking up momentum and is more widely adopted. We support both Microsoft Analysis Services and Mondrian via XMLA. Oracle does have an XMLA server, Hyperion Essbase, although we never tested it with olap4j. If one of you reading this post happens to be an Oracle wizard, please contact us so we can have a chat. The more OLAP servers we support, the better.

Ruby; now that’s intriguing. Ruby can run in a JVM thanks to the JRuby project. Would olap4j work well with JRuby? Probably. Is there any OLAP API for Ruby? Google says no. Digging further into the analytics data didn’t reveal the actual intent of those who are searching for the “olap4j ruby” keywords. What a mystery… I am therefore sending out a general call to anyone interested in using olap4j inside JRuby, for we might have common interests.

Creating Mondrian Schemas with Power*Architect

Friday, August 28th, 2009

Since I don’t have time to write much software myself these days, I figured I’d share this gem with you all. SQL Power, the Canadian Business Intelligence Authority (that’s their tag line these days…), sponsors many open source projects. One of them is called Power*Architect; a marvellous cross-platform data modelling tool.

As far as I know, there are close to no “enterprise ready” data modelling tools that work on Linux and Mac. I also suspect none are free, whatever the platform. Visio is certainly not one of them.

Why is it so wonderful? Well, to start with, it can reverse and forward engineer most JDBC compatible databases. That’s a big plus. And it gets better. You can also use it to create a Mondrian schema. Yep. The team at SQL Power published a tutorial for that last week.

I do have to disclose that I will be working on their projects starting in October. I’m not trying to sell it to you; it’s free anyway. One thing is for sure though. I can’t wait to get my hands in there. So I encourage everyone to grab a copy here and file as many bug reports as you can. It’s not 1.0 yet, so community contributions are a must. Having worked with the team for three weeks back in July, I can guarantee that each and every reported bug and suggested feature is closely studied by the development team.

olap4j - A comprehensive tutorial

Thursday, August 20th, 2009

I’ve been very busy lately with the new job coming up and many other changes in my personal life, but fear not; I’m cooking something up for you people. I’m working on a comprehensive guide to olap4j. Many people have expressed a need for a more step-by-step introduction to olap4j, what it is, and how to unleash its raw power. In the next few weeks, I should finally be able to put some more time into it and release a first complete draft. Until then, take care!

Connect Microsoft SQL Server from olap4j

Thursday, May 21st, 2009

Browsing my Google Analytics statistics, I realized there are a lot of people out there who are searching for ways to connect to Microsoft SQL Server from olap4j.

Here is a nice example.

// Imports needed by the snippet below.
import java.sql.DriverManager;
import org.olap4j.CellSet;
import org.olap4j.OlapConnection;

// We must use the XMLA driver.
Class.forName("org.olap4j.driver.xmla.XmlaOlap4jDriver");

// This code is for Java 5. With Java 6, you can directly
// unwrap the underlying connection with the .unwrap() call.
OlapConnection connection =
    (OlapConnection) DriverManager.getConnection(

        // This is the SQL Server service end point.
        "jdbc:xmla:Server=http://example.com/olap/msmdpump.dll"

        // Tells the XMLA driver to use a SOAP request cache layer.
        // We will use an in-memory static cache.
        + ";Cache=org.olap4j.driver.xmla.cache.XmlaOlap4jNamedMemoryCache"

        // Sets the cache name to use. This allows cross-connection
        // cache sharing. Don't give the driver a cache name and it
        // disables sharing.
        + ";Cache.Name=MyNiftyConnection"

        // Some cache performance tweaks.
        // Look at the javadoc for details.
        + ";Cache.Mode=LFU;Cache.Timeout=600;Cache.Size=100",

        // XMLA is over HTTP, so BASIC authentication is used.
        "username",
        "password");

// We can execute a query. MDX of course.
CellSet set = connection.createStatement().executeOlapQuery(
    "SELECT {} ON COLUMNS FROM CUBE");
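If you would rather not concatenate the connection string by hand, here is a minimal sketch of a helper that assembles it. XmlaUrlBuilder is a hypothetical class of my own for illustration and is not part of olap4j; the property names are the ones used in the example above, and the only assumption is that the driver reads them as semicolon-separated key=value pairs, as that example does.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

/** Hypothetical helper (not part of olap4j): assembles an XMLA connection string. */
final class XmlaUrlBuilder {
    private final String server;
    // LinkedHashMap keeps the properties in the order they were added.
    private final Map<String, String> properties = new LinkedHashMap<>();

    XmlaUrlBuilder(String server) {
        this.server = server;
    }

    /** Adds one key=value property, e.g. "Cache.Mode" and "LFU". */
    XmlaUrlBuilder with(String key, String value) {
        properties.put(key, value);
        return this;
    }

    /** Joins the server and all properties with semicolons. */
    String build() {
        StringJoiner url = new StringJoiner(";");
        url.add("jdbc:xmla:Server=" + server);
        for (Map.Entry<String, String> property : properties.entrySet()) {
            url.add(property.getKey() + "=" + property.getValue());
        }
        return url.toString();
    }

    public static void main(String[] args) {
        String url = new XmlaUrlBuilder("http://example.com/olap/msmdpump.dll")
            .with("Cache", "org.olap4j.driver.xmla.cache.XmlaOlap4jNamedMemoryCache")
            .with("Cache.Name", "MyNiftyConnection")
            .with("Cache.Mode", "LFU")
            .with("Cache.Timeout", "600")
            .with("Cache.Size", "100")
            .build();
        System.out.println(url);
    }
}
```

The resulting string can be passed to DriverManager.getConnection() in place of the hand-built one.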

Update : Some useful links

PAT 0.3 - Integration and nifty features…

Wednesday, May 20th, 2009

Pentaho Analysis Tool sprint 2 ended a few weeks ago and has been released. We’re currently in the process of wrapping up the third sprint. What’s new? What new features will there be?

Nothing very spectacular really. Yet there are a few nice features worth mentioning.

  • Integration with the Pentaho BI server
    PAT can now run embedded in the Pentaho User Console and be configured remotely. It only supports XMLA connections for this first draft, but don’t worry; more connection types are to be supported in the medium term. This also means that you can seamlessly use any XMLA provider. SQL Server? Essbase? Mondrian? You choose. All that thanks to olap4j. If you want specific details, Gretchen Moran has written a bunch of documentation on this. Congratulations to Paul Stoellberger and Gretchen Moran for this feature.
  • Create multiple queries at once
    This was a requirement that was passed to us by Pentaho’s engineering team. People want to build and use more than one query at a time without concurrency issues. That was not properly supported by the age-old JPivot application. Now and then we encountered some problems with multiple queries, so this was something pretty high on our features list. The backend has supported that since sprint 1, but there were higher priorities for the GUI components.
  • OlapTableModel first draft
    The Java API doesn’t include a proper TableModel for OLAP data, so we’re planning to write one. We still have lots of things to figure out on this subject, but we’re planning to mock up a draft specification before the sprint 3 deadline.
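To give an idea of what a TableModel for OLAP data might look like, here is a minimal sketch under my own assumptions. It is not the PAT draft specification and it does not use olap4j types: OlapGridTableModel is a hypothetical name, and the captions and cells are passed in as plain arrays, as if they had already been read from a two-axis result.

```java
import javax.swing.table.AbstractTableModel;

/** Hypothetical sketch, not the PAT or olap4j design: a read-only grid of OLAP cells. */
final class OlapGridTableModel extends AbstractTableModel {
    private final String[] columnCaptions; // captions of the COLUMNS axis positions
    private final String[] rowCaptions;    // captions of the ROWS axis positions
    private final Object[][] cells;        // cells[row][column]

    OlapGridTableModel(String[] columnCaptions, String[] rowCaptions, Object[][] cells) {
        if (cells.length != rowCaptions.length) {
            throw new IllegalArgumentException("one row of cells per row caption is required");
        }
        for (Object[] row : cells) {
            if (row.length != columnCaptions.length) {
                throw new IllegalArgumentException("one cell per column caption is required");
            }
        }
        this.columnCaptions = columnCaptions;
        this.rowCaptions = rowCaptions;
        this.cells = cells;
    }

    @Override
    public int getRowCount() {
        return rowCaptions.length;
    }

    // One extra leading column holds the row captions.
    @Override
    public int getColumnCount() {
        return columnCaptions.length + 1;
    }

    @Override
    public String getColumnName(int column) {
        return column == 0 ? "" : columnCaptions[column - 1];
    }

    @Override
    public Object getValueAt(int row, int column) {
        return column == 0 ? rowCaptions[row] : cells[row][column - 1];
    }
}
```

A real model would also have to deal with nested members on each axis and with more than two axes, which is exactly the part we still have to figure out.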

Our initial intent for the third sprint was to make the GUI nicer, but we’re still trying to figure out if XUL would be a better generation tool than SWT. We are waiting for developments between the GWT-Mosaic team and Nick Baker of Pentaho and hope that all those nice gents will team up and cook up something worthy of their talent. Fingers crossed here. :)

Also worth mentioning, we started talks to review the Query model currently residing in olap4j. This is soon to be a major component, as it will provide developers and GUI enthusiasts with a properly abstracted API to build queries against a data warehouse. Olap4j is a great API for low level stuff, but it still needs an abstraction layer for the common folks. Building queries should not require in-depth knowledge of MDX, for you cannot expect all business analysts to master MDX anyway. Anyone who wants to participate in the process is warmly invited to come forward. We are really looking for input on this, whatever your background is. (As long as it’s related to BI I guess…)
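To illustrate the kind of abstraction I have in mind, here is a minimal sketch of a builder that produces the MDX text so the caller never writes a SELECT statement by hand. SimpleMdxQuery is a hypothetical class made up for this post; it is not the olap4j Query model and says nothing about how that API will end up looking. The cube and member names in the usage note are assumptions too.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch, unrelated to the actual olap4j Query model API. */
final class SimpleMdxQuery {
    private final String cube;
    private final List<String> columns = new ArrayList<>();
    private final List<String> rows = new ArrayList<>();

    SimpleMdxQuery(String cube) {
        this.cube = cube;
    }

    /** Adds a member or set expression to the COLUMNS axis. */
    SimpleMdxQuery onColumns(String expression) {
        columns.add(expression);
        return this;
    }

    /** Adds a member or set expression to the ROWS axis. */
    SimpleMdxQuery onRows(String expression) {
        rows.add(expression);
        return this;
    }

    /** Renders the query as MDX text; the ROWS axis is omitted when empty. */
    String toMdx() {
        StringBuilder mdx = new StringBuilder("SELECT {")
            .append(String.join(", ", columns))
            .append("} ON COLUMNS");
        if (!rows.isEmpty()) {
            mdx.append(", {")
                .append(String.join(", ", rows))
                .append("} ON ROWS");
        }
        return mdx.append(" FROM [").append(cube).append("]").toString();
    }
}
```

For example, new SimpleMdxQuery("Sales").onColumns("[Measures].[Unit Sales]").onRows("[Product].Members").toMdx() yields SELECT {[Measures].[Unit Sales]} ON COLUMNS, {[Product].Members} ON ROWS FROM [Sales]. A GUI would go one step further and let the analyst pick those members from a tree instead of typing them.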

Of OLAP and the importance of open standards

Tuesday, November 11th, 2008

In these times of economic crisis, many companies will turn to business intelligence (BI) as a source of wisdom and counsel. Millions of dollars will be invested in an effort to understand the extent of their respective problems and find solutions based on accurate and decision-oriented datasets.

Since I have a fairly good amount of experience with work in heterogeneous environments and tackling data integration challenges, I thought I’d pitch in my two cents.

Why developers and project managers will have a hard time

The root of the problem is this. The Microsoft OLAP toolkit does not integrate so well with anything other than .NET technologies. SAS offers a Java API, yet it is not ready for production. (I worked with it for two years, and believe me, they are still a fairly long way from production quality code.) As a matter of fact, most software vendors in the OLAP world distribute some API to integrate their technologies, but you often end up with black boxes of questionable quality, flexibility or performance. Some even go as far as to obfuscate their libraries… this really doesn’t help in the end.

Some vendors like Oracle went for the all-in-the-box solution. They offer a “complete” solution that can fit every possible need. Then again, what they are telling you is: “if we don’t have it, you probably don’t need it.” “Probably”? You’ve got to be kidding. Since when do software vendors know what you need and what your future will be? Better switch “probably” for “hopefully”.

In the best case, in order to meet your needs, you’ll hack your way through at the expense of your project specifications. The final result can be nothing but disappointing. Your celebration will be bitter and probably short lived, I fear.

About the importance of collective work

You have a brand new application. Hooray! This is where the production phase kicks in.

What if you need to move your datamart to another OLAP server? What if there are not enough connection licenses to allow both production connections and all of your maintenance personnel on the OLAP server, and they are forced to take turns to debug? What if the CEO decides to migrate to a new platform? What if [insert random but oh so frequent unforeseen event here]? Your thousand dollar code is now rendered useless; you can start crying now, you deserve it. In your quest for more money making, you’ve created a monster that was expensive and will continue to pump the money out of your institution’s pockets.

If you were good enough in systems design, you thought about a data layer. Even so, the data layer still has to be rewritten entirely, and it often represents at least a third of the overall effort. Close, but no cigar. This might sound like a catastrophic scenario, but it is oh so frequent.

Many people got tired of all this nonsense, so we decided to work together. We decided that enough time and money had been wasted on individual efforts that were ruined in the end. It was time to agree on standards and share the product of our collective effort.

Take Hibernate for example. It is now a de facto standard when it comes to data mappers. For the Java version alone, it represents 859 thousand lines of code worth 12.8 million dollars in work hours. Think you can top that with your in-house data layer in times of economic crisis?

About Java OLAP

OLAP is a world in itself. You can’t take relational paradigms and apply them to the multidimensional world. The .NET toolbox does have very nice libraries to do some neat OLAP stuff. Then again, you’re locked in to SSAS. This is a no-no.

On the Java side, things are even worse. There is currently a big void in the Java OLAP market. No OLAP standard emerged at all. Thanks to the selfishness of the big players of the industry, the JOLAP initiative was a total failure. It never reached the final version, so the JSR-69 specification died quietly.

We at Olap4j tried to fill that gap with an open initiative. Everyone can pitch in. And I mean EVERYONE.

What makes Olap4j so kewl

You know the expression vendor lock-in? I hope you do, I *really* do, or else you’ll learn it the hard way. Olap4j aims at solving exactly this problem. You can develop applications on its API and switch the underlying OLAP engine without rewriting a single line of code. Not bad, heh? Olap4j is more than a database driver. It is an open API built right on top of the JDBC industry standard, where everyone collaborates to specify a common base onto which to build.

It even includes transformation libraries and testing facilities.

I want to kick the tires and use it right now

So far, it has two implementations ready to use. The Mondrian driver allows you to run the much acclaimed Mondrian open source OLAP engine as an in-process data provider.

There is also the XML/A generic driver that can connect to pretty much anything that talks XML/A, whether it’s over HTTP or anything else you fancy using. This particular driver allows you to build applications that can switch to and from any of these OLAP engines:

  • Hyperion Essbase
  • Microsoft SQL Server Analysis Services
  • Infor
  • Mondrian
  • Palo

The Olap4j project is gaining momentum and we truly hope to see it become the standard in the Java world.