Connect to Microsoft SQL Server from olap4j

May 21st, 2009

Browsing my Google Analytics statistics, I realized there are a lot of people out there searching for ways to connect to Microsoft SQL Server with olap4j.

Here is a nice example.
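// Imports needed by this snippet.
import java.sql.DriverManager;
import org.olap4j.CellSet;
import org.olap4j.OlapConnection;

// We must use the XMLA driver.
Class.forName("org.olap4j.driver.xmla.XmlaOlap4jDriver");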

// This code is for Java 5. With Java 6, you can directly
// unwrap the underlying connection with the .unwrap() call.
OlapConnection connection =
(OlapConnection) DriverManager.getConnection(

// This is the SQL Server service end point.
"jdbc:xmla:Server=http://example.com/olap/msmdpump.dll"

// Tells the XMLA driver to use a SOAP request cache layer.
// We will use an in-memory static cache.
+ ";Cache=org.olap4j.driver.xmla.cache.XmlaOlap4jNamedMemoryCache"

// Sets the cache name to use. This allows cross-connection
// cache sharing. Don't give the driver a cache name and it
// disables sharing.
+ ";Cache.Name=MyNiftyConnection"

// Some cache performance tweaks.
// Look at the javadoc for details.
+ ";Cache.Mode=LFU;Cache.Timeout=600;Cache.Size=100",

// XMLA is over HTTP, so BASIC authentication is used.
"username",
"password" );

// We can execute a query. MDX of course.
CellSet set = connection.createStatement().executeOlapQuery(
"SELECT {} ON COLUMNS FROM CUBE");

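The connection string above is nothing more than a semicolon-separated list of Key=value pairs after the jdbc:xmla: prefix. As a minimal sketch, assuming you would rather assemble it from a map than concatenate it by hand, something like the following could work. The XmlaUrl class and its build method are hypothetical helpers of mine, not part of olap4j; only the property names and values come from the example above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper (not part of olap4j): assembles a jdbc:xmla: URL. */
public class XmlaUrl {

    /** Joins the properties as "jdbc:xmla:Key=value;Key=value", in insertion order. */
    public static String build(Map<String, String> properties) {
        StringBuilder url = new StringBuilder("jdbc:xmla:");
        boolean first = true;
        for (Map.Entry<String, String> entry : properties.entrySet()) {
            if (!first) {
                url.append(';');
            }
            url.append(entry.getKey()).append('=').append(entry.getValue());
            first = false;
        }
        return url.toString();
    }

    public static void main(String[] args) {
        // Same properties as in the example above.
        Map<String, String> props = new LinkedHashMap<String, String>();
        props.put("Server", "http://example.com/olap/msmdpump.dll");
        props.put("Cache", "org.olap4j.driver.xmla.cache.XmlaOlap4jNamedMemoryCache");
        props.put("Cache.Name", "MyNiftyConnection");
        props.put("Cache.Mode", "LFU");
        props.put("Cache.Timeout", "600");
        props.put("Cache.Size", "100");
        System.out.println(build(props));
    }
}
```

The resulting string can be passed to DriverManager.getConnection() together with the username and password, exactly as in the example.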
Update : Some useful links

PAT 0.3 - Integration and nifty features…

May 20th, 2009

Sprint 2 of Pentaho Analysis Tool is now over and was released a few weeks ago. We’re currently in the process of wrapping up the third sprint. What’s new? What new features will there be?

Nothing very spectacular really. Yet there are a few nice features worth mentioning.

  • Integration with the Pentaho BI server
    PAT can now run embedded in the Pentaho User Console and be configured remotely. It only supports XMLA connections for this first draft, but don’t worry; more connection types will be supported in the medium term. This also means that you can seamlessly use any XMLA provider. SQL Server? Essbase? Mondrian? You choose. All that thanks to olap4j. If you want specific details, Gretchen Moran has written a bunch of documentation on this. Congratulations to Paul Stoellberger and Gretchen Moran for this feature.
  • Create multiple queries at once
    This was a requirement passed to us by Pentaho’s engineering team. People want to build and use more than one query at a time without concurrency issues. That was not properly supported by the age-old JPivot application, and now and then we encountered some problems with multiple queries, so this was pretty high on our features list. The backend has supported it since sprint 1, but there were higher priorities for the GUI components.
  • OlapTableModel first draft
    The Java API doesn’t include a proper TableModel for OLAP data, so we’re planning to write one. We still have lots of things to figure out on this subject, but we’re planning to mock a draft specification before the sprint 3 deadline.

Our initial intent for the third sprint was to make the GUI nicer, but we’re still trying to figure out if XUL would be a better generation tool than SWT. We are waiting for developments between the GWT-Mosaic team and Nick Baker of Pentaho, and hope that all those nice gents will team up and cook up something worthy of their talent. Fingers crossed here. :)

Also worth mentioning, we started talks to review the Query model currently residing in olap4j. This is soon to be a major component, as it will provide developers and GUI enthusiasts with a properly abstracted API to build queries against a data warehouse. Olap4j is a great API for low-level stuff, but it still needs an abstraction layer for the common folks. Building queries should not require in-depth knowledge of MDX, for you cannot expect all business analysts to master MDX anyway. Anyone who wants to participate in the process is warmly invited to come forward. We are really looking for input on this, whatever your background is. (As long as it’s related to BI I guess…)

Pentaho Analysis Tool - Final push for sprint 2

April 13th, 2009

I’ve been really busy lately working on sprint 2 of Pentaho Analysis Tool. We have reached almost all our goals for this sprint and are hoping to wrap it all up in the next two weeks. There is still time to add a few late requirements to this sprint, so if anyone has a very special wish, now is the time to express it.

About Pentaho Analysis Tool (PAT)

PAT is an attempt to replace the good ol’ JPivot application, widely used in the Java world, as a web-based browser for OLAP data. There are quite a few similar projects out there, yet none of them quite makes the cut in terms of enterprise requirements:

  • Ad-hoc connections
  • ACL management
  • User defined connections saved for later use
  • Saved queries
  • Editing multiple queries at once
  • [insert even more enterprise software mumbo jumbo here]

We’re writing a Google Web Toolkit (GWT) front-end and a Spring-based backend as a core. All data manipulations are possible thanks to the olap4j API. There was talk of a JSON bridge later on in development, but this requirement is not part of any sprint planning for now.

The project is hosted on Google Code and all the project management is done in the Jira tracker. If you have any further questions about our project or want to chat for whatever reason, we can be reached via the mailing list or ##pentaho.pat on freenode.

Of easy and painless systems monitoring

March 25th, 2009

I’m not a systems administrator. I only have 8 servers to babysit, and that used to be enough to make it a time-consuming problem. You might not be a systems administrator either, nor have many machines / services / websites to monitor, yet the fact remains that as IT professionals we need to keep a close eye on what’s going on. I’m not talking about 99.999% uptime here, but 1% downtime is enough to make a lot of customers, clients and managers angry, especially since outages have a way of happening exactly when they should not.

What are your options? How much does it cost? What can you monitor? These are all questions I’ll try to shed some light on. The solution I’m proposing today is one I have used myself for years. I’m not legally obliged to deliver five-nines availability, yet this is what I achieved with a total cost of 0. Yep, z.e.r.o. zero. El zilcho.
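To put those percentages in perspective, here is a quick back-of-the-envelope calculation. It is only a sketch: the Downtime class is made up for this post, and it assumes a plain 365-day year.

```java
/** Hypothetical helper: converts an availability percentage into allowed downtime. */
public class Downtime {

    /** Minutes of downtime per 365-day year allowed by the given availability. */
    public static double minutesPerYear(double availabilityPercent) {
        double minutesInYear = 365 * 24 * 60; // 525,600 minutes
        return minutesInYear * (100.0 - availabilityPercent) / 100.0;
    }

    public static void main(String[] args) {
        System.out.printf("99.999%% -> %.2f minutes per year%n", minutesPerYear(99.999));
        System.out.printf("99%%     -> %.1f hours per year%n", minutesPerYear(99.0) / 60);
    }
}
```

Five nines leaves a little over 5 minutes of downtime per year, while 1% downtime adds up to roughly 87.6 hours, or about 3.65 days.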

I’m not saying this will work for everybody, nor am I pretending to be an expert on the issue at hand, but I have learned a lot on the subject in a few years, so here it is.


Evading (D)DOS attacks with Apache HTTPD

March 16th, 2009

Just a quick tech tip. Ever wondered how to prevent your HTTPD server from being knocked off the net by a DOS (Denial Of Service) attack? Check out this nifty little module.

Mod Evasive

It’s pretty easy to set up. Compile the module as you normally would for HTTPD modules and create a configuration file. There are many options available. Here’s an example of how to configure it.

<IfModule mod_evasive20.c>
DOSHashTableSize 3097
DOSPageCount 6
DOSSiteCount 100
DOSPageInterval 2
DOSSiteInterval 2
DOSBlockingPeriod 600
DOSEmailNotify "my-monitoring-contact@domain.com"
DOSWhitelist 192.168.*.*
</IfModule>

More details on the configuration and how each parameter will affect the module behavior can be found out there on the net.

Beware though: before installing this, make sure you won’t blacklist legitimate users. For example, if you have an AJAX application that sends a burst of requests once in a while, it might get blacklisted. Make sure you test it in a development environment so you get the thresholds right.
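To get a feel for how the page thresholds interact before touching a real server, here is a simplified model of the per-page rule as I understand it: more than DOSPageCount hits on the same URI from the same IP within DOSPageInterval seconds gets that IP blocked for DOSBlockingPeriod seconds. This is a sketch under that assumption, not the module's actual code. The EvasiveModel class and its method names are made up, it uses fixed time windows, and it ignores DOSSiteCount, the whitelist and the hash table.

```java
import java.util.HashMap;
import java.util.Map;

/** Simplified, hypothetical model of the per-page blocking rule (not mod_evasive itself). */
public class EvasiveModel {
    private final int pageCount;          // plays the role of DOSPageCount
    private final long pageIntervalSec;   // plays the role of DOSPageInterval
    private final long blockingPeriodSec; // plays the role of DOSBlockingPeriod

    // "ip uri" -> { window start (seconds), hits in this window }
    private final Map<String, long[]> hits = new HashMap<String, long[]>();
    // ip -> time (seconds) at which the block expires
    private final Map<String, Long> blockedUntil = new HashMap<String, Long>();

    public EvasiveModel(int pageCount, long pageIntervalSec, long blockingPeriodSec) {
        this.pageCount = pageCount;
        this.pageIntervalSec = pageIntervalSec;
        this.blockingPeriodSec = blockingPeriodSec;
    }

    /** Returns true if the request goes through, false if the IP is blocked. */
    public boolean allow(String ip, String uri, long nowSec) {
        Long until = blockedUntil.get(ip);
        if (until != null) {
            if (nowSec < until) {
                return false; // still inside the blocking period
            }
            blockedUntil.remove(ip);
        }
        String key = ip + " " + uri;
        long[] window = hits.get(key);
        if (window == null || nowSec - window[0] >= pageIntervalSec) {
            window = new long[] { nowSec, 0 }; // start a fresh counting window
            hits.put(key, window);
        }
        window[1]++;
        if (window[1] > pageCount) {
            blockedUntil.put(ip, nowSec + blockingPeriodSec);
            return false;
        }
        return true;
    }

    public static void main(String[] args) {
        // Same values as the configuration above: 6 hits, 2 seconds, 600 seconds.
        EvasiveModel model = new EvasiveModel(6, 2, 600);
        for (int i = 1; i <= 8; i++) {
            System.out.println("hit " + i + " allowed: "
                    + model.allow("10.0.0.1", "/index.html", 0));
        }
    }
}
```

Feeding it the request pattern of your own AJAX application is a cheap way to spot a threshold that is obviously too tight, but it is no substitute for testing the real module.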

Of economic opinions and commentators

February 12th, 2009

There is a lot of blogging being done out there in these times of economic crisis (yes, it is a crisis). Most of it is utter garbage, mixing opinions with specifically picked facts to serve a given purpose, yet somehow I still find it important to read it all. There is an old saying that goes something like: “Fool is the one who ignores what he considers not worthy, for wise is the one who can learn from anything.”

Most of the wisest things I read were published in my favorite monthly publication, Le Monde Diplomatique. On the other hand, a hellish lot of garbage can be found pretty much anywhere. Then again, during one of my many news scavenging sessions, I was genuinely surprised to find this little post from a man I had never noticed before. I do believe this man has a proper sense of economic and political analysis. Here’s an excerpt.

US policymakers have ignored the fact that consumer demand in the 21st century has been driven, not by increases in real income, but by increased consumer indebtedness.  This fact makes it pointless to try to stimulate the economy by bailing out banks so that they can lend more to consumers.  The American consumers have no more capacity to borrow.

With the decline in the values of their principal assets–their homes–with the destruction of half of their pension assets, and with joblessness facing them, Americans cannot and will not spend.

Why bail out GM and Citibank when the firms are moving as many operations offshore as they possibly can?

(…)

The US government really has only two possibilities for financing its budget deficit.  One is a second collapse in the stock market, which would drive the surviving investors with what they have left into “safe” US Treasury bonds.  The other is for the Federal Reserve to monetize the Treasury debt.

Monetizing the debt means that when no one is willing or able to purchase the Treasury’s bonds, the Federal Reserve buys them by creating bank deposits for the Treasury’s account.  

In other words, the Fed “prints money” with which to buy the Treasury’s bonds.

Once this happens, the US dollar will cease to be the reserve currency.  

In addition, China, Japan and Saudi Arabia, countries that hold enormous quantities of US Treasury debt in addition to other US dollar assets, will sell, hoping to get out before others.  

The US dollar will become worthless, the currency of a banana republic.

I’ll keep on the lookout for more interesting articles on this. I believe that the current economic difficulties are of enormous importance to us all. Not only are we at risk of losing big, but decisions will soon be made that will dictate the governance of our everyday life for decades to come. I might not think much of the last decades of governance we just endured, but I certainly won’t fall back on cynicism and apathy.

Comments? More reading suggestions?

Of OLAP and the importance of open standards

November 11th, 2008

In these times of economic crisis, many companies will turn to business intelligence (BI) as a source of wisdom and counsel. Millions of dollars will be invested in an effort to understand the extent of their respective problems and find solutions based on accurate and decision-oriented datasets.

Since I have a fair amount of experience working in heterogeneous environments and tackling data integration challenges, I thought I’d pitch in my two cents.

Why developers and project managers will have a hard time

The root of the problem is this. The Microsoft OLAP toolkit does not integrate so well with anything other than .NET technologies. SAS offers a Java API, yet it is not ready for production. (I worked with it for two years, and believe me, they are still a fairly long way from production-quality code.) As a matter of fact, most software vendors in the OLAP world distribute some API to integrate their technologies, but you often end up with black boxes of questionable quality, flexibility or performance. Some even go as far as to obfuscate their libraries… this really doesn’t help in the end.

Some vendors like Oracle went for the all-in-the-box solution. They offer a “complete” solution that can fit every possible need. Then again, what they are telling you is: “if we don’t have it, you probably don’t need it.” “Probably”? You’ve got to be kidding. Since when do software vendors know what you need and what your future will be? Better switch “probably” for “hopefully”.

In the best case, in order to meet your needs, you’ll hack your way through at the expense of your project specifications. The final result can be nothing but disappointing. Your celebration will be bitter and probably short-lived, I fear.

About the importance of collective work

You have a brand new application. Hooray! This is where the production phase kicks in.

What if you need to move your datamart to another OLAP server? What if there are not enough connection licenses to allow both production connections and all of your maintenance personnel on the OLAP server, and they are forced to take turns to debug? What if the CEO decides to migrate to a new platform? What if [insert random but oh so frequent unforeseen event here]? Your thousand-dollar code is now rendered useless; you can start crying now, you deserve it. In your quest for more money making, you’ve created a monster that was expensive and will continue to pump money out of your institution’s pockets.

If you were good enough at systems design, you thought about a data layer. Even so, the data layer still has to be rewritten entirely, and it often represents at least a third of the overall effort. Close, but no cigar. This might sound like a catastrophic scenario, but it is oh so frequent.

Many of us got tired of all this nonsense, so we decided to work together. We decided that enough time and money had been wasted on individual efforts that were ruined in the end. It was time to agree on standards and share the product of our collective effort.

Take Hibernate for example. It is now a de facto standard when it comes to data mappers. For the Java version alone, it represents 859 thousand lines of code worth 12.8 million dollars in work hours. Think you can top that with your in-house data layer in times of economic crisis?

About Java OLAP

OLAP is a world in itself. You can’t take relational paradigms and apply them to the multidimensional world. The .NET toolbox does have very nice libraries to do some neat OLAP stuff, but then again, you’re locked in to SSAS. This is a no-no.

On the Java side, things are even worse. There is currently a big void in the Java OLAP market. No OLAP standard emerged at all. Thanks to the selfishness of the big players of the industry, the JOLAP initiative was a total failure. It never reached the final version, so the JSR-69 specification died quietly.

We at Olap4j tried to fill that gap with an open initiative. Everyone can pitch in. And I mean EVERYONE.

What makes Olap4j so kewl

You know the expression vendor lock-in? I hope you do, I *really* do, or else you’ll learn it the hard way. Olap4j aims at solving exactly this problem. You can develop applications on its API and switch the underlying OLAP engine without rewriting a single line of code. Not bad, eh? Olap4j is more than a database driver. It is an open API built right on top of the JDBC industry standard, where everyone collaborates to specify a common base onto which to build.

It even includes transformation libraries and testing facilities.

I want to kick the tires and use it right now

So far, it has two implementations ready to use. The Mondrian driver allows you to run the much acclaimed Mondrian open source OLAP engine as an in-process data provider.

There is also the XML/A generic driver that can connect to pretty much anything that talks XML/A, whether it’s over HTTP or anything else you fancy using. This particular driver allows you to build applications that can switch to and from any of these OLAP engines:

  • Hyperion Essbase
  • Microsoft SQL Server Analysis Services
  • Infor
  • Mondrian
  • Palo

The Olap4j project is gaining momentum and we truly hope to see it become the standard in the Java world.

Chain blog… they caught me.

September 24th, 2008

If you’re reading this, you’ve been included in this blog chain.

 

  1. Take a picture of yourself right now.
  2. Don’t change your clothes, don’t fix your hair… just take a picture.
  3. Post that picture with NO editing.
  4. Post these instructions with your picture.

I was caught by three techno blogs I follow, so don’t be mad at me. Thanks to Julian, Matt and Nick. I had no webcam at hand but I thought about my phone, so there it is: Luc at the office Wednesday morning, before his coffee.

Cheers!

Data integration challenges tackled

August 11th, 2008

Data integration in business environments can be a painful task. I mean REALLY painful. The volume of data is huge, it does not cross-validate, it is dispersed in many heterogeneous formats, yada yada. You know the song. One day, I stumbled upon Pentaho Data Integration (PDI). This was a real breakthrough.

First things first, it’s not subject to “vendor lock-in”. It can read most data formats out there and can write them back to pretty much anything. This is a huge plus because it means it can be used by a wide range of users and environments. Being written in Java also gives it an edge as an enterprise tool, for it is platform agnostic.

But the real advantages are not those trivial specifications. My love for PDI has much deeper roots. Simply put: it’s powerful. Creating an integration process is a trivial matter. Drag and drop. Link. Execute. Those three simple steps will cover most of your business needs. Really, I mean it. Never again will I write a snippet of code to read a CSV file and write its content to a database. Mark my words; NEVER! This is a waste of time, and a developer who keeps up with the times should know that.

What about the real juicy stuff?

As you suspected, there is much more to PDI than meets the eye. It can be clustered, it can use a database-based repository for all processes, it comes with automatic documentation generation tools and it is supported by a huge community. Many tutorials exist to address most business needs and challenges. It’s well made, very stable and easily expandable with plugins for power users.

I strongly recommend giving it a try. The next version should be released soon and it will include many great new features. I met Matt Casters last June and had the chance to see for myself all the new functionalities that will make it to the next release. We’re talking about visual performance bottleneck exploration and some more neat stuff you won’t find anywhere else.

Cheers, and have a good time integrating!

Pentaho on the iPhone

July 29th, 2008

I successfully added the iPhone extension to my Pentaho platform today and I was more than impressed with the ease with which we can enable the whole platform to work seamlessly on those nifty little phones.

Oh yeah, I bought an iPhone too…

I’m slowly discovering the fun of having a cellular phone in my pocket. This is something that I never experienced before; never had a cell phone. I have to say that I’m glad it’s a good phone, and sexy too.

The bottom line is: get one.

For those interested, here’s the wiki page that says it all. Thanks to Will Gorman, senior developer at Pentaho, who put this all together.