www.smh.com.au

For the record, hit a fast-forward button for data with destiny

Graeme Philipson
October 23, 2007

Speed, speed and more speed governs what we can and can't do with critical business data.

AT THE heart of most information systems there is a database, sometimes known as a database management system (DBMS). Databases are ways of ordering information, and they can be very complex beasts.

The simplest DBMS is a grid of rows and columns, or records and fields. A record may be of a company's customer and the fields may be name, address, account balance, etc. Many systems take a series of row and column combinations (each of which is called a table), and relate them to each other through overlapping fields. For example, customer number may be a field in the customer table but also a field in the accounts-receivable table.
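To make the idea of overlapping fields concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table names, column names and sample rows are all invented for illustration; the only thing taken from the description above is that customer number appears in both tables and is what ties them together.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with two related tables (hypothetical data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customer (
            customer_number INTEGER PRIMARY KEY,
            name            TEXT,
            address         TEXT
        );
        CREATE TABLE accounts_receivable (
            invoice_number  INTEGER PRIMARY KEY,
            customer_number INTEGER REFERENCES customer(customer_number),
            amount_owed     REAL
        );
        INSERT INTO customer VALUES (1, 'Acme Pty Ltd', '1 George St, Sydney');
        INSERT INTO customer VALUES (2, 'Bondi Traders', '22 Campbell Pde, Bondi');
        INSERT INTO accounts_receivable VALUES (100, 1, 250.0);
        INSERT INTO accounts_receivable VALUES (101, 1, 100.0);
        INSERT INTO accounts_receivable VALUES (102, 2, 75.5);
    """)
    return conn


def balances(conn):
    """Relate the two tables through the shared customer_number field."""
    return conn.execute("""
        SELECT c.name, SUM(a.amount_owed)
        FROM customer AS c
        JOIN accounts_receivable AS a
          ON a.customer_number = c.customer_number
        GROUP BY c.customer_number
        ORDER BY c.customer_number
    """).fetchall()


if __name__ == "__main__":
    for name, owed in balances(build_demo_db()):
        print(name, owed)
```

With two tables and three invoices the join is instant. The trouble described in this column starts when the same kind of query has to run across hundreds of tables and millions of records.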

Many systems built in this way can get hideously complex, with hundreds of tables. They can also have millions of records. The bigger they get, the harder it is to get at the data, and the longer it takes.

One of the key disciplines of IT management is improving the efficiency of DBMS-based applications. There are people called database administrators (DBAs), whose job it is to design systems efficiently and to improve their effectiveness - and speed. Applications evolve, so it is a never-ending task.

Increasingly, it's all about speed. ATMs, e-commerce, airline reservation systems, security applications - all need to be almost instantaneous. Even business intelligence systems, which have traditionally used data warehouses to download transactional data into a form more suitable for querying and analysis, are moving towards real time.

For many years people have tried to speed up databases by using hardware as well as software. The traditional DBMS is a piece of software that can run on a range of different types of hardware. The best-known examples are Oracle, IBM's DB2, and Microsoft's Access and SQL Server.

In the 1980s, the now-defunct British company ICL invented a thing called CAFS, which stood for content addressable file store. CAFS vastly improved query speeds by putting some of the database logic into the disk drive that held the data. It was widely used, and was popular with police forces.

The best-known technology today comes from Teradata, which until recently was a subsidiary of NCR. Teradata uses a technology called MPP (massively parallel processing) to query data in hardware at very high speed. It has been extremely successful and has many Australian users - most notably the big banks and companies such as Qantas, which have large online transaction processing systems.

ICL disappeared more than 10 years ago, and since then Teradata has pretty well had the field to itself.

Database vendors such as Oracle and Sybase have worked with hardware companies to develop hybrid systems that use hardware to make databases run better, but these approaches have never been totally satisfactory, because they are still based on traditional DBMS technology.

Now a new company, Netezza, has entered the market, with a radically different approach. Netezza makes a data warehouse appliance, a piece of hardware that uses off-the-shelf disk, microchip and networking technology, along with some snazzy software, to vastly increase the speeds at which data can be queried and analysed.

Like Teradata, Netezza uses parallel processing (with IBM PowerPC chips), but it also uses field-programmable gate arrays (FPGAs) to perform much of the processing. These are simple chips that can be programmed on the fly to perform special tasks. Netezza uses them to implement a range of clever query algorithms it has developed and to take a lot of the load off the processors.
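The general shape of a parallel query can be sketched in a few lines of Python. This is an illustration of the idea only, not Netezza's or Teradata's actual design: the function names, the record layout and the four-way split are all assumptions made for the example. Each "node" holds a slice of the data, discards non-matching records where the data lives (the job the column describes being pushed down towards the disk), and hands back only a small partial result for a coordinator to combine.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(records, n_nodes):
    """Spread records round-robin across n_nodes slices (hypothetical layout)."""
    return [records[i::n_nodes] for i in range(n_nodes)]


def scan_slice(data_slice, predicate):
    """Filter one slice locally and return only a small partial aggregate."""
    matches = [r for r in data_slice if predicate(r)]
    return len(matches), sum(r["amount"] for r in matches)


def parallel_query(records, predicate, n_nodes=4):
    """Run the same scan on every slice at once, then combine the partials."""
    slices = partition(records, n_nodes)
    with ThreadPoolExecutor(max_workers=n_nodes) as pool:
        partials = list(pool.map(lambda s: scan_slice(s, predicate), slices))
    count = sum(c for c, _ in partials)
    total = sum(t for _, t in partials)
    return count, total


if __name__ == "__main__":
    # Invented sample data: ten sales records alternating between two states.
    sales = [{"region": "NSW" if i % 2 == 0 else "VIC", "amount": i}
             for i in range(10)]
    print(parallel_query(sales, lambda r: r["region"] == "NSW"))
```

Python threads will not make this toy run any faster. The point is the structure: the work is split so that no single processor has to read all the data, and only the filtered summaries travel back to be added up.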

The technology seems to work. Last month I attended Netezza's third worldwide user conference in Boston's Seaport district just south of the city centre. Like Sydney's Darling Harbour and Melbourne's Docklands, Seaport is being rejuvenated with new hotels and apartments.

But all eyes were on the technology. I've been to lots of user conferences in my time, and I've never seen a more enthusiastic bunch of customers than at this event. The company went public three months ago, after doubling its revenues in each of the past three years.

There were more than 400 users and partners at the conference. They were treated to the company's vision of the future, a world in which all data can be analysed instantly. "Imagine a world where your telephone company can rejig your call plan in real time, based on your usage patterns," said Netezza's president Jim Baum, in his opening address.

"Imagine a world where stock items on the shelf change price depending on supply and demand, or where predictive analysis allows you to prevent customer churn before it happens." Mr Baum was referring to Netezza's increased role in the querying of unstructured data - not just the data in structured databases but also the sort of stuff that is held in word processing and email files, and even voice and image stores.

Some of Netezza's biggest clients are in government - defence and police forces, intelligence agencies - where vast amounts of data have to be analysed quickly.

As I've written a couple of times recently, the nature of information, and how we use it, is changing quickly. There is a lot more of it, we are accessing it more quickly and we are using it in many new ways. That's why companies such as Netezza are succeeding.

graeme@philipson.info

Graeme Philipson travelled to Boston as a guest of Netezza.
