NeXTWORLD November 1993
Start Your Engines
Database servers drive the enterprise; your mileage may vary
By Daniel Miles Kehoe and Simson L. Garfinkel
The lifeblood of any organization is its data. Maybe it is clinical records of new drug trials, bond prices at the close of market, or the current balance of customer accounts. It must be readily available, and it absolutely, positively must be accurate. Otherwise, the organization grinds to a halt.
Pumping the data through the organization is a software machine – a database engine. It may crank out transactions on a mainframe, purr like a cat across a distributed network, or put-put along on a desktop PC. It provides a central repository for information that can protected, managed and backed up. It also provides access to the data by users ad other programs that need to act on the information.
Database servers all share common functionality. At a minimum, the server runs as a separate program (a process) on a central hardware platform, listening for client requests for data retrieval or manipulation. The server handles multi-user, multitask process scheduling, input and output from dedicated file space on a hard disk, caching of data to optimize responsiveness, sharing of data among users without conflict, and processing of commands for administration and manipulation of the database.
In this article, we look at industrial-strength database servers – programs running on centralized computers that users access over a high-speed network. Most of the servers included in this article support Structured Query Language (SQL), an industry-standard language for communicating between database servers and clients. Unlike a PC-based database accessing and manipulating the data, the database servers described here are designed to work with separate – and often custom-built – front-end programs. (See "Data Dashboards," NeXTWORLD, February/March 1993, for a survey of NeXTSTEP database front ends and development tools.)
In the early days of NEXTSTEP, users had limited choice among database engines. One of the leading SQL data servers, Sybase, was available to run on NeXT hardware and was bundled for a time with the operating system. Later, the market-leading vendor, Oracle, began shipping a version of its server for NEXTSTEP 2.1
With the introduction of DBKit in NEXTSTEP 3.0 (see the sidebar, "The NEXTSTEP Connection"), adapters could be easily written for any database engine. NeXT’s database strategy is based upon the idea of interoperability: NEXTSTEP works with practically everything. What was once a liability turned into one of NeXT’s best assets.
NeXT ships DBKit with adapters for Oracle and Sybase; third parties have developed their own adapters for most of the other well-established and popular commercial data servers, including Gupta SQLBase Server, IBM’s DB2 and VSAM (through Micro Decisionware’s OS/2 gateway product), INGRES, Borland InterBase, and NCR Teradata. Most of these database programs run on dedicated servers (not running NEXTSTEP), which you connect with NEXTSTEP-based computers over a network, although Gupta SQLBase Server and Borland InterBase are both also available in native NEXTSTEP versions. SofDesign Solutions’ QuickBase and Blue Rose’s Rosebase are NEXTSTEP-only data servers. Although this denies these databases access to fast RISC-based servers now, this should be less of an issue as NeXT’s Portable Distributed Object system becomes available on other platforms.
Serving up the servers
For NEXTSTEP users, this means that they can connect their front-end applications to almost any database server they choose. Or, more commonly, to the database server they already have.
Though the cost of the software itself may be modest, the costs of deploying the system, administering the database, and re-engineering applications used with the system can quickly make the software costs appear insignificant. And since the costs of changing your mind – and changing your data server – can be momentous, organizations rarely switch from one server to another once the initial purchase is made. Thus, choosing a data server has a lot more in common with choosing an operating system than with choosing an application program.
If you’re planning to use NEXTSTEP-based workstations to access information that’s already stored on your mainframe or departmental server, chances are the best database for you right now is the one you’re already using. The product sidebars with this article will tell you where to find DBKit adapters for your DBMS.
If you are just now building a new information-management system, the first decision you’ll need to make is where you want to run the engine. A variety of servers are available for NEXTSTEP now, which means that you can cut your hardware costs by running the data server on one of you workstations (or your file server). Alternatively, if you are out to replace a mainframe, you might be better off buying a minicomputer or mainframe-class workstation and using the computer just as a data vault. In this case, you can run any operating system you want on the server and connect to it from NEXTSTEP over the network.
In this report, we surveyed users of every relational database management system that can be used with NEXTSTEP. In our interviews, buyers cited various considerations for differentiating databases, including performance, features, standards, support, and price.
If all else were equal, you could simply choose your data server by looking for the company that offered the best price. And indeed, if your requirements are very simple, this may be the only analysis you’ll need. For example, you may be developing an uncomplicated application that stores a limited amount of data. In this case, you’ll be well served by SofDesign’s QuickBase. At a price of $695 for three users, it’s the least expensive data server you’ll deploy under NEXTSTEP.
Database speed is another primary concern for many purchasers. That’s because a slower database engine requires faster hardware to support the same number of users or to answer a query in the same amount of time. Unfortunately, comparing engines is notoriously difficult, since many manufacturers optimize their systems to perform different tasks better than others. Add to the complexity the fact that different servers run on different hardware platforms. Thus, determining the speed of a database is a lot more difficult than calculating the number of pages per minute you can crank out of a laser printer.
The speed of your database also depends on how you will be using it. On-line transaction processing (OLTP) is the traditional forte of the relational data server. Order-entry applications make for classic OLTP – clerks write data to the database in repetitive short bursts. If you are going to use your engine to support management applications involving browsing through large data sets, reading, analyzing, and making only occasional updates, you should consider servers that are optimized for on-line complex processing (OLCP).
Database performance is tracked by the Transaction Performance Processing Council (TPC). The TPC is a nonprofit organization funded by a large group of manufacturers of computer and data servers, chartered to disseminate objective performance data. The TPC maintains and publishes a list of all systems that have been tested with standardized benchmarks. The TPC does not run a single benchmark on data engines; rather it tests for OLTP and another for OLCP. TPC reports (available from the council at 408/295-8894) detail how fast each server runs on a particular hardware configuration. Unfortunately, different benchmarks are not always comparable: How do you compare Oracle7 running on a SPARCstation 10 with INGRES running on an HP 700?
Some users may wish to consider the size of the database server. While all servers can handle gigabytes of on-line storage, few can function with less than 10MB or 20MB. The exception is the four data server products that run natively under NEXTSTEP, which all fit neatly in less than 2MB.
Bells and whistles
Beyond these factors, each database engine offers its own advantages and special features. Here are some key terms concepts:
Structured Query Language (SQL) – Often pronounced "sequel," SQL is an industry-standard language for communicating between database servers and clients. Originally developed by IBM in the 1970s, SQL has since been formalized by the American National Standards Institute in two standards: ANSI Standard SQL and ANSI Standard SQL Level II. Unfortunately, even with the standards, there are still headaches. Most manufacturers of data-server products extended SQL with their own command statements, allowing access to specialized features of their products. Some manufacturers do not support the full set of Standard SQL or Level II statements.
Nontraditional Data Types – Traditional data types are character fields of limited length, integers, floating-point numbers, or Boolean yes/no pairs. Date and time data types are also traditional, though there can be variation in how they are formatted and stored inside the data server. Nontraditional data types, which become very important in a graphical environment such as NEXTSTEP, include memo fields and other long text data, graphics, sound, and fields for the storage of entire files.
Binary Large Object (BLOB) – Older data-server products may support "long" data types 32KB in size; other data servers support data types of unlimited size but only through pointers to external operating-system files. Ideally, a data server should provide storage for BLOBs, which are nontraditional data handled as large blocks, or objects, of binary information. BLOBs are sometimes called "opaque," since they are not interpreted by the server.
Versioning – Servers that are optimized for OLCP, such as InterBase, INGRES, Rosebase, or ORACLE7, are often implemented with a new approach, any data that is being accessed is locked, maintaining consistency at the expense of concurrency during lengthy transactions. With versioning, each user gets a snapshot of the database at the moment the transaction begins. With InterBase, for example, the only locks that are applied are ones that prevent more than one person from writing over the same data. Anyone can read data and anyone can write without having an adverse impact on performance. If underlying data changes before a write is committed, InterBase will discard the attempted change and notify the user.
Stored Procedure – Stored procedures are programs that reside in the data server and can be called from any application program. They increase performance of applications by eliminating the need to reparse SQL statements. Furthermore, by relying on stored procedures inside the data server, application developers can increase the independence of front-end applications, reducing the frequency of redeploying front-end applications as applications are improved. While most SQL servers make some provision for stored procedures, they are all implemented differently, making them mutually incompatible. NeXT’s DBKit makes no use of stored procedures, but they can be used in NEXTSTEP applications by sending SQL directly to the server.
Trigger – A trigger is a special kind of stored procedure that is executed automatically by a data server when a predefined type of change is made to the database. A trigger can edit data, perform calculations, update related data, and enforce integrity constraints.
Declarative Referential Integrity – A technique for enforcing data integrity by storing data relationships inside the server. Sometimes implemented with stored procedures, declarative referential integrity prevents the server from committing database transactions unless particular relationships remain true. Such a system might prevent a database user from deleting a department if there are still employees working in it or deleting the records of customers who have outstanding orders. DBKit cannot create referential integrity tables, but if they exist, all DBKit transactions will be moderated by them.
Notification and Event Alerts – Some data servers can alert client programs when a predefined type f event occurs. For example, with QuickBase, when one user updates a database record, all other users who have that record displayed on their screen similarly receive the update. Notification and alerts are often implemented with stored procedures and are particularly useful when applications monitor real-time processes, such as in financial trading.
Shadowing and Mirroring – A technique by which a server stores multiple copies of its database across separate disk drives, allowing for instant recovery should one disk fail. Shadowing ad mirroring do not eliminate the need for regular backups, since errors can be introduced in the database without hardware failures, and neither technique protects data from loss by fire or theft. Although mirroring is supported by some servers, it may be easier to purchase redundant arrays of hard disks that perform the same function.
Distributed Database – Also called "replicated database," this is a technique for storing a single database on several different computers at the same time, for improved performance and increased reliability in the event of a system or network failure. In distributed database systems, updates on one server get propagated to others in a timely fashion, so that all users see a consistent database picture. Several manufacturers, notably InterBase, INGRES, and ORACLE7, claim their data servers can be configured as distributed database systems.
The NEXTSTEP Connection
NeXT’s Database Kit is a framework for developing object-oriented database applications. The kit consists of a set of Objective-C classes that automatically handle the task of accessing data stored in a database, displaying it on a screen, and writing back updates if and when necessary.
DBKit’s power lies in its adapters, which are dynamically loaded modules that handle the nitty-gritty of interfacing with a particular database vendor’s network protocol.
Although DBKit allows the programmer to send SQL directly to the data server, most DBKit programmers instead use the kit’s own API. Because the adapters hide the differences between different databases, the same DBKit program can connect with a wide range of servers without modification. This makes it easier for a company to change its database in midstream if problems surface with a particular data server, it also makes it possible for third-party developers to produce a single database-browsing or –reporting application that will interoperate with any NEXTSTEP supported database out of the box.
DBKit is also a natural for companies trying to migrate their databases from mainframes to open systems. Use NEXTSTEP to access the mainframe today, and when you move to a network-based server later, you won’t need to change a single line in the NEXTSTEP applications: just switch the server.
Since DBKit is database independent, it eases the risk for some companies of trying out an unproven database from a start-up company. If the application fails, users can always switch their adapters and buy a license for Oracle or Sybase. DBKit also makes it possible for developers to create reporting and management applications that work with any database supported by NEXTSTEP, rather than having to write to a particular company’s API.
One disadvantage of DBKit is performance. Using NeXT’s object-oriented layer is somewhat slower than directly using a company’s proprietary API and client libraries supplied by the vendor of the database-management systems. If you are building a large transaction-processing system and are sure that the choice of database will never change, you might be better of developing code that accesses the data server directly. But for applications in which raw performance is less important than flexibility, DBKit’s features are without parallel in the client-server world.