Building
Support Software
Benchmark
Concepts and Design

his chapter covers some of the more critical reasons for benchmarking and highlights why this type of testing is so important. Some ideas are then presented as to how you can go about developing a benchmark suite for your own application, including the provision of some useful software tools for speeding up the process.
3.1 Why Develop a Benchmark?
First of all, we should get some definitions straight. This section discusses
benchmarks, but this term is used in a very generic way. There are many reasons for running benchmarks, such as:
· Platform selection/platform changes
Only the last item in this list, and the performance part of capacity planning, could be classed as true benchmarks, because the others do not compare system performance. In order to keep things clear, the term
benchmark will be used when referring to the testing, whereas
simulator and
simulation will be used when referring to the actual test software.
The importance of running benchmarks cannot be overstated. Running Oracle on a UNIX server offers many advantages over mainframe systems, and it is frequently a cost-centric argument that is used to sell the systems. However, the real benefit of this combination comes in the form of rapid evolution. Major releases emerge at intervals of 6 months to 12 months, and the processor and interconnect architectures leapfrog each other every 6 months. With the Oracle tools protecting the application developer from most of the porting effort associated with changing platforms, the price/performance ratio can be progressively lowered during a system's life span. Of course, this actually happens very rarely, because new functionality is always "needed" just as extra headroom is obtained. Either way, the rapid development of software and hardware provides tangible business advantages.
Unfortunately, the rapid change element does not come without occasional problems. Owing to the flexibility of the programming model and the rapid rate of change in the hardware and software, any single system can be exposed to literally millions of different states. Combined with the financial pressures that the vendors are under to get new releases to market, it is not unusual for unexpected problems to slip through their testing programs.
In smaller systems, most software and hardware changes can occur transparently because the demands that the application makes on the system are minimal. Large-scale systems that push the hardware and software in unique and extreme dimensions cannot rely on this kind of plug-and-play approach. In fact, in order to implement any significant changes at all, the target system should be tested as extensively as possible with the real application. This is the only way to eliminate major problems before they occur in a production environment.
Capacity planning is a very ambiguous term and yet actually suits this category quite well. Capacity testing should be considered in the following situations:
· Initial platform sizing
· Increased user count or increased transaction rate
· Proposed application/data changes
The initial sizing of a platform is potentially the most complex, because little information is available to aid the process. Frequently, such sizing is made far more difficult by the fact that the application development has not even been started at the time the systems need to be purchased, and so no information about the operation of the future system is available. Unfortunately, there is no easy solution to this problem, although some pointers on building a reasonably accurate simulation in these circumstances can be found in
Section 3.5.
If the application has already been developed, things are considerably more straightforward. If the application is a well-known package such as Oracle Financials, then it becomes easier still, because most hardware vendors have specialized expertise in these products and extensive experience regarding how they run on their particular platform. The only word of caution here is that most of these packages-particularly financial packages-have to be tailored to the particular business or accounting structure of the company, which can significantly change the operational profile of the application.
Whether the application is brand new or an off-the-shelf package, some kind of testing should be performed as a sanity check for the paper-based sizing exercise. In the case of a new application, extra caution needs to be exercised. It is not unusual for an application "proof of concept" to establish a system sizing that differs by an order of magnitude from the actual requirement.
Increased User Count or Increased Transaction Rate
A change in the size of the user community or an increase in transaction rate is relatively simple compared with the initial sizing exercise. The application profile is well known by this stage, and the footprint of each transaction can be recorded quantitatively-which is a major advantage. However, it should be clear by now that doubling the capacity of the system, for example, does not necessarily allow a doubling of user count.
Increasing the number of users and increasing the transaction rate are both common in an expanding business. Acquisitions can increase user population, and simple increases in business volume resulting from company growth can produce a steady increase in transaction volume. While the former is rarely a surprise, an increase in transaction volume is frequently overlooked and needs to be planned for in advance whenever possible.
The goal of this type of testing is to verify that, for a given release of the application code and a resized system configuration, the new user base or transaction volume will be able to operate within the contracted response times. As long as the right types of metrics are available from both current production and the simulator, this should be a relatively simple exercise. There are basically two metrics that need to be followed:
·
Throughput, a.k.a. transactions per second. Throughput will be altered in either of these situations. The throughput of the system would be expected to go up with the addition of more users. The estimated increase in transaction rate should be ascertained from business data prior to testing, in order to test the system at the correct load.
·
Response time, or user-perceived response time. The response times of the system should not exceed the contracted times.
Knowledge of how the existing system runs is essential for this type of testing. In addition, baseline testing should be run with the simulator. During these tests, the benchmark system should be loaded at the same level as the current production system. In this way, the loading imposed by the simulation software can be compared directly against the production system at an identical transaction load. This is an important test to perform, because benchmark workloads are almost always too clinical in profile when compared with the natural randomness encountered in production. Using the baseline test results, more accurate estimations can be determined for production system utilization.
Proposed Application/Data Changes
Can we run yield analysis concurrently with online users during the day? Can we add application-level auditing? If we add three times more products to the portfolio, does this impact the response time of a transaction? These are the types of questions that are frequently asked during the life cycle of an application. They are fairly significant questions with no easy theoretical answers, and there are severe penalties if they are answered incorrectly.
These questions are most accurately answered using the application simulator, particularly if remote terminal emulation is employed (see
Section 3.6). Once again, it is important to establish a set of baseline results that show how the simulation software ran prior to the change. The benchmark tool can then be executed in the new configuration, and the results of the simulation can be directly compared with those of the baseline test.
Some application changes, such as the integration of a transaction monitor, are too major for this kind of testing. Changes of such proportions should be considered as new developments, because they change the entire profile of the system.
The types of upgrades discussed here are operating system, Oracle, and hardware upgrades. While maintenance releases, such as an upgrade from Oracle 7.3.3.3 to Oracle 7.3.3.4, typically address only known problems, any release that contains feature code should be viewed suspiciously. With the Oracle RDBMS, any change in the first two fields of the version number denotes the implementation of feature code. Occasionally, "minor" feature code is implemented in maintenance releases. This happens most often shortly after a large implementation of new features, because the developers have missed the freeze date for checking in their new feature code.
With the operating system, the risk associated with each release varies from vendor to vendor. Careful attention should be paid to release notes prior to planning an upgrade. The same is true of Oracle releases, to ensure that the right steps have been taken to minimize the risk of upgrade. With hardware, certain upgrades can be performed without testing. Typically, processor and memory upgrades can be implemented with minimal risk. The things to watch are new I/O subsystems, firmware upgrades, and completely new architectures.
The bottom line of all this is that when you are building or maintaining a large-scale database system, you need to be scared of
bugs, because they will be evident in any nonmaintenance upgrade. Don't kid yourself: if you think the new release will go without problems, you are being way too optimistic.
It's a good idea to approach upgrades as follows:
· Don't upgrade until you have to, unless the new release contains fixes of critical outstanding problems that cannot be backported.
· Upgrade one thing at a time. Maximize change control, and increase the chance of keeping at least some of your upgrades in place, should problems occur.
· Bundle testing together. Minimize the time you spend testing releases, and bundle the testing of several products (such as Oracle and O/S) together. As long as you test all the combinations that you will pass through in the upgrade cycle, you should find this a low-risk optimization of your time.
The important point here is to test the new releases. Neither Oracle nor the operating system vendor will have tested the very particular set of dimensions that
your application will test. Therefore, your system is fairly likely to hold the key to exposing unforeseen race conditions, system states, and so on, and it is your responsibility to find such problems before they hurt you in production. With major Oracle upgrades, such as from Oracle7 to Oracle8, there is little opportunity to reverse the process in the event of a major problem after the upgrade, because data changes occur in the database structure that cannot be undone. This makes the testing process more important than ever.
Testing a new platform is a common use of a benchmark. This is a really great way to ascertain the
real potential of a new machine on which you intend to run your system. Although you can study the SPEC numbers for the platform, factor in the latency and bandwidth of the system interconnect, and maybe even take into account the CPI (clocks per instruction) count of the processor, you will probably still achieve, at best, only a ballpark estimate of a new system's potential. Running a quantitative benchmark of the system with the actual application is the only way to ascertain the effectiveness of a new platform. There are too many other variables, not least of which is the scalability of the O/S, that can drastically affect the real capability of the system.
Normally when testing new platforms for a system, you are primarily interested in testing the database server. This, after all, is the component of the system that has to provide the most scalability and is also the most variable in terms of pure workload. Testing of support systems, application servers, and so on is not essential in a platform change benchmark, although it could be argued that such testing is still worthwhile even if only to verify the entire system in operation on the new platform.
3.2 What Are We Simulating?
3.2.1 Defining Application Areas
It is important to have a good understanding of your application when developing a simulator for it. The first part of such an understanding is being able to identify the different types of users accessing the system, how they are using the application, and how many of them are doing so. This is important in order to be able to understand what is going on in the simulation runs and to be able to comment on why things are occurring that may not have been expected.
3.2.2 Defining the Application Split
Once you have a good understanding of the overall application areas, you need to start defining functional areas of the application. In practice, the split is never black and white-the application will, by nature, share data among application areas. That is the whole reason that it accesses a common database in the first place.
However, there will be clear boundaries between the functions defined in the applications, and probably among the day-to-day duties of the users on the system as well. For example, if a hotel had a single system to cater to all of its needs, it would include the following functional areas that can be classified as distinct areas:
· Front desk (guest check-in and check-out operations)
Although there is a strong link among all of these areas, they can be individually classified as different functional areas.
Once the split has been defined, work can begin on each area. Specifically, the process should look a little like the diagram in Figure 3.1.
3.2.3 Careful with Those Assumptions, Eugene
The "Significant Workload?" decision is a difficult one. It is very easy to make a simulated molehill out of a resource mountain, and this could compromise the entire result set of the benchmark. It's important to remember that one query can destroy a system, so although only a handful of users may be performing this type of transaction, make sure that the function they perform is definitely insignificant before discounting them from the simulation.
In addition, when an application has functional areas that overlap, it is sometimes necessary to build some types of transactions simply to feed other transactions with dependent data. In this case, either the transaction needs to be implemented or the data needs to be prebuilt into the database (see Section 3.6).
Once all the significant application components have been defined, scenarios need to be described for each transaction type. For some transactions, there will be only one way of completing the transaction, and so this will equate to a single scenario for that transaction type. More typically, there will be potentially hundreds of scenarios that could be defined.
In our hotel example, a hotel guest could have a reservation, have no reservation, have any number of frequent flyer cards, and/or select one of many different types of rooms. In such situations, it is not practical to build scenarios for all possible combinations, owing in no small part to the difficulty of providing the data required to execute them. A realistic sample of the scenarios should be adopted in this situation.
The final pieces to be created are the actual simulation and the data required to run it. These pieces will be covered in the next section.
3.3 Selecting a Development Approach
It is important to choose the most suitable development approach for your application and your environment. There are two approaches for building benchmark software:
1. Remote Terminal Emulation (RTE) software
2. Completely custom development
Each of these approaches has its merits, not only in terms of cost but also in terms of useability in different scenarios.
3.3.1 Using Remote Terminal Emulation Software
RTE software is a specialized piece of commercial software such as Performix or preVue from Rational Software. Good results can also be obtained using Expect, a public domain utility
1 that provides many of the same features but uses an interpreter instead of compiled code.
There are several different approaches that can be taken to simulate the load on the database server:
· Dumb terminal emulation
· Client/server network protocol
· HTTP (Hypertext Transfer Protocol) Web simulation
The common attribute of these approaches is that they all simulate real users accessing the system. The correct tool to use depends on your application architecture. Dumb terminal emulation works very well for simulating the use of a character-based application, regardless of how it was written or what happens at the back end of the application. The client/server emulation traps the network communication between the client and the server, and subsequently simulates the load in this way, including three-tier architectures. The HTTP emulation simulates hits on a Web server, which in turn accesses the database.
This type of software has the following advantages over writing your own simulation of the application:
· Rapid development of simulation scripts owing to supplied tools
· Ability to simulate application servers as well as database server
· Very accurate simulation of load on database server
· Less susceptibility to application code changes when compared to custom simulators
The only disadvantage of RTE software, in fact, is that it costs money.
One thing to bear in mind when looking at RTE software is that it presents the same problem of data generation that you would face in a custom simulation suite. None of the available products has the capability of knowing how your application needs to be driven. Ask your vendor about what kind of infrastructure they provide for the type of data generation you will need to be doing.
3.3.2 Custom Simulation Development
The home-grown approach to simulators involves building the code from scratch, using the Oracle Call Interface (OCI) to gain the most control available over what is executed on the database server. This approach has the following advantages over the RTE approach:
· Does not need a completed application to run
This approach also has several disadvantages:
· Much longer development time
· Needs redeveloping every time the application changes
· Does not test application servers
· Not as accurate as RTE simulations
Despite the certain disadvantages of developing the simulation from scratch, there are occasions where it still makes more sense. If the application does not yet exist, for example, it is the only choice. Also, if the simulation does not have many scenarios or much complexity, it may be more cost-effective to develop it without the aid of RTE software. For these reasons, this type of development cannot be discounted and will be covered in this chapter.
3.4 Building a Simulator Using RTE Software
An important part of the value of using RTE software is the availability of a variety of tools to assist in the process. These tools include
· Transaction capture tools
· Generalized programming interface
· Statistical analysis tools
These tools provide a fast path to a simplistic simulation of your application. In a matter of minutes, a simple simulation of your application can be running and the results analyzed.
The single most important attribute of RTE software, however, is that it simulates the real user, utilizing the same kind of input as the actual production workload. Sometimes the importance of this attribute is not apparent.
Using, for example, Pro*C, which has long been the favorite for benchmark applications, the database server is (mostly) communicated with at a SQL level. This allows SQL to be submitted and answers to be retrieved. Although this simplistic approach seems good in theory, in practice more control is required. SQL*Forms, for example, communicates with the database at the UPI layer a lower lever, Oracle-proprietor communication protocol. This allows more control over the opening and closing of cursors, the steps of execution, and subsequently the way that memory is allocated on the server.
The consequence of this is that a Pro*C emulation of the application will not produce the same work profile of the server as the real application, even if the statements are issued in the same order and with the same bind variable assignments. RTE software provides true emulation of the application's usage of the database server, complete with all its horors and triumphs.
Although RTE software provides an accurate way of simulating the application, it provides no benefit from a data perspective. It is most common for an application to be expecting certain data from the user. This may be as simple as a name, a reference number, or a date, but input is expected nonetheless. During normal use of the application this does not present a problem, because the user of the system is presented with this data from some source such as a customer in the store or paperwork on the desktop. In a simulated environment, this data is not available through these means, and so it needs to be generated in some way. This is the problem that is common to both methods of simulation.
3.5 Building a Custom Benchmark Suite
There are two common starting points in building an application simulator from scratch: situations in which the application (a) has been developed and (b) has not been developed.
3.5.1 Programming Environment
A variety of programming tools are available for use in developing a custom benchmark suite. While these tools have various merits of their own, it is recommended that the Oracle Call Interface (OCI) be used for this type of development. OCI is recommended for any programming exercise in which simple but absolute control is needed over actions requested of the RDBMS, because it is the closest thing to the explicit server instruction set in the UPI interface. OCI has a widespread reputation for being "hard," but this is really not the case. There are two things that make using OCI "hard" in comparison with standard C programming:
1. You need to know the steps involved in SQL processing from the server perspective.
2. You need massive parameter lists for each function that are mostly unused in C.
The truth is, you need to know the steps of SQL processing anyway in order to support a large system, and the parameters list problem can be easily fixed using macro wrappers for each OCI call used to eliminate redundant parameters. Once these obstacles are out of the way, OCI is in many ways more straightforward than Pro*C, because no precompilation is necessary.
3.5.2 When the Application Has Not Been Written
In order to reduce the degree of error in building initial simulations before the application has been developed, it is important to learn from some painfully gained experience. Most of this falls squarely in the "common sense" category, but it is surprising how much of it is frequently missed. When the application has not been written, the use of RTE software is clearly out of the question, and so a call-centric approach needs to be taken instead of an application-centric one.
First of all, it is important to get as close to the final transaction profile as possible, and when the application is not written this can be difficult. However, certain pieces of the application environment must be at least partially complete before it is worth conducting a preliminary benchmark. The specific things to look for are as follows:
·
Final data model. Unless the data model is virtually complete, the simulation will not be accurate. It does not need to be polished, but the basic structure of the model should be final. A classic stumbling block in this area would be where the data model becomes more normalized than the simulation, thus adding a good deal of complexity to several of the queries in the application. This could result in an undersized system.
·
Query profile. How many queries are executed within each transaction, and what do they look like (primary key single-row fetches, index-driven range scans, etc.)? Don't neglect "list-of-values" processing either, which can very quickly become a major part of the system.
·
Insert/update/delete profile. Which tables do we insert/update/delete, and how many rows are processed each time? Where do these activities occur in relation to the query workload and the commits/rollbacks?
·
Stored procedures. To what extent is application logic handled with database stored procedures? This is really hard to simulate adequately, because it is likely that you will be several months (or years) of development time away from knowing the answer to this question. It is a bad thing to have significant processing within stored procedures, in my opinion, because the primary goal of a stored procedure should be to reduce network round trips between the client and the server by bundling multiple SQL interactions into a single SQL*Net call. In fact, as your application is not written yet, you still have a chance to try to make sure that stored procedures are used for the right reasons.
3.5.3 If the Application Exists: Trap and Emulate All SQL Calls
When the application is in some kind of executable state, it is possible to capture all the calls to the database and replay them in a simulated environment during the benchmark. This is more straightforward in a two-tier environment, where there is a one-to-one relationship between the requests made through the application and the process executing the SQL against the database. In a three-tier environment, there is more complexity, but the same principles apply.
The starting point is to find a person who really knows the application and how it is used in reality. The perfect person to do this is a real-life user of the application, who can show you ways of operating the application that you, the developer, or the instructor could never imagine. This is as close to reality as you can get. If this is not an option that is available to you, then an expert developer is your only option, although the simulation will then take a more clinical, how-it-should-work type of model: try to make sure that you find some way to inject the "wild-card" factor into it.
For each application area that is defined, define the types of transactions that can be performed within it, and run through each of these screens while tracing the calls made by the session, as detailed in the next subsection. It's important to include the following actions when building the list of tasks performed by the system:
When running through each of the screens, it is very beneficial to use recognizable values for the fields that will need to be generated by the simulation, rather than using the explicit values used in the walkthrough. Values such as `
RND3HERE','
GEN2HERE' and `
12345','
99999' generally work well and can dramatically speed up later steps.
Tracing Sessions by Setting Events
Most people know of the "
alter session set sql_trace=true" technique for tracing the calls made from an application. This is a useful technique, but not as useful as it could be, because it does not contain the bind variable information for each cursor.
In addition to this well-documented approach is an undocumented method of tracing the activity of a session, a considerably more useful one for this type of activity. While useful for determining database access for use in a simulator, this method of tracing also represents one of the most powerful tools available for diagnosis of performance problems as they happen. For this reason, a full user's guide is required for this function, and so we will invest some book space accordingly.
The method for activating the improved tracing is the use of a database
event. An event is a signal to a database session that alters its behavior according to the definition of the event. The definition of the event is determined within the kernel code of the Oracle RDBMS and is not documented beyond brief descriptions in the message file. The message file is the same file that Oracle uses to look up NLS-specific error messages, such as "
max # extents (50) reached in table SCHEMA.TABLE" and, in the case of English language systems, this file is
?/rdbms/mesg/oraus.msg. Embedded in this file are the numbers and descriptions of the events that are defined within that particular Oracle release.
A word of caution, however: many (most) of these events are not for public consumption. For example, event 10066 simulates a file verification error:
10066, 00000, "simulate failure to verify file"
// *Action: level is file number to fail verification
|
These types of events are used internally to Oracle for the simulation of problems. For this reason, it is wise to stick to events that are known to be safe.
The event that we are interested in is 10046:
10046, 00000, "enable SQL statement timing"
|
This event is set with the following syntax:
10046 trace name context forever, level <x>
|
The level, or degree, of the tracing is set using numbers 1 through 15 in place of the
x. The level number corresponds to the setting of particular bits within the range 1 through 12 (
Table 3.1).
For this event, only bits 3 and 4 are used. With the event enabled for any level, the output includes the same attributes as
sql_trace=true. When bits 3 and 4 are set, additional information is returned that is extremely useful. Try using level 4 for just bind variables, level 8 for just wait states, and level 12 for all of the above.
The event can be set at either the
instance level or the
session level, and the event can be set by any one of three distinct methods:
1.
alter session set events "10046 trace name context forever, level x" (personal user session level)
2.
event="10046 trace name context forever, level x" in the
init.ora (instance level)
3. On a process-by-process basis using
svrmgrl oradebug (other user session level)
Method (
1) is the most straightforward method of setting the event on the correct session, but it is useful only if this call can be made by the application session as and when required. It involves the connected session issuing a SQL call, as detailed above, to set the event on itself.
Method (2) is useful when performing tracing en masse, because it operates at the instance level, forcing all sessions to set this event at start-up. This is the most appropriate method to use when performing application capture, because it eliminates the manual activation steps. It requires the event to be set in the
init.ora so that all sessions are affected by the change.
Method (
3) is used to set the event on any process from another session. This method is most useful for reactive problem diagnosis, because it can be set on any user session that is connected to the database and shows a consistent view of the user state, including calls and wait states. A further digression may be useful at this stage to cover the use of the "
svrmgrl oradebug" functionality that is available in releases 7.3 and above.
The
svrmgrl product that is shipped with the Oracle RDBMS contains a set of functions that were previously available only through a tool called
oradbx.
Oradbx was a binary that was shipped with prior releases of Oracle and needed to be explicitly built using the supplied makefile. This binary was not created as part of any standard installation . For this reason,
oradbx was not widely known in the Oracle community.
Now that the functionality is provided as part of
svrmgrl, it is available on every Oracle system without any manual intervention. The commands are accessed using the
oradebug prefix, as demonstrated below, to gain help on the facility:
HELP [command] Describe one or all commands
SETMYPID Debug current process
SETOSPID <ospid> Set OS pid of process to debug
SETORAPID <orapid> ['force'] Set Oracle pid of process to debug
DUMP <dump_name> <level> Invoke named dump
DUMPSGA [bytes] Dump fixed SGA
DUMPLIST Print a list of available dumps
EVENT <text> Set trace event in process
SESSION_EVENT <text> Set trace event in session
DUMPVAR <p|s|uga> <name> [level] Print/dump a fixed PGA/SGA/UGA variable
SETVAR <p|s|uga> <name> <value> Modify a fixed PGA/SGA/UGA variable
PEEK <addr> <len> [level] Print/Dump memory
POKE <addr> <len> <value> Modify memory
WAKEUP <orapid> Wake up Oracle process
SUSPEND Suspend execution
FLUSH Flush pending writes to tracefile
TRACEFILE_NAME Get name of tracefile
LKDEBUG Invoke lock manager debugger
CORE Dump core without crashing process
UNLIMIT Unlimit the size of the tracefile
PROCSTAT Dump process statistics
CALL <func> [arg1] ... [argn] Invoke function with arguments
|
Once again, this is a facility that should be used with extreme caution: the database can be corrupted beyond all repair if some of these commands-most notably POKE, SETVAR, and CALL-are used. In addition, requesting one of the background processes to perform an extensive dump will effectively suspend the real function of that process for that period, possibly hanging the database.
The commands in which we are most interested at this stage are
SETOSPID should be the first command issued, as it identifies the Oracle process (UNIX process) on which the tool should operate. This information is available from
V$PROCESS, joined to
V$SESSION on the
PADDR column. Next, the
UNLIMIT TRACE directive should be used to remove the limits set within the
init.ora for restricting the size of a tracefile. This makes sense, because you will manually turn off the trace when you have all the required information. Finally, the event can be set, as detailed above. From this point forward, a tracefile will be created in
user_dump_dest, of the form
ora_PID.trc. To be sure of locating the correct file, the
TRACEFILE_NAME directive can be used om
oradebug.
Whichever method is most appropriate for tracing, it can be turned off again on the session level by using the following command:
10046 trace name context off
|
Using this event provides the following advantages over the
sql_trace=true method:
· It can be set for a running session without the user being aware of it, using the
svrmgrl oradebug facility.
· It shows wait states that the execution of each statement goes through.
· It details binding information for each cursor.
The bind information can be used in generating the simulation code, and the ward states can be used in the tuning of application SQL.
Here's a sample piece of output from a 10046-generated tracefile, using level 12 to get all available information:
PARSING IN CURSOR #1 len=50 dep=0 uid=20 oct=3 lid=20 tim=3802520068 hv=650761597 ad='c12dc1f0'
select count(*) from some_table where some_col=:b1
PARSE #1:c=8,e=10,p=2,cr=29,cu=1,mis=1,r=0,dep=0,og=0,tim=3802520068
bind 0: dty=2 mxl=22(22) mal=00 scl=00 pre=00 oacflg=03
bfp=400e7700 bln=22 avl=02 flg=05
EXEC #1:c=0,e=0,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=4,tim=3802520068
WAIT #1: nam='SQL*Net message to client' ela= 0 p1=1650815232 p2=1 p3=0
WAIT #1: nam='db file scattered read' ela= 1 p1=1 p2=4752 p3=5
WAIT #1: nam='db file scattered read' ela= 0 p1=1 p2=4882 p3=2
FETCH #1:c=1,e=1,p=7,cr=7,cu=3,mis=0,r=1,dep=0,og=4,tim=3802520069
WAIT #1: nam='SQL*Net message from client' ela= 0 p1=1650815232 p2=1 p3=0
FETCH #1:c=0,e=0,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=0,tim=3802520069
WAIT #1: nam='SQL*Net message to client' ela= 0 p1=1650815232 p2=1 p3=0
WAIT #1: nam='SQL*Net message from client' ela= 0 p1=1650815232 p2=1 p3=0
STAT #0 id=1 cnt=0 pid=0 pos=0 obj=0 op='SORT AGGREGATE '
STAT #0 id=2 cnt=414 pid=1 pos=1 obj=0 op='FILTER '
STAT #0 id=3 cnt=414 pid=2 pos=1 obj=1876 op='TABLE ACCESS FULL SOME_TABLE'
WAIT #1: nam='SQL*Net message to client' ela= 0 p1=1650815232 p2=1 p3=0
WAIT #1: nam='SQL*Net message from client' ela= 0 p1=1650815232 p2=1 p3=0
|
In this tracefile extract, there are examples of all of the attributes found in this type of trace. For this trace,
timed_statistics was set
true before the trace was initiated. Starting at the top of the trace, the "
PARSING IN CURSOR" section shows the parsing of a new cursor and the assignment of the "#1" cursor number to it. The tokens listed in Table 3.2 describe information about the cursor and its context.
This section is immediately followed by the actual statement itself, which is terminated by the "
END OF STMT" declaration. Immediately below this are the statistics for the
PARSE call itself. This is the format taken for all subsequent tokens: <ACTION> <CURSOR#>: <data>. In this case, the
PARSE directive reports back with the standard information used by
tkprof as shown in Table 3.3.
The
tkprof output for this statement would look like this:
********************************************************************************
some_table where some_col=:b1
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.08 0.10 2 29 1 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 2 0.01 0.01 7 7 3 1
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 4 0.09 0.11 9 36 4 1
Misses in library cache during parse: 1
|
By reading through the output in the raw tracefile, it can be seen how these standard statistics are used by
tkprof. The count column equates to a count of the number of calls made during the lifetime of this statement. In this example, two fetch calls were made, but only one each of parse and execute. The remaining columns are totaled from all calls of that type.
The next component of the raw tracefile is the BINDS section, activated by setting bit 3 when specifying the trace level. In this section, full details of the bind variables used for this execution of the statement are provided. The initial BINDS #<cursor> token is immediately followed by an indented flow of all bind variables specified. The binds are always shown starting with "bind 0," regardless of the naming of the bind variables in the statement. The order in which the binds are listed is the order in which they appear in the SQL statement: top to bottom.
Within the BINDS section, several details are given for each bind variable, although in practice it is the
dty and
value fields with which we are mostly concerned. The
dty field shows the datatype of the bind variable, where 1=
VARCHAR2, 2=
NUMBER, etc.: see the
OCI Programmer's Guide for details. The
value field shows the actual value bound to that variable. If the
value field is absent, this means that no value was bound (
null value) or that the variable is an output variable.
The next section in the raw tracefile is the EXEC call. It is in this execution step of a SQL statement that Oracle will build the result set for a cursor in the case of a query, or actually execute the INSERT, UPDATE, or PL/SQL that the cursor contains. The fields available here are the same as those described during the PARSE phase.
During the execution of this statement, the session passed through several wait states before the result set was built. These wait states can be serially detailed in the tracefile by initiating the trace with bit 4 set in the level (level 8 or 12). This is one of the most useful features of this type of trace, and the detail of the WAIT lines will be used later in the book. Within the WAIT line, there are some new tokens, as shown in Table 3.4.
This information is the same as that found in
V$SESSION_WAIT, with the three parameter fields having different meanings depending on the actual wait state. The labels for the
p1, p2, and
p3 fields can be derived by querying
SYS.X$KSLED, giving the name of the wait as the value for
KSLEDNAM.
The next entry in the raw tracefile is the
FETCH call. In this case, the call is made twice, with the second call returning no data (r=0). This is fairly common, because applications often retrieve data from a cursor until the cursor returns "no more data."
The final area of interest in the tracefile is the
STAT section. There is one of these for each SQL cursor in the tracefile, and it details the actual access plan used for the statement, including the number of rows inspected in order to satisfy the request. One interesting thing to point out here is that it looks as if the version of Oracle that I used to create this tracefile has a bug, in that the
STAT section is referring to cursor #0, which does not exist. The
STAT section should be referring to cursor #1, as the "
TABLE ACCESS FULL SOME_TABLE" operation shows.
The Oracle Call Interface (OCI) Fast Track
The impenetrable
OCI Programmer's Guide makes OCI look like a time-intensive, difficult-to-understand environment. However, this really isn't the case, and it makes good sense, particularly when trying to build an application simulator, to develop some knowledge of OCI. The steps involved in OCI programming relate directly to the required calls from a client to the server in order to service SQL requests, and it is useful to be familiar with them.
The good news is that there is nothing at all difficult about OCI. In fact, if we ignore the fancy things such as object support, then a reasonable familiarity should only take a few pages. One of the most useful aids to learning OCI is an OCI flowchart, showing the required steps for a given statement (see Figure 3.2).
One thing to note from the flow diagram, and indeed from all the code examples in the book, is that the release 7.x OCI concepts are used. The reason for this is that OCI changed substantially in Oracle8, not least to provide object support. The effect of the changes is that the native OCI supplied with Oracle8 is nearly 100 percent different from the release 7.x OCI and adds some complexity to previously simplistic programs. The 7.x concepts lend themselves better to simplistic explanation, and all the concepts and functions are still available in the 8.x libraries for backward compatibility. Finally, retaining the old call API allows the code to be used against Oracle7 instances.
Starting at the top of the flowchart and working down, the first item is initialization of the environment. All that this means is that a structure of type Lda_Def (Login Data Area) and a corresponding HDA (Host Data Area) need to be allocated. The Lda_Def definition is found in a header file supplied by Oracle and is stored in
$ORACLE_HOME/rdbms/demo; the HDA is simply an array of 256 bytes. Next, a connection needs to be made to the Oracle database, using
olog(). This connects the client session to the database and populates the Lda_Def and HDA structures with context information for the connection.
The
oopen() call is used to initialize a cursor for the SQL statement and requires a structure of type Cda_Def to be supplied in order to maintain the cursor context. At this stage, no calls have been supplied with any SQL statements to process. The only function that takes the actual SQL request as one of its arguments is that of
oparse(). This parses the request on the server, ensuring that the statement is syntactically correct, that the submitting user has permission to execute that request, and that all required data dictionary work is complete. This stage also enters the request into the shared pool for later reuse by this or other users. The actual server-side parse activity can be deferred until the describe stage, with the
oparse() call returning immediately without any server interaction.
If the statement has bind variables, a local address must be assigned to each bind variable so that the OCI library knows where to look for values when executing the statement. After this, if the request is a query, the SELECT list must be expanded into full detail by issuing the
odesc() call. This function should be called repeatedly until all items in the SELECT list have been described. The information returned from this call is the name of the column and the details of datatype, size, and so on, which should be used to allocate sufficient memory for the returned data.
At this stage, it is possible to execute the cursor, although it is more logical to call the
odefin() function at this point. The
odefin() call is used to inform the OCI library of the addresses in memory that have been allocated for the return of the data, as defined in the describe stage.
Once the storage locations have been defined, the cursor can be executed using
oexec(). If the cursor is anything other than a query, this is the final step of the execution. If the cursor is a query, the execute stage will execute the query and build the result set but will not return any data until the next step.
The final step for a query-based cursor is to fetch the data back to the client using
ofen(). The
ofen() call allows array fetches to be performed, and should always be used in preference to single-row fetches where more than one row is to be returned. The fetch routine should be called repeatedly until no more data is needed.
At this point, regardless of the type of cursor (SELECT, INSERT, UPDATE, or DELETE), the cursor can either be closed, freeing up the cursor on the server, or be reexecuted using the same or different bind variables. The statement does not need to be rebound to achieve this: the contents of the addresses already bound to the variables can simply be updated with the new values.
That's it. In a nutshell, that is all there is to OCI from the release 7.x perspective. There is a good deal more complexity available in OCI, but the information above covers 90 percent of all the occasions for which you would use it. The important thing is to resist being intimidated by the number of parameters required for each function call in the OCI library. At the end of the day, most of these require a 0, -1, or null pointer, because they are there for compatibility with the
Cobol version of OCI. The best way to simplify your life if you expect to be doing a lot of OCI coding is to make a set of macros around the calls most used in order to fill out the redundant fields automatically.
3.5.4 Using Tracefiles to Generate a Simulation:
An Introduction to
dbaman
Once you have a passing familiarity with OCI, the contents of a SQL tracefile start to look very familiar. In fact, the tracefile is an accurate journal of the work performed by the server and can be mapped directly to a corresponding set of OCI calls. This allows a tracefile to be used as the source of a kind of playback utility, one that reads the tracefile and reapplies the same requests to the server in the same order and with the same parameters. One way to do this is to adopt a two-stage process involving some custom software and an awk script (see Figure 3.3).
This is the approach that will be presented over the next few pages, starting with the
dbaman component of the process.
dbaman is a utility that has many uses. It was originally written to enable OCI programs to be written as interpreted scripts, the idea being that it could then be used as a general-purpose development language for administration of the database, hence the name dbaman[ager]. However, it soon became clear that this tool could be used for a series of other, more specific tasks with only slight modification. The one we are primarily interested in here is tracefile replay utility.
dbaman is an extension of Tcl (Tool command language). Tcl, pronounced "tickle," was written by John Ousterhout at the University of California at Berkeley. It was designed to be an extensible scripting language-that is, a language that can be extended in functionality by the user. It is the result of Ousterhout becoming tired of developing new languages every time a new specific command set requirement came up. He therefore developed Tcl as a core scripting language that he could customize each time a new requirement appeared. This reduces the coding effort and eliminates the need to learn a new language each time.
Tcl is freely available over the Internet. The primary location for obtaining Tcl is http://www.scriptics.com, which is the commercial company set up by John Ousterhout to bring Tcl into the corporate mainstream. It offers full development environments for Tcl and professional support, although Tcl itself remains a public domain product. For simplicity, version 8.0 of Tcl is included on the accompanying CD.
When the Tcl software is built, the result is a library,
libtcl.a. If a
main() routine is linked in with the library, the result is the Tcl shell (tclsh). This is the Tcl equivalent of the Korn shell, on which Tcl programs (scripts) are executed. The scripting language provided by Tcl is easy to learn, flexible, and fast, making it suitable for many development tasks.
The Tcl library provides functionality that enables additional commands to be easily added to the language. It is this functionality that makes Tcl the powerful tool that it is, and is the reason that it was adopted for
dbaman.
With all the extensions linked in, the result is a single Tcl shell language environment with additional capabilities, such as the ability to maintain an Oracle connection and submit requests to the database server from the shell, as can be seen in the following list of available routines within
dbaman:
The example above shows all the functions that the
dbaman extension externalizes to the Tcl interpreter. Used together, these additional commands allow the full playback of tracefiles against the database server. Table
Table 3.5 details each of the commands and their usages.
The only real complexity in this process is in the binds section. First, in order to build usable data structures containing the bind information for a given cursor, the
dbaman code needs to make a half-hearted attempt at parsing the SQL text to determine which bind variables need to be bound or not bound. Second, although the VARCHAR2 datatype can be used for just about everything, the DATE datatype is the exception to that rule (as ever). Dates must be bound by position using the
oci_bind_pos -date operation. This directive instructs
dbaman to convert the date string into a native Oracle format.
Preprocessing and Executing the Tracefile
Once the
dbaman executable is built, some method of generating scripts from the raw tracefile is needed. This is done using an awk script to convert the trace information into the corresponding
dbaman/OCI commands. This is imaginatively named
tclconv.awk.
This conversion script parses the tracefile and produces code that can be directly executed by
dbaman:
oci_parse 256 { alter session set events `10046 trace name context forever, level 4'}
oci_parse 1 { alter session set nls_language= `AMERICAN' nls_territory= `AMERICA' nls_currency= `$' nls_iso_currency= `AMERICA' nls_numeric_characters= `.,' nls_date_format= `DD-MON-YY' nls_date_language= `AMERICAN' nls_sort= `BINARY' nls_calendar= `GREGORIAN' }
# alter session set nls_language= `AMERICAN' nls_territory= `AMERICA' n ...
oci_parse 2 { SELECT EMPNO,ENAME,JOB,MGR,HIREDATE,SAL,COMM,DEPTNO,
ROWID FROM EMP WHERE (ENAME LIKE :1) }
# SELECT EMPNO,ENAME,JOB,MGR,HIREDATE,SAL,COMM,DEPTNO,ROWID FROM EMP WH ...
# SELECT EMPNO,ENAME,JOB,MGR,HIREDATE,SAL,COMM,DEPTNO,ROWID FROM EMP WH ...
|
This example shows the output produced from the first 89 lines of a SQL tracefile. The trace was taken from a SQL*Forms 4.5 program that queries the EMP and BONUS tables, and is intended to serve as a simplified example of the simulation process. In this section, we connect, set our NLS preferences (generated automatically by SQL*Forms), and then run a query against EMP. A fetch of a single row is attempted, which returns no data. The cursor is then canceled, rebound, and then executed once again, this time returning data. Running this script with
dbaman, we can go on to interrogate the status afterwards:
1,status 2,0 2,status 2,rpc 2,header
% puts $results(1,status)
% puts $results(2,status)
Cursor #1 - Text : alter session set nls_language= 'AMERICAN' nls_territory= 'AMERICA' nls_currency= '$' nls_iso_currency= 'AMERICA' nls_numeric_characters= '.,' nls_date_format= 'DD-MON-YY' nls_date_language= 'AMERICAN' nls_sort= 'BINARY' nls_calendar= 'GREGORIAN'
Cursor #2 - Text : SELECT EMPNO,ENAME,JOB,MGR,HIREDATE,SAL,COMM,DEPTNO,
ROWID FROM EMP WHERE (ENAME LIKE :1)
% puts $results(2,header)
EMPNO ENAME JOB MGR HIREDATE SAL COMM DEPTNO ROWID
7369 SMITH CLERK 7902 17-DEC-80 800 {} 20 AAABLnAACAAACoQAAA
|
It can be seen that two cursors are open at the end of the script: the NLS cursor, and the EMP cursor. Because it was an
ALTER SESSION call, the NLS cursor (cursor 1) was taken only to the EXEC stage. Therefore, it has only a
$results(1,status) variable indicating its status; no data is returned from this cursor. The cursor is not closed in the script, and so the full state of the cursor is preserved within
dbaman.
The EMP cursor (cursor 2) has produced many more variables in the
$results array. The status of the cursor execution shows that it returned successfully from the FETCH. The
oci_fetch call is the only call that will set the status variable to anything more useful than
OK or
NOT OK. The
oci_fetch call will set the variable to "
no data found," if the fetch returns no data.
The data returned from cursor 2 is put into the variable
$results(2,0). One NULL, as denoted by the
{} in the output list, was returned. One row was returned, as shown by
$results(2,rpc), and the column names are all stored in the
$results(2,header) variable. Example output from the
oci_list command can also be seen, showing the SQL for each of the currently open cursors.
Using this example code fragment, we can turn this into a single user simulation using standard Tcl coding:
We now have a very simplistic single-user simulation script. In this case, it repeatedly performs the same operations, although in reality you would want to vary the actual work performed each time. This is a straightforward change to make, with this kind of a construct replacing the
oci_bind_pos 2 0 "
SMI%":
switch [ expr $max_iter % 5 ] {
0 { oci_bind_pos 2 0 "JON%" }
1 { oci_bind_pos 2 0 "SMI%" }
2 { oci_bind_pos 2 0 "MIG%" }
3 { oci_bind_pos 2 0 "ALB%" }
4 { oci_bind_pos 2 0 "WHI%" }
|
Again, this is still very simplistic, but demonstrates the kind of constructs that should be built into the final script. For some of the more critical data points in the simulation, you will need some kind of more generic way to supply data to the script, particularly when simulating more than one user at a time. This will be discussed in Section 6.2.
3.5.5 Validate Server-Side System Utilization
Once each single transaction is complete, run it through on your test system and compare it with a real user from the perspective of system utilization, including CPU, memory, and disk I/O. If you have done a good job, it should be possible to get the system utilization pretty close to that of a real user.
One of the checks that you should perform is to trace the simulated session in the same way that the original tracefile was produced and compare it with the original tracefile. This is the reason why the
tclconv.awk script turns on a trace as the very first thing it executes. While there will almost certainly be differences in the number of recursive calls within the respective sessions, running the before and after tracefiles through the
tkprof facility will quickly split these out and report back accordingly.
Tests performed on the
tclconv.awk/dbaman combination have thus far yielded very good results as far as accuracy is concerned. Using a transaction from a very complex Forms application, the following tkprof summaries are created with the original Form and the simulation session, respectively:
OVERALL TOTALS FOR ALL NON-RECURSIVE STATEMENTS
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 80 0.00 0.00 0 2 0 0
Execute 147 0.00 0.00 1 157 65 83
Fetch 162 0.00 0.00 0 1436 219 324
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 389 0.00 0.00 1 1595 284 407
Misses in library cache during parse: 0
OVERALL TOTALS FOR ALL NON-RECURSIVE STATEMENTS
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 82 0.00 0.00 0 0 0 0
Execute 149 0.00 0.00 0 15 68 35
Fetch 163 0.00 0.00 7 1913 220 326
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 394 0.00 0.00 7 1928 288 361
Misses in library cache during parse: 0
Generated by 'TKPROF SYS=NO'
|
The
tkprof summaries show some differences, mostly in the execute/rows cell, but otherwise all looks good. One of the reasons that this value is different is a shortcoming in
dbaman: the inability to simulate array inserts. This results more from the shortcomings of the trace facility, because the bind variables are dumped only for one of the rows to be inserted.
More information about the accuracy can be gained by querying
v$sesstat for the respective sessions and determining the differences. For a session similar to the one above, the statistics in Table 3.6 were generated.
Some of these differences could be associated with different activities going on in the database at that point in time, particularly the cluster-related items: these cluster statistics are attributed to recursive work performed by the session. However, there are large differences in other areas also:
·
SQL*Net. Clearly, the
dbaman tool was a good deal more verbose over SQL*Net than the original Forms session. This could be attributed mostly to the use of a different API (application programming interface) into the database server. Forms uses the proprietary UPI layer, whereas
dbaman has to use OCI.
·
Memory utilization. Again,
dbaman used a good deal (30 to 53 percent) more server-side memory than the Forms session. This could also be attributed in part to the use of a different API. In addition, any variations in the other statistics could radically change this ratio.
·
Amount of data read. The
dbaman session shows that it returned more data than the Forms session. Once again, the API could come into play here, but it is most likely to be affected by differing amounts of
recursive work performed by each session. Although the
dbaman session made more recursive calls, this does not mean that it performed as much work.
·
Other. Not all information is explicitly logged in the tracefile. Therefore, some amount of guesswork needs to be done in order to work out, for example, where cursors should be closed.
In summary, however,
dbaman is capable of a reasonably good simulation of the Forms session. Some of the discrepancies may balance out over time (particularly the memory-related ones), while others are already very close on target.
3.5.6 Building a Multiuser Framework
Once we have a selection of accurate single-user simulations, we can start to think about how to convert them into a benchmark suite. In order to do this, it becomes necessary to build a simple control framework for all of the sessions. The framework should allow the following actions to be performed:
· Start specified number of sessions.
· Update active set of users.
· Update "sleep time" multiplier for each transaction type.
· Update logon/logoff rate for simulated sessions.
This kind of a framework is an important method of keeping the simulation scientific. Without this kind of control, the simulation is not controllable for long enough to gain repeatable, dependable results.
The complexity of this kind of framework varies, depending on the scale of the simulation. It can range from a simple shell script that starts up the required number of sessions, with checks that keep the number of connected sessions at the required level, all the way to a full UNIX IPC (Interprocess Communication) implementation using shared memory, semaphores, and message queues (see
Section 3.6).
The extent of engineering required for the framework is dependent on the amount of control needed and the number of users that the simulator will be driving. Once several hundred sessions are trying to do the same thing on the system at the same time, some of the more simplistic techniques no longer scale adequately.
One of the most important portions of any simulation, whether written from scratch using OCI or using a third-party tool (see
Section 3.4), is the portion that provides the simulating sessions with data that
· Is usable by the application, with no greater collisions in data access than would occur in real life
· Does not provide the database server with an artificially high cache hit ratio
· Does not provide the database server with an artificially
low cache hit ratio
· Is plentiful. (It makes no sense to run out of driving data 1 hour into an 8-hour stress test)
This is an area that is unique to each particular application and should not be overlooked: the simulation will not scale sufficiently unless the data needs of the application are satisfied.
For example, assume that we are simulating an application that allows the creation of invoices and the subsequent processing of payments against those invoices. During the invoice creation stage, the users simply type in the part number and the quantity; the details of the line items are automatically filled in by the application. When all the lines are entered, the transaction is committed and an invoice number is generated.
Later on in the life cycle of the invoice, a payment is received against it. This occurs several weeks or months after the original invoice was created. At this stage, the total on the invoice is checked, and the credit is applied.
If we are simulating this application, by whatever means, it would be tempting to have each simulated session enter an invoice using a canned set of line items and then proceed to post the credits against the invoice in order to make the simulation self-perpetuating. Unfortunately, this would result in the dreaded "optimal transaction profile," which spoils the results of many benchmarks.
The transaction becomes optimal for the following reasons:
· The same line items are used each time. This means that the database validation on the part numbers will always be resolved by a small number of cached index leaf blocks and their corresponding table data blocks. This means the blocks will always be hot in the buffer cache, ensuring that the lookups require zero disk I/O.
· The retrieval of the invoice by invoice number is also certain to be in cache, because it has just been committed by the same session a few seconds ago. In real life, the invoice is likely to be somewhat older and therefore more likely to require several disk reads to obtain the data.
It can be seen that a workload with this profile is nothing like real life. It is far more realistic to have two distinct sets of connections, each fulfilling one of the functions of data entry, or credit posting. The data that each uses should not be shared, and must be prebuilt and loaded into the database as in-flight transactions. Any lookup information should be as random as it would be in real transactions. For example, the line item lookup in the example above should be driven from a very much larger list of potential line items, making life a little more realistic for the database server.
Many of the RTE packages provide functionality to drive the sessions with different sets of data, and this kind of functionality is not difficult to build into a custom simulation suite.
3.6.2 User Control Problems
Another problem in trying to scale a simulator is control of the user sessions. In simulating many thousands of users from potentially one driver machine, the system can behave in very peculiar ways, and controlling the users may not be as straightforward as it seems.
As mentioned in
Section 3.5.6, UNIX Interprocess Communication (IPC) facilities are a very good way of providing user control. Comprising shared memory, semaphores, and message queues, IPC provides a flexible way to control the sessions and to supply the simulator with the required data. Figure 3.4 shows one way in which IPC can be used to control the sessions and feed them with data.
In Figure 3.4, all three types of IPC have been used for various reasons. Access to the IPC mechanisms is provided through a common set of libraries. The IPC in this example is used as follows:
· Shared memory is used to supply each session with the data it requires. The shared memory segment is populated by a data generator, which maintains lists of operating data in the segment. In addition, the user sessions can use the shared memory to log statistics about themselves during a run and to return data for use elsewhere.
· Message queues are used as pipes for submitting data to the sessions, in addition to providing control messages such as sleep multipliers.
· Semaphores are used by the sessions themselves for explicit serialization operations, where necessary, and as a global "active list" that ensures that the correct number of processes are active at any one time.
The use of IPC is a good way to provide a global control point for the system, although caution needs to be observed in regard to the way some systems behave when many sessions all use the same instance of the IPC mechanism. For example, I once worked on a benchmark that used UNIX message queues as the mechanism for controlling the active user set. All of the sessions would sleep, blocking on the read of the message queue, until a message was sent to them to start work. The problem here was that, although only one session would receive a message to start work at any one time, all 3,000 of the user sessions would wake up (at kernel priority) and go on the run queue each time a message went on the queue. This prevented the system from scaling to the point where we needed it to go, and so another mechanism had to be found. In this case, the easiest and quickest way to fix the problem and proceed was to define
many messages queues and have a smaller number of users waiting on them. This certainly fixed the problem well
enough for us to proceed with the testing, although we would probably have done it in some other way if the circumstances had been different.
3.6.3 Simpler Methods for Use with
dbaman
Although IPC is very flexible, its use may seem a little complex or even daunting at first. If this is the case, or if time is not available to produce an IPC solution, a more simplistic solution can yield fairly accurate results, albeit with somewhat less flexibility. In this section, we will cover a more simplistic approach that can be integrated into a
dbaman simulation with only simple script and infrastructure changes.
The following section describes what is available in this infrastructure.
The user control process allows one-time setup prior to the execution of the test and a way of terminating the test cleanly, which are considered the bare essentials required during a benchmark. Highly desirable control points such as midflight timing changes, live status messages, and guaranteed active user counts are not provided in this model.
The simulator is provided with data on a per-process basis with pre-prepared datafiles for each process.
If these facilities are adequate, it can be relatively straightforward to build the required infrastructure. First of all, a common start-up script needs to be created, containing all the presets for the run:
Calling this start-up script from a master program that assigns unique IDs to the start-up script, all the scripts can pick up a private datafile from which they read a line for each transaction. If we were to use this to simulate the invoice entry example, we would invoke the simulator as follows:
dbaman startup.tcl ${cnt} inv_entry.tcl user${cnt}/pass${cnt} &
|
This would initiate 1,000 invoice entry users, each trying to connect with a username derived from its ID number and attempting to use a datafile with the same ID number. The datafile would contain the part numbers that the session was to enter and would need to be prerandomized in order to avoid the optimal transaction profile.
The script itself can be modified to take advantage of all the facilities now available:
This script now logs in as the specified user and reads through its private datafile, inputting line items every
$think_time and pausing for
$tx_sleep between each complete invoice. Nominal logging has been provided by the total transaction time being logged after each transaction.
This is just a simple example and does not provide a great deal of control. However, it can be seen that most changes are relatively simple to implement owing to the use of a scripting engine for the simulator.
Once the simulation software is complete, or fully configured if third-party software has been used, it is important to finish the process by making the entire process
shrink-wrapped.
It is likely that, after all the effort you have just expended, you will want to reuse this software on numerous occasions. The first step toward this is to make sure that there is nothing in the benchmark suite that is terribly nonportable. This includes all the usual C coding portability warnings, such as relying on the size of a pointer, but equally includes any reliance on the location of system files, Oracle files, and so on. This is worth thinking about at the start of the development, and all the way through whenever something potentially system-specific is used.
A good finishing touch to all of the portability work is to have one central file that contains all of the lookup information required by the simulation, such as:
This file then serves as a checklist after the installation of the software as to what will need to be configured.
Again, you will probably want to run this software a few times, so make it easy to install. Package the software on a single tape (or a range of tapes for maximum portability), and provide setup and running instructions with the software. Make the directory structure for the simulation intuitive, and keep the content of each directory concise for all the main directories. This would include keeping all the source code, the binaries, the driving data sources, the run statistics, and the log files separate from each other.
When you are developing the simulation, you will become an expert in all the nooks and crannies of the software. It makes sense to use this knowledge to write some start-up scripts for the simulation, because you will have become tired of typing in the same old commands, with their many parameters, over and over again. This is especially useful because during execution of subsequent benchmarks, the start-up process could be executed many times a day.
The benefit of having good, well-documented start-up scripts extends far beyond the laziness element of typing them in all the time, however. Benchmarks frequently run long and late into the night, and having good start-up scripts can make the difference between stupid mistakes and high-quality, measurable benchmark results. In addition, the next time the simulator is installed, it is likely that you will have forgotten many of the idiosyncrasies of the software.
3.7.4 Automating Information Retrieval at Closedown
When a successful benchmark execution has been performed, it is likely that there will be a lot of data in widely distributed places that you will want to keep and log for that test. Among such data are the following:
· Test start-up parameters
· Test notes, including reason for test parameters, changes since last run, and why changes were made
· Response time and throughput statistics from the simulator sessions
· System statistics from all machines involved
· Database statistics from database server
It is very frustrating to forget to retrieve some of this information at test closedown, because it could invalidate the entire test at the analysis stage. Therefore, invest sufficient time in creating information gathering and filing systems. These routines can also be responsible for the clearing down of logging areas at the end of a test phase in order to facilitate the next test session.
3.8 Define Limitations in Advance
3.8.1 A Benchmark Is Never Perfect
As we discussed earlier, it is almost impossible to create a 100 percent accurate simulation of a system. It is possible to get it pretty close, however, and it is important that the proximity to real life be documented. Some of the specific areas that would definitely need to be documented as deficiencies were noted in
Section 3.2.3, but there are many more that will come up in the development of the simulator. It is important to accept that the perfect simulation is not possible, and to make sure that all limitations are clearly defined and made available for use in any results analysis.
3.8.2 Measure the Impact of the Inaccuracies
Assuming that the benchmark is not perfect, how far off the mark are we? What is the impact of being erroneous? Which areas of the system must we get absolutely correct, and which areas allow some room for error?
These are all good questions to answer up front. If the simulation shows that it is using 5 percent less CPU per user than a real user, then this needs to be factored into the analysis, and a risk assessment needs to be produced. If the results show that an eight-processor machine is needed, and the machine will scale well up to 12 or 16 processors, then this is not a critical issue, because the option of adding more processors is available if all else failes.
Throughout this chapter, we have covered a wide variety of topics, ranging from choice of tools to development of a custom tool. This is indicative of a real benchmark, because a successful benchmark depends on diverse skill sets from the people developing and running it.
A complete solution to the benchmark problem has not been presented in this chapter. To a degree this has been intentional, because the most important attribute of all is to
understand every minute aspect of the benchmark operation. The intent of this chapter has been to provide pointers and a starting point for what is a very complex task.
When you are actually executing the benchmark, paranoia is a fine quality. If something doesn't seem quite right, or somebody does something that magically solves a problem, please be suspicious and ask questions. It is easy to have the wool pulled over one's eyes, and there is a good deal of wool involved in most benchmarks.
Most of all, enjoy the process. At the end of the benchmark, you are likely to understand a great deal about all aspects of the system, including the application, the tuning approach required for Oracle, the quirks of the hardware platform-everything.
Aho, Alfred, B. W. Kernighan, and P. J. Weinberger. 1988.
The AWK Programming Language. Reading, MA: Addison-Wesley.
Ousterhout, J. 1994.
Tcl and the Tk Toolkit. Reading, MA: Addison-Wesley.
Various. "Oracle Call Interface Programmer's Guide."
Oracle RDBMS Documentation.
1
Expect can be downloaded for free from http://expect.nist.gov.