Functions of a DBMS
In this section we look at the types of function and service we would expect a DBMS to
provide. Codd (1982) lists eight services that should be provided by any full-scale DBMS,
and we have added two more that might reasonably be expected to be available.
(1) Data storage, retrieval, and update
A DBMS must furnish users with the ability to store, retrieve, and update data in the
database.
This is the fundamental function of a DBMS. As the discussion in Section 2.1 makes clear,
in providing this functionality the DBMS should hide the internal physical implementation
details (such as file organization and storage structures) from the user.
(2) A user-accessible catalog
A DBMS must furnish a catalog in which descriptions of data items are stored and
which is accessible to users.
A key feature of the ANSI-SPARC architecture is the recognition of an integrated system
catalog to hold data about the schemas, users, applications, and so on. The catalog is
expected to be accessible to users as well as to the DBMS. A system catalog, or data
dictionary, is a repository of information describing the data in the database: that is, the ‘data
about the data’ or metadata. The amount of information and the way the information is
used vary with the DBMS. Typically, the system catalog stores:
• names, types, and sizes of data items;
• names of relationships;
• integrity constraints on the data;
• names of authorized users who have access to the data;
• the data items that each user can access and the types of access allowed; for example,
  insert, update, delete, or read access;
• external, conceptual, and internal schemas and the mappings between the schemas, as
  described in Section 2.1.4;
• usage statistics, such as the frequencies of transactions and counts on the number of
  accesses made to objects in the database.
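As a rough illustration of a user-accessible catalog, the sketch below asks a DBMS to describe its own data. It assumes SQLite (through Python's standard sqlite3 module) purely for convenience; the text does not name a product, and the staff table and its columns are hypothetical.

```python
import sqlite3

# Hypothetical table for illustration; the names are assumptions, not from the text.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE staff ("
    " staff_no TEXT PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " salary REAL CHECK (salary >= 0))"
)

# sqlite_master is SQLite's built-in catalog table: one row per schema object.
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'")]

# PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk) per column.
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(staff)")]

print(tables)   # names of the tables recorded in the catalog
print(columns)  # (column name, declared type) pairs for staff
```

SQLite's catalog covers only part of the list above: it records names, types, and constraints, but SQLite has no user accounts, so authorized users and access rights would have to be looked up in the catalog of a multi-user DBMS.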
The DBMS system catalog is one of the fundamental components of the system. Many of
the software components that we describe in the next section rely on the system catalog
for information. Some benefits of a system catalog are:
• Information about data can be collected and stored centrally. This helps to maintain
  control over the data as a resource.
• The meaning of data can be defined, which will help other users understand the purpose
  of the data.
• Communication is simplified, since exact meanings are stored. The system catalog may
  also identify the user or users who own or access the data.
• Redundancy and inconsistencies can be identified more easily since the data is centralized.
• Changes to the database can be recorded.
• The impact of a change can be determined before it is implemented, since the system
  catalog records each data item, all its relationships, and all its users.
• Security can be enforced.
• Integrity can be ensured.
• Audit information can be provided.
Some authors make a distinction between system catalog and data directory, where a data
directory holds information relating to where data is stored and how it is stored. The
International Organization for Standardization (ISO) has adopted a standard for data dictionaries
called Information Resource Dictionary System (IRDS) (ISO, 1990, 1993). IRDS
is a software tool that can be used to control and document an organization’s information
sources. It provides a definition for the tables that comprise the data dictionary and the
operations that can be used to access these tables. We use the term ‘system catalog’ in this
book to refer to all repository information. We discuss other types of statistical information
stored in the system catalog to assist with query optimization in Section 21.4.1.
(3) Transaction support
A DBMS must furnish a mechanism which will ensure either that all the updates
corresponding to a given transaction are made or that none of them is made.
A transaction is a series of actions, carried out by a single user or application program,
which accesses or changes the contents of the database. For example, some simple transactions
for the DreamHome case study might be to add a new member of staff to the database,
to update the salary of a member of staff, or to delete a property from the register.
A more complicated example might be to delete a member of staff from the database and
to reassign the properties that he or she managed to another member of staff. In this case,
there is more than one change to be made to the database. If the transaction fails during
execution, perhaps because of a computer crash, the database will be in an inconsistent
state: some changes will have been made and others not. Consequently, the changes that
have been made will have to be undone to return the database to a consistent state again.
We discuss transaction support in Section 20.1.
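The all-or-nothing behaviour can be sketched as follows, again assuming SQLite via Python's sqlite3 module. The table layout, the staff and property numbers, and the fail_midway switch (which stands in for a crash between the two changes) are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE staff (staff_no TEXT PRIMARY KEY);
    CREATE TABLE property (
        property_no TEXT PRIMARY KEY,
        staff_no TEXT NOT NULL REFERENCES staff(staff_no));
    INSERT INTO staff VALUES ('S1'), ('S2');
    INSERT INTO property VALUES ('P1', 'S1'), ('P2', 'S1');
""")

def remove_staff(conn, leaving, successor, fail_midway=False):
    """Reassign the properties, then delete the staff row, as one transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute("UPDATE property SET staff_no = ? WHERE staff_no = ?",
                         (successor, leaving))
            if fail_midway:
                raise RuntimeError("simulated crash between the two changes")
            conn.execute("DELETE FROM staff WHERE staff_no = ?", (leaving,))
        return True
    except RuntimeError:
        return False

# The simulated failure happens after the reassignment but before the delete.
failed = remove_staff(conn, "S1", "S2", fail_midway=True)
still_with_s1 = conn.execute(
    "SELECT COUNT(*) FROM property WHERE staff_no = 'S1'").fetchone()[0]
print(failed, still_with_s1)  # the reassignment was undone along with the failure
```

Because the partial reassignment is rolled back, calling remove_staff again without the simulated failure starts from the original, consistent state.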
(4) Concurrency control services
A DBMS must furnish a mechanism to ensure that the database is updated correctly
when multiple users are updating the database concurrently.
One major objective in using a DBMS is to enable many users to access shared data concurrently.
Concurrent access is relatively easy if all users are only reading data, as there is
no way that they can interfere with one another. However, when two or more users are
accessing the database simultaneously and at least one of them is updating data, there may
be interference that can result in inconsistencies. For example, consider two transactions
T1 and T2, which are executing concurrently as illustrated in Figure 2.7.
T1 is withdrawing £10 from an account (with balance balx) and T2 is depositing £100 into
the same account. If these transactions were executed serially, one after the other with
no interleaving of operations, the final balance would be £190 regardless of which was
performed first. However, in this example transactions T1 and T2 start at nearly the same
time and both read the balance as £100. T2 then increases balx by £100 to £200 and stores
the update in the database. Meanwhile, transaction T1 decrements its copy of balx by £10
to £90 and stores this value in the database, overwriting the previous update and thereby
‘losing’ £100.
The DBMS must ensure that, when multiple users are accessing the database, interference
cannot occur. We discuss this issue fully in Section 20.2.
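The lost update described above can be replayed deterministically. The sketch below is a plain Python simulation, not a DBMS: the run function and the step names "read", "update", and "write" are assumptions made for illustration, while the amounts and the interleaving follow the example in the text.

```python
def run(schedule, balance=100):
    """Execute (transaction, action) steps against a shared balance balx."""
    db = {"balx": balance}
    local = {}                          # each transaction's private copy of balx
    deltas = {"T1": -10, "T2": +100}    # T1 withdraws 10, T2 deposits 100
    for txn, action in schedule:
        if action == "read":
            local[txn] = db["balx"]
        elif action == "update":
            local[txn] += deltas[txn]
        elif action == "write":
            db["balx"] = local[txn]
    return db["balx"]

# Serial execution: T1 runs to completion before T2 starts.
serial = [("T1", "read"), ("T1", "update"), ("T1", "write"),
          ("T2", "read"), ("T2", "update"), ("T2", "write")]

# Interleaved execution: both read 100 before either writes.
interleaved = [("T1", "read"), ("T2", "read"),
               ("T2", "update"), ("T2", "write"),
               ("T1", "update"), ("T1", "write")]

print(run(serial))       # 190
print(run(interleaved))  # 90: T1's write overwrites T2's deposit
```

A concurrency control mechanism has to rule out schedules like the second one, for example by making T1 wait until T2 has finished with balx.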
(5) Recovery services
A DBMS must furnish a mechanism for recovering the database in the event that the
database is damaged in any way.
When discussing transaction support, we mentioned that if a transaction fails then the
database has to be returned to a consistent state. The failure may be the result of a system
crash, a media failure, or a hardware or software error causing the DBMS to stop, or it may be
the result of the user detecting an error during the transaction and aborting it
before it completes. In all these cases, the DBMS must provide a mechanism to recover
the database to a consistent state. We discuss database recovery in Section 20.3.
(6) Authorization services
A DBMS must furnish a mechanism to ensure that only authorized users can access the
database.
It is not difficult to envisage instances where we would want to prevent some of the data
stored in the database from being seen by all users. For example, we may want only branch
managers to see salary-related information for staff and prevent all other users from seeing
this data. Additionally, we may want to protect the database from unauthorized access. The
term security refers to the protection of the database against unauthorized access, either
intentional or accidental. We expect the DBMS to provide mechanisms to ensure the data
is secure. We discuss security in Chapter 19.
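One simple way to picture an authorization check is a lookup of who may do what to which data item. The sketch below is only that: the GRANTS table, the role names, and the is_authorized function are hypothetical, and a real DBMS would hold this information in its system catalog and enforce it internally.

```python
# Hypothetical grant table: (role, data item) -> set of permitted access types.
GRANTS = {
    ("branch_manager", "staff.salary"): {"read", "update"},
    ("branch_manager", "staff.name"): {"read"},
    ("assistant", "staff.name"): {"read"},
}

def is_authorized(role, item, access):
    """Return True only if the role holds the requested access on the item."""
    return access in GRANTS.get((role, item), set())

print(is_authorized("branch_manager", "staff.salary", "read"))  # True
print(is_authorized("assistant", "staff.salary", "read"))       # False
```

Anything not explicitly granted is refused, which matches the aim of letting only branch managers see salary-related information.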
(7) Support for data communication
A DBMS must be capable of integrating with communication software.
Most users access the database from workstations. Sometimes these workstations are connected
directly to the computer hosting the DBMS. In other cases, the workstations are at
remote locations and communicate with the computer hosting the DBMS over a network.
In either case, the DBMS receives requests as communications messages and responds in
a similar way. All such transmissions are handled by a Data Communication Manager
(DCM). Although the DCM is not part of the DBMS, it is necessary for the DBMS to be
capable of being integrated with a variety of DCMs if the system is to be commercially
viable. Even DBMSs for personal computers should be capable of being run on a local area
network so that one centralized database can be established for users to share, rather than
having a series of disparate databases, one for each user. This does not imply that the
database has to be distributed across the network; rather, it means that users should be
able to access a centralized database from remote locations. We refer to this type of
topology as distributed processing (see Section 22.1.1).
(8) Integrity services
A DBMS must furnish a means to ensure that both the data in the database and changes
to the data follow certain rules.
Database integrity refers to the correctness and consistency of stored data: it can be
considered as another type of database protection. While integrity is related to security, it
has wider implications: integrity is concerned with the quality of data itself. Integrity is
usually expressed in terms of constraints, which are consistency rules that the database
is not permitted to violate. For example, we may want to specify a constraint that no
member of staff can manage more than 100 properties at any one time. Here, we would
want the DBMS to check when we assign a property to a member of staff that this limit
would not be exceeded and to prevent the assignment from occurring if the limit has been
reached.
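The workload constraint above could be enforced inside the database in several ways. The sketch below assumes SQLite via Python's sqlite3 module and uses a trigger; the table, the trigger name, and the assign helper are hypothetical, and the trigger covers inserts only (reassigning an existing property would need a similar trigger on update).

```python
import sqlite3

MAX_PROPERTIES = 100  # the limit used in the example above

conn = sqlite3.connect(":memory:")
conn.executescript(f"""
    CREATE TABLE property (
        property_no INTEGER PRIMARY KEY,
        staff_no TEXT NOT NULL);

    -- Reject an insert if the staff member is already at the limit.
    CREATE TRIGGER limit_workload
    BEFORE INSERT ON property
    WHEN (SELECT COUNT(*) FROM property
          WHERE staff_no = NEW.staff_no) >= {MAX_PROPERTIES}
    BEGIN
        SELECT RAISE(ABORT, 'staff member already manages the maximum number of properties');
    END;
""")

def assign(conn, property_no, staff_no):
    """Try to assign a property; return False if the constraint rejects it."""
    try:
        with conn:
            conn.execute("INSERT INTO property VALUES (?, ?)",
                         (property_no, staff_no))
        return True
    except sqlite3.IntegrityError:
        return False

# Attempt 101 assignments to the same member of staff.
results = [assign(conn, n, "S1") for n in range(1, 102)]
print(results.count(True), results[-1])  # 100 False
```

The check lives in the database rather than in each application, so every program that assigns properties is subject to the same rule.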
In addition to these eight services, we could also reasonably expect the following two services
to be provided by a DBMS.
(9) Services to promote data independence
A DBMS must include facilities to support the independence of programs from the
actual structure of the database.
We discussed the concept of data independence in Section 2.1.5. Data independence is
normally achieved through a view or subschema mechanism. Physical data independence
is easier to achieve: there are usually several types of change that can be made to the physical
characteristics of the database without affecting the views. However, complete logical
data independence is more difficult to achieve. The addition of a new entity, attribute, or
relationship can usually be accommodated, but not their removal. In some systems, any
type of change to an existing component in the logical structure is prohibited.
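A view mechanism can be sketched as follows, assuming SQLite via Python's sqlite3 module. The staff table, the staff_directory view, and the sample row are hypothetical. The application reads only through the view, so adding an attribute (a logical change) or an index (a physical change) leaves its results untouched.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE staff (staff_no TEXT PRIMARY KEY, name TEXT, salary REAL);
    INSERT INTO staff VALUES ('S1', 'Ann', 30000);
    -- External view for an application that needs names but not salaries.
    CREATE VIEW staff_directory AS SELECT staff_no, name FROM staff;
""")

def directory(conn):
    """The application's only query; it never touches the base table."""
    return conn.execute(
        "SELECT staff_no, name FROM staff_directory").fetchall()

before = directory(conn)

# Logical change: add a new attribute to the base table.
conn.execute("ALTER TABLE staff ADD COLUMN branch_no TEXT")
# Physical change: add an index on an existing column.
conn.execute("CREATE INDEX staff_name_idx ON staff(name)")

after = directory(conn)
print(before == after)  # True
```

Removing the name column from staff, by contrast, would invalidate the view, which illustrates why removal is harder to accommodate than addition.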
(10) Utility services
A DBMS should provide a set of utility services.
Utility programs help the DBA to administer the database effectively. Some utilities work
at the external level, and consequently can be produced by the DBA. Other utilities work
at the internal level and can be provided only by the DBMS vendor. Examples of utilities
of the latter kind are:
• import facilities, to load the database from flat files, and export facilities, to unload the
  database to flat files;
• monitoring facilities, to monitor database usage and operation;
• statistical analysis programs, to examine performance or usage statistics;
• index reorganization facilities, to reorganize indexes and their overflows;
• garbage collection and reallocation, to remove deleted records physically from the
  storage devices, to consolidate the space released, and to reallocate it where it is needed.