Observation on Coordination of Heterogeneous Databases

Time: 2022-08-28 12:49:08

Abstract: Coordinating heterogeneous databases so that information can be shared and kept consistent is an important research subject in the database field and an emerging hot spot in database system research. As a new direction in computer software, it applies new techniques to integrate existing systems and develop new applications. The paper analyzes CSCW, XML, JDBC and heterogeneous databases, and proposes a solution for coordinating heterogeneous databases. The solution draws on the ideas of database metadata, middleware and the LDAP directory service.

Key words: heterogeneous database, CSCW, Agent, XML, middleware

1 Introduction

As technology develops, databases face new challenges. In a dynamic environment there are many databases, new databases join and existing ones are removed, and the data they hold keeps changing. Most existing applications are built on separate databases, so it is difficult for them to coordinate, which hinders the sharing of database resources, especially heterogeneous ones. There is therefore an urgent need for mutual access and data sharing between heterogeneous databases, and for transparent access to data at minimum cost.

The heterogeneity of databases is mainly reflected in the following aspects.

(1) Heterogeneity of computer architecture. Databases run on supercomputers, mid-range and small computers, workstations, desktops, and embedded or handheld devices.

(2) Heterogeneity of operating system. The operating systems on which database systems run include Unix, Win2000/NT, Win XP and Linux.

(3) Heterogeneity of database structure and semantics. Different database application systems use different data structures and semantic expressions.

(4) Heterogeneity of DBMS. The DBMS may be one of several relational products, such as Oracle, SQL Server and Sybase, or the databases may use different data models, such as the relational, hierarchical, network and object-oriented models.

(5) Heterogeneity of system control mode, which may be centralized or distributed.

(6) Heterogeneity of network. Networks of different types and topologies are interconnected, including LAN, WAN, Ethernet bus structure and token ring architecture.

(7) Heterogeneity of physical database model. The conceptual schema is the same, but the data structure is different. For example, ORACLE and INFORMIX are both relational databases, but their structures differ.

As large numbers of databases are set up on LANs, applications with a C/S architecture have become the mainstream, and the Internet and the WWW are becoming a practical way to obtain information. The problem of coordinating heterogeneous databases has therefore become more prominent. More and more applications need a heterogeneous data integration system that provides interoperability and data sharing, that is, access to a variety of heterogeneous data sources.

2 Framework of CSCW Systems

A collaborative design system is a combination of designers and intelligent agents that work cooperatively toward the same objective. Once a system is determined, these participants have different functions, and they influence and restrict each other.

Lander summarizes three mainstream architectures of collaborative design.

(1) Network structure. In this structure, each participant has its own function modules, including a communication interface, a local problem solver, models of the other agents and a model of the current context. The composition of the collaborative design system can be changed dynamically, so the structure is open. Besides domain knowledge, each participant needs complete communication and control knowledge, which makes information and knowledge highly redundant. Therefore, the network structure is suitable for design environments with a small number of participants, a high degree of openness and loosely coupled subtasks.

(2) Federal structure. In this structure, connections and information transmission between participants go through a special participant called the coordinated controller. It is the nerve center of the participants, and it takes charge of information conversion between collaborative design groups and participants, as well as the planning, decomposition and management of tasks. When a participant needs a service, it only sends a request to the coordinated controller and does not interact directly with other participants. Therefore, the federal structure can flexibly implement different communication protocols, and it is applicable to systems whose subtasks exchange information frequently.

(3) Agent-oriented blackboard model. It is similar to the federal structure in that the interaction of participants is managed in groups. The difference is that the agent-oriented blackboard structure divides the cooperative management of the system into several components. Each local agent group has a shared data store, called the blackboard, that holds design data and design process information. Physical communication between agents is handled by a network manager, which reduces the burden on the coordinated controller.

2.1 Application of federal structure to heterogeneous databases

The federal structure of CSCW can be used to coordinate heterogeneous databases. When an application needs to access a database, it first sends a request to the coordinated controller and does not interact directly with the database. The coordinated controller takes charge of connections and information transmission between databases. It is a special agent and the nerve center of the databases: it handles information transmission between cooperating databases, and the planning, decomposition and management of tasks.

In a computer-supported (CS) environment, a group works cooperatively to complete a common task, the coordination of heterogeneous databases. This is a complicated process that can be modeled on the way people work together. A feasible method is to use a task that operates on a single database as the basic unit of execution and control. Several such tasks collaborate through the federal structure to resolve conflicts and so work together. A task can be divided into several subtasks, following the layered structure of the application task, and each task has its own task context.
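The routing role of the coordinated controller can be sketched in Java as follows. This is only an illustration under stated assumptions: the names DatabaseAgent, CoordinatedController, register, unregister and submit are hypothetical and do not come from the paper, and each agent is reduced to a function from a request string to a result string.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical interface: one agent wraps one underlying database. */
interface DatabaseAgent {
    String execute(String request);
}

/** Hypothetical coordinated controller: the only component applications talk to. */
final class CoordinatedController {
    private final Map<String, DatabaseAgent> agents = new LinkedHashMap<>();

    /** Databases may join at run time. */
    void register(String name, DatabaseAgent agent) {
        agents.put(name, agent);
    }

    /** Databases may also be removed at run time. */
    void unregister(String name) {
        agents.remove(name);
    }

    /** Routes a request to the named database; the caller never touches the agent. */
    String submit(String database, String request) {
        DatabaseAgent agent = agents.get(database);
        if (agent == null) {
            throw new IllegalArgumentException("Unknown database: " + database);
        }
        return agent.execute(request);
    }
}
```

An application would call submit with the name of a registered database and never hold a reference to the database itself, which mirrors the indirection described above.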

2.2 Applying Agent-based three-layer cooperation model to realize coordinated controller

The paper applies an Agent-based three-layer cooperation model [8] to realize the coordinated controller, which matches the way people work collaboratively. It is a control mechanism for model operation; it can describe the cooperation mechanisms of various kinds of collaborative work and supports various ways of collaborating.

The paper regards each autonomous entity of the coordinated controller in the collaborative process of CSCW as an Agent, and regards the collaborative process itself as a sequence of activities of several Agents. The collaborative process of a CSCW application system is described through the actions, behavior and relationships of the Agents, which matches the way people work collaboratively in the real world.

The Agent-based cooperation model is used to realize the coordinated controller. Its core is to turn the management and monitoring of tasks into components, so that component-based tasks can cooperate across heterogeneous database systems. It provides the functions of task assignment, task validation and task query.
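The three functions named above can be pictured with a very small Java sketch. It is not the model of [8]; the class TaskManager, its State values and its method names are assumptions made only to show what assignment, validation and query might look like as one component.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical task component offering assignment, validation and query. */
final class TaskManager {
    enum State { ASSIGNED, VALIDATED, REJECTED }

    private final Map<String, String> assignee = new HashMap<>();
    private final Map<String, State> state = new HashMap<>();

    /** Task assignment: binds a task to the Agent responsible for one database. */
    void assign(String taskId, String agentName) {
        assignee.put(taskId, agentName);
        state.put(taskId, State.ASSIGNED);
    }

    /** Task validation: records whether the Agent's result was accepted. */
    void validate(String taskId, boolean accepted) {
        if (!state.containsKey(taskId)) {
            throw new IllegalArgumentException("Unknown task: " + taskId);
        }
        state.put(taskId, accepted ? State.VALIDATED : State.REJECTED);
    }

    /** Task query: reports the current state, or null if the task is unknown. */
    State query(String taskId) {
        return state.get(taskId);
    }

    /** Returns the Agent a task was assigned to, or null if the task is unknown. */
    String assigneeOf(String taskId) {
        return assignee.get(taskId);
    }
}
```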

The cooperation model uses three layers of cooperative Agents to construct a CSCW application system dynamically, and it has the following advantages.

(1) The Agents complete the task collaboratively. The intelligence and autonomy of each Agent, together with the coordination strategy between Agents, hide the complexity of the cooperating tasks.

(2) When different users need to cooperate, an Agent can take on the role of coordination or communication, which avoids the delays of real-world cooperation and improves its efficiency.

(3) When users work collaboratively in the same environment, an Agent that has learned the interests and habits of a user can perform tasks on that user's behalf. Each Agent can also manage and monitor the events and processes of cooperative work, which reduces the burden on people.

(4) Using three layers of Agents solves the hierarchical coordination and management of tasks and their dynamic collaboration in a CSCW system. The three layers cooperate with one another. When problems arise in cooperative work, each Agent can respond promptly according to its knowledge base and coordination strategy.

3 Mapping between XML Documents and Relational Databases

Data conversion between heterogeneous databases can use a unified conversion format. XML can easily express many types of data, so it can serve as the intermediate format for data conversion between heterogeneous databases. This both provides a uniform data interface and simplifies the conversion itself.

The essence of two-way conversion between XML documents and an RDBMS [11] is the mapping between the XML document and the relational database. The mapping approaches are divided into template-driven and model-driven.

3.1 Template-driven mapping

The template-based conversion method does not predefine a mapping between the XML document and other data. Instead, it embeds executable instructions in the XML document. The system recognizes and executes these instructions during conversion and puts the execution results in place of the instructions, which generates the target XML document.

The paper takes database data as an example. To retrieve flight information from a database and represent it as an XML document, we can define the following template.

<Intro>The following flights have available seats:</Intro>

<SelectStmt>SELECT AirLine, FltNumber, Depart, Arrive FROM Flights</SelectStmt>

We hope one of these meets your needs

When the XML document is to be created, the system scans the template. When it encounters the <SelectStmt> element, it recognizes it as an executable instruction and calls the instruction executor to run it. The result after executing the instruction is as follows.

<Intro>The following flights have available seats:</Intro>

<AirLine>ACME</AirLine>

<FltNumber>123</FltNumber>

<Depart>Dec 12, 1998, 13:43</Depart>

<Arrive>Dec 13, 1998, 01:21</Arrive>

...

We hope one of these meets your needs

The advantage of the template-based method is that its steps are simple and its application is flexible: given a template, an XML document can be generated quickly. In addition, it supports programming constructs such as loops and conditionals, and it can receive parameters through HTTP. The disadvantage is that it works in one direction only, converting other types of data into XML documents. The key to the method is producing a large number of suitable templates, so the system needs to provide users with tools for generating templates and with instruction executors. For database data, the database management system can serve as the instruction executor.
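A minimal Java sketch of this scan-and-replace step is given below. It assumes the instruction element is named SelectStmt, as in the template above. The class TemplateProcessor, the method expand, the runQuery parameter and the Row wrapper element are hypothetical names introduced for illustration. The query executor is passed in as a function so that the sketch runs without a database; in a real system it would be backed by the DBMS.

```java
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class TemplateProcessor {
    // Matches one embedded instruction: <SelectStmt> ... </SelectStmt>
    private static final Pattern SELECT =
            Pattern.compile("<SelectStmt>(.*?)</SelectStmt>", Pattern.DOTALL);

    /**
     * Replaces every SelectStmt instruction with the rows it produces.
     * Each row becomes a (hypothetical) Row element with one child per column.
     */
    static String expand(String template,
                         Function<String, List<Map<String, String>>> runQuery) {
        Matcher m = SELECT.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            StringBuilder rows = new StringBuilder();
            for (Map<String, String> row : runQuery.apply(m.group(1).trim())) {
                rows.append("<Row>");
                for (Map.Entry<String, String> col : row.entrySet()) {
                    rows.append('<').append(col.getKey()).append('>')
                        .append(escape(col.getValue()))
                        .append("</").append(col.getKey()).append('>');
                }
                rows.append("</Row>");
            }
            // quoteReplacement keeps '$' and '\' in the data from being interpreted
            m.appendReplacement(out, Matcher.quoteReplacement(rows.toString()));
        }
        m.appendTail(out);
        return out.toString();
    }

    /** Escapes the characters that would break XML element content. */
    private static String escape(String text) {
        return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }
}
```

Text outside the instruction, such as the introductory and closing sentences of the template, passes through unchanged, which is what makes the approach one-directional: nothing in the output records how to map it back to the database.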

3.2 Model-driven mapping approach

The model-based conversion method uses a predefined data model to map between the hierarchical structure of an XML document and the data structure of another format. The common data models are the table and the object, which give rise to table-based mapping and object-based mapping.

Table-based mapping treats an XML document with a specific structure as a table that corresponds directly to a table in the database. For a database, the simplest model defines the document structure as follows.

<column1>...</column1>

...

……

……

……

When database data is converted into an XML document, the data of a table or of a query result is simply inserted into the XML document. Conversely, when an XML document is converted into database data, its content is simply inserted into the table.

The most prominent advantage of table-based mapping is simplicity, which makes it suitable for data conversion between two databases. Its limitation is that it applies only to small-scale XML documents, and it does not retain the physical structure of the document (such as character and entity references, CDATA sections or character encoding), document information (such as the document type or DTD), comments or processing instructions. If an XML document does not conform to the above format, the table-based model can't be used.
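Both directions of table-based mapping can be sketched in Java with the DOM parser that ships with the JDK. The sketch assumes a document shape in which the root element is named after the table, each record is a row element, and each column is a child element of the row. The element name row and the names TableMapping, toXml and fromXml are assumptions for illustration, not part of the paper.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class TableMapping {
    /** Database to XML: writes rows as <table><row><col>value</col>...</row>...</table>. */
    static String toXml(String table, List<Map<String, String>> rows) {
        StringBuilder xml = new StringBuilder();
        xml.append('<').append(table).append('>');
        for (Map<String, String> row : rows) {
            xml.append("<row>");
            for (Map.Entry<String, String> col : row.entrySet()) {
                xml.append('<').append(col.getKey()).append('>')
                   .append(escape(col.getValue()))
                   .append("</").append(col.getKey()).append('>');
            }
            xml.append("</row>");
        }
        return xml.append("</").append(table).append('>').toString();
    }

    /** XML to database: reads the same shape back into rows ready for insertion. */
    static List<Map<String, String>> fromXml(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            List<Map<String, String>> rows = new ArrayList<>();
            NodeList rowNodes = doc.getDocumentElement().getElementsByTagName("row");
            for (int i = 0; i < rowNodes.getLength(); i++) {
                Map<String, String> row = new LinkedHashMap<>();
                NodeList cols = rowNodes.item(i).getChildNodes();
                for (int j = 0; j < cols.getLength(); j++) {
                    Node col = cols.item(j);
                    if (col.getNodeType() == Node.ELEMENT_NODE) {
                        row.put(col.getNodeName(), col.getTextContent());
                    }
                }
                rows.add(row);
            }
            return rows;
        } catch (Exception e) {
            throw new IllegalArgumentException("Document does not fit the table format", e);
        }
    }

    /** Escapes the characters that would break XML element content. */
    private static String escape(String text) {
        return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }
}
```

The round trip keeps only element names and text values, which illustrates the limitation noted above: comments, processing instructions and entity references in the source document are not preserved.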

Object-based mapping first maps the XML document to a tree of objects with a hierarchical structure (DOM), and then maps the objects into an object-oriented database or, through object-relational mapping, into a relational database. The hierarchical structure of the document is converted into a tree structure according to rules. The model is convenient for conversion between XML documents and object-oriented or hierarchical databases.

Conversion to and from a relational database can be realized with traditional object-relational mapping techniques. For example,

object table {

<col1>order</col1> col1 = "order"

<col2>customer</col2> col2 = "customer"

<col3>line</col3> col3 = "line" }
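The two steps of object-based mapping, document to object tree and object to relational rows, can be sketched in Java as follows. The sketch borrows the names order, customer and line from the example above but assumes, purely for illustration, an input document in which an order element contains one customer element and any number of line elements. The Order class and its methods are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Hypothetical object that an order element is mapped to. */
final class Order {
    final String customer;
    final List<String> lines;

    Order(String customer, List<String> lines) {
        this.customer = customer;
        this.lines = lines;
    }

    /** Step 1: XML document, then DOM tree, then object. */
    static Order fromXml(String xml) {
        try {
            Element root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
            String customer = root.getElementsByTagName("customer").item(0).getTextContent();
            List<String> lines = new ArrayList<>();
            NodeList lineNodes = root.getElementsByTagName("line");
            for (int i = 0; i < lineNodes.getLength(); i++) {
                lines.add(lineNodes.item(i).getTextContent());
            }
            return new Order(customer, lines);
        } catch (Exception e) {
            throw new IllegalArgumentException("Not a valid order document", e);
        }
    }

    /** Step 2: object to relational rows, one (customer, line) row per order line. */
    List<String[]> toRows() {
        List<String[]> rows = new ArrayList<>();
        for (String line : lines) {
            rows.add(new String[] {customer, line});
        }
        return rows;
    }
}
```

Flattening the repeated line elements into one row each is only one possible object-relational choice; a separate table for lines with a foreign key to the order would serve equally well.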

4 Accessing Database through JDBC

A Java program can connect directly to a database and execute SQL statements against it by using the JDBC API. The JDBC API is a set of abstract Java interfaces that applications use to open a database connection, execute SQL statements and process the results. The most important ones include

java.sql.DriverManager. It handles the loading of drivers and establishes new database connections.

java.sql.Connection. It represents a connection to a specific database.

java.sql.Statement. It represents a container for executing an SQL statement on a specific connection.

java.sql.ResultSet. It controls access to the rows returned by a specific statement.

java.sql.Statement has two derived interfaces, as follows.

java.sql.PreparedStatement. It executes a precompiled SQL statement.

java.sql.CallableStatement. It is used to call stored procedures in a database.
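How these interfaces work together can be sketched as follows. The class FlightQuery, its methods and the connection URL passed by the caller are hypothetical, and the Flights table with its AirLine and FltNumber columns is borrowed from the template example in Section 3.1. Actually running the query requires a JDBC driver for the target database on the classpath.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

final class FlightQuery {
    /** DriverManager picks a registered driver for the URL; empty if none accepts it. */
    static Optional<Connection> tryConnect(String url) {
        try {
            return Optional.of(DriverManager.getConnection(url));
        } catch (SQLException e) {
            return Optional.empty(); // e.g. no suitable driver on the classpath
        }
    }

    /** Runs a precompiled, parameterized query and collects one column. */
    static List<String> flightNumbers(Connection conn, String airline) throws SQLException {
        String sql = "SELECT FltNumber FROM Flights WHERE AirLine = ?";
        List<String> result = new ArrayList<>();
        // try-with-resources closes the statement and result set in reverse order
        try (PreparedStatement stmt = conn.prepareStatement(sql)) {
            stmt.setString(1, airline);
            try (ResultSet rs = stmt.executeQuery()) {
                while (rs.next()) {
                    result.add(rs.getString("FltNumber"));
                }
            }
        }
        return result;
    }
}
```

A PreparedStatement is used rather than a plain Statement so that the airline name is passed as a parameter instead of being concatenated into the SQL text.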

The lower-level JDBC Driver API is intended for database vendors who develop database drivers; application developers generally do not need it. A Java application operates on a database through the interfaces defined in the java.sql package. The interfaces themselves do not perform the actual operations; those are carried out by the database driver.

JDBC drivers can be divided into four types.

(1) JDBC-ODBC bridge

The JDBC-ODBC bridge lets JDBC access a database through ODBC. To use the bridge, an ODBC driver must be installed on the client computer, so the bridge driver is best suited to a company's internal network.

(2) Native-API partly Java driver

The driver converts JDBC calls into calls on the client API of Oracle, Sybase, SQL Server or another DBMS. Like the bridge driver, it requires native binary code on each client.

(3) JDBC-Net pure Java driver

The driver converts JDBC calls into a DBMS-independent network protocol, which a network server then converts into the DBMS protocol. The network server is a middleware that can connect pure Java clients to different types of databases. Vendors of such solutions can offer products for Internet/intranet applications. For these products to support database access over the Internet, they must meet the requirements of security and of access through firewalls.

(4) Native-protocol pure Java driver

The driver converts JDBC calls into the network protocol used directly by the DBMS, which allows the client to call the DBMS server directly. It is a practical solution for accessing databases distributed on the Internet/intranet. Because many of these network protocols are proprietary, the drivers are usually provided by the database vendors themselves.

Of the above drivers, the pure Java drivers are the most efficient, but the first and second types are easy to obtain and are commonly used.

JDBC supports two different models of database access. According to the number of tiers between the user and the database, they are called the two-layer model and the three-layer model.

In the two-layer model, a Java program (Applet or Application) connects directly to the database. The user's SQL statements are submitted to the database, and the results are returned to the user. The database can reside on another computer connected to the user's over a network. The model has a client/server structure: the user's computer is the client, and the computer that hosts the database is the server. The two can be connected by either a local area network or a wide area network.

In the three-layer model, the user connects to the database indirectly. A command is first sent to the middle tier, which then sends SQL statements to the database. The database executes the SQL statements and returns the results to the middle tier, which passes them on to the user. The advantage of the three-layer model is that the middle tier can control access rights and shared data, which makes management easier. The user can work with a friendly high-level API, which the middle tier converts into low-level calls; this keeps execution efficient and can bring performance advantages. The middle tier has generally been written in C or C++. With the spread of the Java language, more and more middle tiers are written in Java, which makes better use of its robustness, multi-threading, security and platform independence.
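The role of the middle tier can be sketched in Java as follows. The class MiddleTier, the methods grant and departureOf, and the user names are hypothetical. The database is represented by a function from an SQL string to a result string so that the sketch runs on its own; in practice that function would wrap JDBC calls. The Flights table and its Depart and FltNumber columns come from the template example in Section 3.1.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.function.Function;

final class MiddleTier {
    private final Set<String> authorizedUsers = new HashSet<>();
    private final Function<String, String> database; // stands in for the JDBC layer

    MiddleTier(Function<String, String> database) {
        this.database = database;
    }

    /** Access rights are managed in one place, not in every client. */
    void grant(String user) {
        authorizedUsers.add(user);
    }

    /** High-level API: the client names a flight and never writes SQL itself. */
    String departureOf(String user, int flightNumber) {
        if (!authorizedUsers.contains(user)) {
            throw new SecurityException("Access denied for " + user);
        }
        // flightNumber is an int, so concatenating it cannot inject SQL text
        String sql = "SELECT Depart FROM Flights WHERE FltNumber = " + flightNumber;
        return database.apply(sql);
    }
}
```

Because clients see only departureOf, the SQL dialect and the database behind the middle tier can change without any change on the client side.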

5 Conclusion

The paper introduces the integration problems of heterogeneous databases, including the classification of heterogeneity and existing methods for data access across heterogeneous databases. It introduces CSCW, uses the federal structure to coordinate heterogeneous databases, and uses an Agent-based three-layer cooperation model to realize the coordinated controller. It studies two types of mapping for data conversion between heterogeneous databases, template-driven and model-driven, with XML as the unified intermediate format. After discussing JDBC and heterogeneous databases, the paper proposes a solution for heterogeneous databases. The solution draws on the ideas of database metadata, middleware and the LDAP directory service, and applies the federal structure of the CSCW architecture. For requests entered in applications, the coordinated controller calls the database metadata and the directory service of the databases with the support of a communication processor. Collaborative query, transaction management, the integrity constraint module and low-level database access together shield the heterogeneity. As middleware techniques for heterogeneous databases develop, several questions still need further work: how to support interoperability between heterogeneous databases, how to model source data and user queries, how to model and handle data sources with restricted query capabilities, and how to generate query plans. Providing application services through middleware is an important subject of current research and applications. The objective is to shield the complexity of the underlying heterogeneous systems, which both provides a unified service interface to the upper layers and improves the utilization and sharing of storage resources.

References

[1] Chen Tianhuang, Zou Qingmei, Research on information sharing technique of heterogeneous databases based on XML, Journal of Wuhan University of Technology (Traffic Science and Engineering), January 2005, 29(1): 129-130.

[2] Zhang Shuiping, Wan Yinghui, Lu Xiao, Integration and interoperation of heterogeneous databases, Computer Application Research, 1999(1): 81-83.

[3] Tang Wei, Zhou Junlin, Li Xiao, Observation on integration methods of heterogeneous databases, Computer Application Research, 1999(8): 64-67.

[4] Robert Signore, John Creamer, Michael O Stegman, Hou Xueping, Interacted ODBC scheme of open database, Beijing: Publishing House of Electronics Industry, 1995.

[5] Zheng Qinghua, A modeling & implementation method of CSCW, Chinese J. Computers, 1998, 21(8): 270-275.

[6] Shi Meilin, Xiang Yong, Yang Guangxin, Theory and application of cooperative work supported by computer, Beijing: Publishing House of Electronics Industry, 2000.

[7] Du Laihong, Chen Hua, Fang Yadong, Research on CSCW system of enterprise [J], Journal of Shaanxi University of Technology, December 2004, 22(6): 75-78.

[8] Kreifelts, T., Pankoke-Babatz, V., Victor, F., A Model for the Coordination of Cooperative Activities [C], Proceedings of the International Workshop on CSCW, Berlin, 1991: 85-100.

[9] Tim Bray, W3C, Extensible Markup Language (XML) 1.0 (Second Edition) [EB/OL], 2000-10-06.

[10] Michael Morrison, Application of XML [M], Beijing: Press of Qinghua University, 2000.
