About data warehouses : A database consists of one or more files that need to be stored on a computer. In large organizations, databases are typically not stored on the individual computers of employees but in a central system. This central system typically consists of one or more computer servers. A server is a computer system that provides a service over a network. The server is often located in a room with controlled access, so only authorized personnel can get physical access to it.
In a typical setting, the database files reside on the server, but they can be accessed from many different computers in the organization. As the number and complexity of databases grow, we start referring to them together as a data warehouse.
More About Data Warehouse and Data Mining :
A data warehouse is a collection of databases that work together. A data warehouse makes it possible to integrate data from multiple databases, which can give new insights into the data. The ultimate goal of a database is not just to store data, but to help businesses make decisions based on that data. A data warehouse supports this goal by providing an architecture and tools to systematically organize and understand data from multiple databases.
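As a minimal sketch of what integrating data from multiple databases can look like, the example below uses Python's built-in `sqlite3` module. The two databases, their tables (`orders`, `customers`), and the sample rows are all hypothetical, and a real data warehouse would use far more elaborate tooling. The point is only that a question such as "revenue per region" cannot be answered by either database alone.

```python
import sqlite3

# One connection stands in for a hypothetical sales database ...
conn = sqlite3.connect(":memory:")
# ... and a second, separate in-memory database is attached as "crm".
conn.execute("ATTACH DATABASE ':memory:' AS crm")

# Hypothetical tables: orders live in one database, customers in the other.
conn.execute("CREATE TABLE main.orders (customer_id INTEGER, amount REAL)")
conn.execute("CREATE TABLE crm.customers (id INTEGER, region TEXT)")

conn.executemany(
    "INSERT INTO main.orders VALUES (?, ?)",
    [(1, 120.0), (2, 80.0), (1, 40.0)],
)
conn.executemany(
    "INSERT INTO crm.customers VALUES (?, ?)",
    [(1, "Europe"), (2, "Asia")],
)

# Combine both databases to answer a question neither can answer alone.
revenue_by_region = conn.execute(
    """
    SELECT c.region, SUM(o.amount)
    FROM main.orders AS o
    JOIN crm.customers AS c ON c.id = o.customer_id
    GROUP BY c.region
    ORDER BY c.region
    """
).fetchall()

print(revenue_by_region)
```

Here the join across the two databases is what yields the new insight: order amounts come from one source and regions from the other.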
Distributed DBMS : As databases get larger, it becomes increasingly difficult to keep the entire database in a single physical location. Not only does storage capacity become an issue, but there are also security and performance considerations. Consider a company with several offices around the world.
It is possible to create one large, single database at the main office and have all other offices connect to this database. However, every single time an employee needs to work with the database, this employee needs to create a connection over thousands of miles, through numerous network nodes. As long as you are moving relatively small amounts of data around, this does not present a major challenge.
But, what if the database is huge? It is not very efficient to move large amounts of data back and forth over the network. It may be more efficient to have a distributed database. This means that the database consists of multiple, interrelated databases stored at different computer network sites.
To a typical user, the distributed database appears as a centralized database. Behind the scenes, however, parts of that database are located in different places. The typical characteristics of a distributed database management system (distributed DBMS) are:
- Multiple computer network sites are connected by a communication system
- Data at any site are available to users at other sites
- Data at each site are under the control of the DBMS
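The characteristics above can be illustrated with a toy model. This is only a sketch under simplifying assumptions: the `Site` and `DistributedDatabase` classes, the site names, and the sample records are hypothetical, and plain in-memory dictionaries stand in for real servers connected over a network. It shows the key idea that the user calls one interface while the data actually lives at different sites.

```python
class Site:
    """One network site that stores part of the data (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.rows = {}  # key -> value stored at this site


class DistributedDatabase:
    """Presents several sites to the user as if they were one database."""

    def __init__(self, sites):
        self.sites = {site.name: site for site in sites}
        self.catalog = {}  # key -> name of the site that holds it

    def put(self, key, value, site_name):
        # The DBMS controls where each piece of data is stored.
        self.sites[site_name].rows[key] = value
        self.catalog[key] = site_name

    def get(self, key):
        # The user does not say where the data is; the catalog resolves it.
        site_name = self.catalog[key]
        return self.sites[site_name].rows[key]

    def location_of(self, key):
        return self.catalog[key]


db = DistributedDatabase([Site("london"), Site("tokyo")])
db.put("customer-1", "Alice", site_name="london")
db.put("customer-2", "Kenji", site_name="tokyo")

# Both lookups look identical, although the data sits at different sites.
print(db.get("customer-1"), db.location_of("customer-1"))
print(db.get("customer-2"), db.location_of("customer-2"))
```

A real distributed DBMS also has to handle network failures, replication, and consistency between sites, none of which this sketch attempts.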
You have probably used a distributed database without realizing it. For example, you may be using an e-mail account from one of the major service providers. Where exactly do your e-mails reside? Most likely, the company hosting the e-mail service uses several different locations without you knowing it.
The major advantage of distributed databases is that data access and processing can be much faster, because users work with data stored at a nearby site instead of moving it back and forth across a long-distance network. The major disadvantage is that the database is much more complex to manage. Setting up a distributed database is typically the task of a database administrator with very specialized database skills.
Once all the data is stored and organized in databases, what’s next? Many day-to-day operations are supported by databases. Queries based on SQL, a database programming language, are used to answer basic questions about data. But as the collection of data in a database grows, the amount of data can easily become overwhelming. How does an organization get the most out of its data without getting lost in the details? That’s where data mining comes in.
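To make the idea of a basic SQL question concrete, here is a small sketch using Python's built-in `sqlite3` module. The `orders` table and its rows are hypothetical sample data. The query answers a simple, well-defined question: how many units of each product were ordered?

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table of day-to-day order records.
conn.execute("CREATE TABLE orders (product TEXT, quantity INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("widget", 3), ("gadget", 1), ("widget", 2)],
)

# Basic question: how many units of each product were ordered?
units_per_product = conn.execute(
    "SELECT product, SUM(quantity) FROM orders GROUP BY product ORDER BY product"
).fetchall()

print(units_per_product)
```

Queries like this work well when you already know what to ask. With very large collections of data, the harder problem is finding patterns nobody thought to ask about.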