Database Diskspace Required for a Data Warehouse

Filed under: General — admin at 3:01 pm on Thursday, May 29, 2008

The ad hoc demands placed on the data warehouse will often require the use of temporary tables to store interim results. Space will need to be set aside for this purpose. That space needs to be efficiently managed. The consequence of not managing it efficiently is that either the space required will be many times larger than necessary, or everything will grind to a halt. lf the system and database management tool allows this space to be efficiently monitored, and allocated space to be dated and tracked, the space can be kept to a minimum, saving many tens of gigabytes of disk space.

As ever, it is the size and complexity of the data warehouse that is the issue here. Large numbers of error log and trace files will be produced. These need to be regularly and automatically scanned for errors or problems. This is one of the functions of the event manager. These files also need to be managed, with old log and trace files being purged to save space and make it easier to deal with the current logs.

Logs that contain significant data should be archived. Logs such as the database log, which contains all major changes, should be kept to maintain a history. This could prove to be important in a crisis situation, where the thread of events over the last number of weeks or months can become important. Any log files that contain errors that have not yet been explained should also be kept until the cause of the error is understood. The tool should allow such indicated log files to be archived, while others are deleted.

How Much Memory does a Database require?

Filed under: General — admin at 11:58 pm on Monday, May 19, 2008

Memory is a commodity that you can never have enough of. What you need to estimate is the minimum requirement. There are a number of things that affect the amount of memory required. First, there are the database requirements. The database will need memory to cache data blocks as they are used; it will also need memory to cache parsed SQI statements and so on. These requirements will vary from RDBMS to RDBMS, an, you will need to work them out for whatever software you are using.

You will also need memory for sort space. Each process that performs a sort will require an amount of sort area. This is usually parameterized, and can be specified by you. Remember that when you use parallel technology you may need to allow enough sort area for each sorting process. This can add up, particularly if you have multiple concurrent queries running.

Secondly, each user connected to the system will use an amount of memory how much will depend on how they are connected to the system and what software they are running. As it is likely that the users will be connected in a client-server mode, the user memory requirement may be quite small. Finally, the operating system will require an amount of memory. This will vary with the operating system and the features and tools you are running. You can get the hardware vendor to estimate how much memory the system will use.