The Data Files
Primary Data File - This is the default file that is created when you create your database. This is where all your database objects will reside unless you add additional data files. The primary data file usually has the file extension .mdf (SQL server doesn't force you to use the extension but its standard practice). There is one primary data file for every database you create.
Transaction log - The transaction log does just what it says, it logs transactions. All data modifications are written to the transaction log before being written to the primary (or secondary) data file. This is what is commonly referred to as "write-ahead caching". I'll go more into how data is written to disk a little further down. Remember, you must have at least one Transaction Log file for each database. The transaction log file usually has the file extension .ldf
Secondary Data Files - Secondary data files can be created to supplement the data storage space of the primary data file. These data files are optional and usually have the file extension .ndf. You can use secondary data files for various reasons. For example, you could use them to stripe you data across multiple disks to speed up data access or to physically separate different database schemas into separate files.
Writing Data to Disk
We talked about the transaction log and how it uses write-ahead caching, well here is the process for writing data to disk. First the data pages are loaded into memory (unless they are already there from some previous activity). Then a record is created in the log cache. Once the transaction has been committed it is written to the transaction log from the log cache. Transactions are then written to primary or secondary data file when a checkpoint occurs. The Checkpoint process occurs for a variety of reasons.
- If you issue the CHECKPOINT statement
- If you stop the SQL Server Process
- If you backup the Database or do a Transaction Log backup
- If you have the database in log-truncate mode and the log becomes 70% full
- If you perform a minimally logged operation
- Based on the "recovery interval" server configuration option.
You can group your database files together into filegroups to aid with administration or allocation. There are two types of file groups: Primary and User-defined. The Primary file group contains the primary data file (.mdf) and any other files not assigned to a user-defined file group. It also contains the system database and tables. User-defined filegroups are just what they sound like, a filegroup created by the user. Log files are not associated with a filegroup. You have to add files to a filegroup when they are created. You can't move files around after they have been added to the database. Filegroups are handy because they can be backed up or restored separately and even be taken offline and brought back online separately. An example of when you could use them is if you had some tables that were read from more often (like archive data) you could put those tables into a secondary data file that you put into a user-defined filegroup and then move that filegroup to a separate disk to increase performance.
Hopefully I have given you a good overview of Database Files and filegroups. There is plenty more to learn so don't be afraid to use google to dig a bit deeper into the subject. Next time I will be discussing Creating and Modifying Databases.