See what data is in what SQL Server data file?
Categories:
Unveiling Data Distribution: How to See What's in Your SQL Server Data Files

Discover methods to identify which data resides in specific SQL Server data files (.mdf, .ndf). This article covers using system views and T-SQL queries to gain insight into file-level data allocation, crucial for performance tuning and storage management.
Understanding how data is distributed across your SQL Server data files is a fundamental aspect of database administration. Whether you're troubleshooting performance bottlenecks, planning for storage growth, or optimizing I/O, knowing which tables and indexes occupy specific files or filegroups can provide invaluable insights. This article will guide you through various T-SQL techniques to inspect the contents of your SQL Server data files, focusing on identifying the objects (tables, indexes) that consume space within them.
Understanding SQL Server File Architecture
Before diving into queries, it's important to grasp the basics of SQL Server file architecture. A SQL Server database consists of one or more data files and at least one transaction log file. Data files typically have .mdf
(primary data file) or .ndf
(secondary data file) extensions. These files are organized into filegroups, which are logical containers for data files. Objects like tables and indexes can be assigned to specific filegroups, allowing for better management and performance optimization, especially across different storage tiers.
flowchart TD DB[SQL Server Database] --> FG1(Filegroup 1) DB --> FG2(Filegroup 2) FG1 --> MDF1[Primary Data File (.mdf)] FG1 --> NDF1[Secondary Data File 1 (.ndf)] FG2 --> NDF2[Secondary Data File 2 (.ndf)] MDF1 --> T1(Table A) MDF1 --> I1(Index A) NDF1 --> T2(Table B) NDF2 --> I2(Index B) DB --> LDF[Transaction Log File (.ldf)]
Simplified SQL Server Database File Architecture
Identifying Objects by Filegroup and File ID
SQL Server provides system views that allow you to query metadata about your database files and the objects within them. The sys.allocation_units
view is particularly useful as it links allocation units (which store data for tables and indexes) to filegroups and ultimately to the physical files. By joining this with sys.partitions
, sys.objects
, and sys.filegroups
, you can get a comprehensive view of data distribution.
SELECT
DB_NAME() AS DatabaseName,
f.name AS FilegroupName,
df.name AS DataFileName,
df.physical_name AS PhysicalFilePath,
o.name AS ObjectName,
p.index_id AS IndexID,
i.name AS IndexName,
au.type_desc AS AllocationUnitType,
SUM(au.data_pages) * 8 / 1024 AS UsedMB
FROM
sys.allocation_units AS au
INNER JOIN
sys.partitions AS p ON au.container_id = p.partition_id
INNER JOIN
sys.objects AS o ON p.object_id = o.object_id
INNER JOIN
sys.indexes AS i ON p.object_id = i.object_id AND p.index_id = i.index_id
INNER JOIN
sys.filegroups AS f ON au.data_space_id = f.data_space_id
INNER JOIN
sys.database_files AS df ON f.data_space_id = df.data_space_id
WHERE
o.is_ms_shipped = 0 -- Exclude system objects
AND au.type IN (1, 3) -- IN_ROW_DATA, LOB_DATA
GROUP BY
f.name, df.name, df.physical_name, o.name, p.index_id, i.name, au.type_desc
ORDER BY
FilegroupName, DataFileName, UsedMB DESC;
T-SQL query to list objects and their space usage per data file.
UsedMB
calculation SUM(au.data_pages) * 8 / 1024
converts the number of 8KB data pages into megabytes. This gives you an approximate size of the data allocated to the object within that file.Drilling Down to Specific Files
If you want to investigate a specific data file, you can filter the previous query by the file's name or file_id
. This is particularly useful when you suspect a particular file is growing unexpectedly or is experiencing high I/O. You can find the file_id
from sys.database_files
.
DECLARE @TargetFileName NVARCHAR(128) = 'YourDataFile_01'; -- Replace with your file name
SELECT
DB_NAME() AS DatabaseName,
f.name AS FilegroupName,
df.name AS DataFileName,
df.physical_name AS PhysicalFilePath,
o.name AS ObjectName,
p.index_id AS IndexID,
i.name AS IndexName,
au.type_desc AS AllocationUnitType,
SUM(au.data_pages) * 8 / 1024 AS UsedMB
FROM
sys.allocation_units AS au
INNER JOIN
sys.partitions AS p ON au.container_id = p.partition_id
INNER JOIN
sys.objects AS o ON p.object_id = o.object_id
INNER JOIN
sys.indexes AS i ON p.object_id = i.object_id AND p.index_id = i.index_id
INNER JOIN
sys.filegroups AS f ON au.data_space_id = f.data_space_id
INNER JOIN
sys.database_files AS df ON f.data_space_id = df.data_space_id
WHERE
o.is_ms_shipped = 0
AND au.type IN (1, 3)
AND df.name = @TargetFileName -- Filter by specific file name
GROUP BY
f.name, df.name, df.physical_name, o.name, p.index_id, i.name, au.type_desc
ORDER BY
UsedMB DESC;
T-SQL query to find objects within a specific data file.
UsedMB
calculation provides the space allocated to data pages. It does not account for free space within pages or unallocated space within extents. For a more precise measure of actual data size, consider using sp_spaceused
or sys.dm_db_partition_stats
.Using sys.dm_db_partition_stats
for Detailed Space Usage
For a more granular view of space usage per partition (which corresponds to a table or index), sys.dm_db_partition_stats
is an excellent dynamic management view. While it doesn't directly link to physical files, it provides detailed information about allocated and used space for each partition, which can then be correlated with the filegroup information from sys.partitions
.
SELECT
DB_NAME() AS DatabaseName,
o.name AS ObjectName,
i.name AS IndexName,
p.partition_number AS PartitionNumber,
fg.name AS FilegroupName,
SUM(ps.in_row_data_page_alloc_count + ps.lob_data_page_alloc_count + ps.row_overflow_data_page_alloc_count) * 8 / 1024 AS AllocatedMB,
SUM(ps.used_page_count) * 8 / 1024 AS UsedMB
FROM
sys.dm_db_partition_stats AS ps
INNER JOIN
sys.partitions AS p ON ps.partition_id = p.partition_id
INNER JOIN
sys.objects AS o ON p.object_id = o.object_id
INNER JOIN
sys.indexes AS i ON p.object_id = i.object_id AND p.index_id = i.index_id
INNER JOIN
sys.filegroups AS fg ON i.data_space_id = fg.data_space_id
WHERE
o.is_ms_shipped = 0
GROUP BY
o.name, i.name, p.partition_number, fg.name
ORDER BY
AllocatedMB DESC;
T-SQL query using sys.dm_db_partition_stats
to show space usage per object and filegroup.