How to find the largest objects in a SQL Server database?



How would I go about finding the largest objects in a SQL Server database? First, by determining which tables (and related indices) are the largest and then determining which rows in a particular table are largest (we're storing binary data in BLOBs)?

Are there any tools out there for helping with this kind of database analysis? Or are there some simple queries I could run against the system tables?


All Answers
  • I've been using this SQL script (which I got from someone, somewhere - I can't recall the original source) for ages, and it has helped me quite a bit in understanding and determining the size of indexes and tables:

    SELECT 
        t.name AS TableName,
        i.name as indexName,
        sum(p.rows) as RowCounts,
        sum(a.total_pages) as TotalPages, 
        sum(a.used_pages) as UsedPages, 
        sum(a.data_pages) as DataPages,
        (sum(a.total_pages) * 8) / 1024 as TotalSpaceMB, 
        (sum(a.used_pages) * 8) / 1024 as UsedSpaceMB, 
        (sum(a.data_pages) * 8) / 1024 as DataSpaceMB
    FROM 
        sys.tables t
    INNER JOIN      
        sys.indexes i ON t.object_id = i.object_id
    INNER JOIN 
        sys.partitions p ON i.object_id = p.object_id AND i.index_id = p.index_id
    INNER JOIN 
        sys.allocation_units a ON p.partition_id = a.container_id
    WHERE 
        t.name NOT LIKE 'dt%' AND
        i.object_id > 255 AND  
        i.index_id <= 1
    GROUP BY 
        t.name, i.object_id, i.index_id, i.name 
    ORDER BY 
        object_name(i.object_id) 
    

    Of course, you can use a different ordering criterion, e.g.

    ORDER BY SUM(p.rows) DESC
    

    to get the tables with the most rows, or

    ORDER BY SUM(a.total_pages) DESC
    

    to get the tables with the most pages (8K blocks) used.
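
    The question also asks about finding the largest individual rows when binary data is stored in BLOB columns, which the catalog views above don't answer directly. A minimal sketch using DATALENGTH, which returns the number of bytes in an expression including LOB data (the names MyTable, Id, and Payload are hypothetical placeholders; substitute your own):

    SELECT TOP 20
        Id,
        DATALENGTH(Payload) AS PayloadBytes  -- bytes stored, including LOB data
    FROM MyTable
    ORDER BY DATALENGTH(Payload) DESC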


  • In SQL Server 2008, you can also just run the standard report Disk Usage by Top Tables. It can be found by right-clicking the database, selecting Reports -> Standard Reports, and picking the report you want.
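
    If you only want the totals rather than the per-table report, the stock sp_spaceused procedure gives a quick summary: run with no arguments it reports the size of the current database, or pass a table name for a single table ('dbo.MyTable' below is a hypothetical placeholder):

    EXEC sp_spaceused

    EXEC sp_spaceused 'dbo.MyTable'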


  • This query helps find the largest table in the database you are connected to:

    SELECT  TOP 1 OBJECT_NAME(OBJECT_ID) TableName, st.row_count
    FROM sys.dm_db_partition_stats st
    WHERE index_id < 2
    ORDER BY st.row_count DESC
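
    The same DMV also exposes page counts, so you can rank by reserved space rather than row count - a sketch (each page is 8 KB):

    SELECT TOP 10
        OBJECT_NAME(object_id) AS TableName,
        SUM(reserved_page_count) * 8 AS ReservedKB
    FROM sys.dm_db_partition_stats
    GROUP BY object_id
    ORDER BY ReservedKB DESC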
    

  • You may also use the following code:

    USE AdventureWorks
    GO
    CREATE TABLE #GetLargest 
    (
      table_name    sysname ,
      row_count     INT,
      reserved_size VARCHAR(50),
      data_size     VARCHAR(50),
      index_size    VARCHAR(50),
      unused_size   VARCHAR(50)
    )
    
    SET NOCOUNT ON
    
    INSERT #GetLargest
    
    EXEC sp_msforeachtable 'sp_spaceused ''?'''
    
    SELECT
      a.table_name,
      a.row_count,
      COUNT(*) AS col_count,
      a.data_size
    FROM #GetLargest a
    INNER JOIN information_schema.columns b
      ON a.table_name COLLATE database_default
       = b.table_name COLLATE database_default
    GROUP BY a.table_name, a.row_count, a.data_size
    ORDER BY CAST(REPLACE(a.data_size, ' KB', '') AS integer) DESC
    
    DROP TABLE #GetLargest
    

  • If you are using SQL Server Management Studio 2008, there are certain data fields you can view in the Object Explorer Details window. Simply browse to and select the Tables folder. In the details view you can right-click the column titles and add fields to the "report". Your mileage may vary if you are on SSMS 2008 Express.


  • I also found this query on SQLServerCentral very helpful; here is the link to the original post:

    Sql Server largest tables

    select name=object_schema_name(object_id) + '.' + object_name(object_id)
    , rows=sum(case when index_id < 2 then row_count else 0 end)
    , reserved_kb=8*sum(reserved_page_count)
    , data_kb=8*sum( case 
         when index_id<2 then in_row_data_page_count + lob_used_page_count + row_overflow_used_page_count 
         else lob_used_page_count + row_overflow_used_page_count 
        end )
    , index_kb=8*(sum(used_page_count) 
        - sum( case 
               when index_id<2 then in_row_data_page_count + lob_used_page_count + row_overflow_used_page_count 
            else lob_used_page_count + row_overflow_used_page_count 
            end )
         )    
    , unused_kb=8*sum(reserved_page_count-used_page_count)
    from sys.dm_db_partition_stats
    where object_id > 1024
    group by object_id
    order by rows desc
    

    In my database, this query and the one in the first answer gave different results.

    Hope somebody finds it useful.


  • @marc_s's answer is great and I've been using it for a few years. However, I noticed that the script misses data for some columnstore indexes and doesn't show the complete picture. E.g. when you SUM the TotalSpaceMB column from that script and compare it with the total space property of the database in Management Studio, the numbers don't match in my case (Management Studio shows larger numbers). I modified the script to overcome this issue and extended it a little bit:

    select
        tables.[name] as table_name,
        schemas.[name] as schema_name,
        isnull(db_name(dm_db_index_usage_stats.database_id), 'Unknown') as database_name,
        sum(allocation_units.total_pages) * 8 as total_space_kb,
        cast(round(((sum(allocation_units.total_pages) * 8) / 1024.00), 2) as numeric(36, 2)) as total_space_mb,
        sum(allocation_units.used_pages) * 8 as used_space_kb,
        cast(round(((sum(allocation_units.used_pages) * 8) / 1024.00), 2) as numeric(36, 2)) as used_space_mb,
        (sum(allocation_units.total_pages) - sum(allocation_units.used_pages)) * 8 as unused_space_kb,
        cast(round(((sum(allocation_units.total_pages) - sum(allocation_units.used_pages)) * 8) / 1024.00, 2) as numeric(36, 2)) as unused_space_mb,
        count(distinct indexes.index_id) as indexes_count,
        max(dm_db_partition_stats.row_count) as row_count,
        iif(max(isnull(user_seeks, 0)) = 0 and max(isnull(user_scans, 0)) = 0 and max(isnull(user_lookups, 0)) = 0, 1, 0) as no_reads,
        iif(max(isnull(user_updates, 0)) = 0, 1, 0) as no_writes,
        max(isnull(user_seeks, 0)) as user_seeks,
        max(isnull(user_scans, 0)) as user_scans,
        max(isnull(user_lookups, 0)) as user_lookups,
        max(isnull(user_updates, 0)) as user_updates,
        max(last_user_seek) as last_user_seek,
        max(last_user_scan) as last_user_scan,
        max(last_user_lookup) as last_user_lookup,
        max(last_user_update) as last_user_update,
        max(tables.create_date) as create_date,
        max(tables.modify_date) as modify_date
    from 
        sys.tables
        left join sys.schemas on schemas.schema_id = tables.schema_id
        left join sys.indexes on tables.object_id = indexes.object_id
        left join sys.partitions on indexes.object_id = partitions.object_id and indexes.index_id = partitions.index_id
        left join sys.allocation_units on partitions.partition_id = allocation_units.container_id
        left join sys.dm_db_index_usage_stats on tables.object_id = dm_db_index_usage_stats.object_id and indexes.index_id = dm_db_index_usage_stats.index_id
        left join sys.dm_db_partition_stats on tables.object_id = dm_db_partition_stats.object_id and indexes.index_id = dm_db_partition_stats.index_id
    group by schemas.[name], tables.[name], isnull(db_name(dm_db_index_usage_stats.database_id), 'Unknown')
    order by 5 desc
    

    Hope it will be helpful for someone. This script was tested against large, terabyte-scale databases with hundreds of different tables, indexes, and schemas.