StorageAnalyzer

StorageAnalyzer class

class sqliteanalyzer.StorageAnalyzer(db_path: str)[source]

Extracts storage-space usage statistics from an SQLite3 database.

It uses as a starting point the metrics provided by the DBSTAT virtual table.

Parameters:db_path – path to an SQLite3 database file

Note

SQLite3 must have been compiled with the -DSQLITE_ENABLE_DBSTAT_VTAB flag enabled.

References

https://www.sqlite.org/dbstat.html

item_count() → int[source]

Number of rows defined in table SQLITE_MASTER.

Returns:SELECT COUNT(*) from SQLITE_MASTER
file_size() → int[source]

Physical size of the database in bytes, as reported by os.stat().

Returns:Size of the database [bytes]
logical_file_size() → int[source]

Number of bytes that the database should take given the size of a page and the number of pages it contains.

If there is no compression, then this value is equal to the physical file size (file_size()).

Returns:Logical size of the database [bytes]
page_size() → int[source]

Size in bytes of the database pages.

Returns:PRAGMA page_size [bytes]
page_count() → int[source]

Number of reported pages in the database.

Returns:PRAGMA page_count
calculated_free_pages() → int[source]

Number of free pages.

Returns:page_count() - in_use_pages() - autovacuum_page_count()
calculated_page_count() → int[source]

Number of calculated pages in the database.

Returns

The sum of pages in use, pages in the freelist and pages in the autovacuum pointer map.

page_count() + in_use_pages() + autovacuum_page_count()

freelist_count() → int[source]

Number of pages in the freelist.

Those are unused pages in the database.

Returns:PRAGMA freelist_count
pages() → [<class 'sqliteanalyzer.types.Page'>][source]

Returns the definition for all pages in the database.

It is a dump of the DBSTAT virtual table.

Reference:
https://www.sqlite.org/dbstat.html
Returns:a list of Page objects
in_use_pages() → int[source]

Number of pages currently in use.

Returns:leaf_pages + internal_pages + overflow_pages
in_use_percent() → float[source]

Percentage of pages from the total that are currently in use.

Returns:% of pages of the DB that are currently in use
tables() → [<class 'str'>][source]

Names of the tables defined in the database.

Returns:tables in the database
indices() → [<class 'sqliteanalyzer.types.Index'>][source]

Returns the indices defined in the database.

Returns:a list of Index
index_list(table: str) → [<class 'sqliteanalyzer.types.IndexListEntry'>][source]

Given a table, returns its entries in PRAGMA index_list.

Returns:A list of IndexListEntry namedtuples.

References

https://sqlite.org/pragma.html#pragma_index_list

ntable() → int[source]

Number of tables in the database.

nindex() → int[source]

Number of indices in the database.

nautoindex() → int[source]

Number of automatically-created indices in the database.

nmanindex() → int[source]

Number of manually-created indices in the database.

payload_size() → int[source]

Space in bytes used by the user’s payload.

It does not include the space used by the sqlite_master table nor any indices.

is_compressed() → bool[source]

Returns whether the database file is compressed.

autovacuum_page_count() → int[source]

The number of pages used by the auto-vacuum pointer map.

table_space_usage() → {}[source]

Space used by each table in the database.

Returns:A dictionary from table names to page counts.
table_page_count(name: str, exclude_indices=False) → int[source]

Number of pages that the table is currently using.

If exclude_indices == True, then it does not count those pages taken by indices that might point to that table.

Parameters:
  • name – name of the table
  • exclude_indices – whether to avoid counting pages used
  • indices on the table. (by) –
index_page_count(name: str) → int[source]

Number of pages that the index is currently using.

Parameters:name – name of the index
Returns:number of pages
index_stats(name: str) → sqliteanalyzer.storageanalyzer.StorageMetrics[source]

Returns statistics for the index.

Parameters:name – name of the index
Returns:a StorageMetrics object
table_stats(name: str, exclude_indices=False) → sqliteanalyzer.storageanalyzer.StorageMetrics[source]

Returns statistics for a table.

The value of the optional parameter exclude_indices, determines whether indices are considered part of the actual table or not.

Parameters:name – name of the table
Returns:a StorageMetrics object
global_stats(exclude_indices=False) → sqliteanalyzer.storageanalyzer.StorageMetrics[source]

Storage metrics for all tables and/or indices in the database

The value of the optional parameter exclude_indices determines whether indices are considered.

Parameters:exclude_indices – bool: if False, space used by indices is not considered.
Returns:a StorageMetrics object
indices_stats() → sqliteanalyzer.storageanalyzer.StorageMetrics[source]

Return metadata about the indices in the database.

Raises:ValueError – If no indices exist
is_without_rowid(table: str) → bool[source]

Returns whether the given table is a WITHOUT ROWID table.

Parameters:table – name of the table

References

https://sqlite.org/withoutrowid.html

stat_db_dump() → [<class 'str'>][source]

Returns a dump of the DB containing the stats.

Returns:list of lines containing an SQL dump of the stat database.

Return types

class sqliteanalyzer.Index(name, table)
name

str – Name of the index

table

str – Table to which the index points

class sqliteanalyzer.IndexListEntry(seq, name, unique, origin, partial)
name

str – name of the index

origin

str – How was the index created. c if it was created by a CREATE_INDEX statement, u if created by a UNIQUE constraint, or pk if created by a PRIMARY_KEY constraint

partial

bool – whether the index covers only a subset of rows of a table

seq

int – internal sequence number of the index

unique

bool – whether index is UNIQUE

class sqliteanalyzer.Page(name, path, pageno, pagetype, ncell, payload, unused, mx_payload, pgoffset, pgsize)
mx_payload

Largest payload size of all cells on the page

name

Name of table or index

ncell

Cells on page (0 for overflow)

pageno

Page number

pagetype

'internal', 'leaf' or 'overflow'

path

Path from the root to the page

payload

Bytes of payload on the page

pgoffset

Offset of the page in the file

pgsize

Size of the page

unused

Bytes of unused space on the page

class sqliteanalyzer.StorageMetrics(*args, **kwargs)[source]

Storage metrics for a given database object.

It contains the following keys:

  • 'nentry'
  • 'payload'
  • 'ovfl_payload'
  • 'mx_payload'
  • 'ovfl_cnt'
  • 'leaf_pages'
  • 'int_pages'
  • 'ovfl_pages'
  • 'leaf_unused'
  • 'int_unused'
  • 'ovfl_unused'
  • 'gap_cnt'
  • 'compressed_size'
  • 'depth'
  • 'cnt'
  • 'total_pages'
  • 'total_pages_percent'
  • 'storage'
  • 'is_compressed'
  • 'compressed_overhead'
  • 'payload_percent'
  • 'total_unused'
  • 'total_metadata'
  • 'metadata_percent'
  • 'average_payload'
  • 'average_unused'
  • 'average_metadata'
  • 'ovfl_percent'
  • 'fragmentation'
  • 'int_unused_percent'
  • 'ovfl_unused_percent'
  • 'leaf_unused_percent'
  • 'total_unused_percent