Payton Flint's Tech Blog
Menu
  • Home
  • Blog
  • Categories
  • Resources
  • About
  • Contact
Menu

Cloud – Return Enterprise-wide Google Drive File Metadata as SQL Database

Posted on December 18, 2024April 20, 2025 by paytonflint

If you are managing an enterprise, you’ll undoubtedly run into storage usage concerns. Cloud storage can be relatively expensive. Some good housekeeping is vital. But, how do you determine what to clean up? There are a lot of relevant data points you can use to identify anomalous use/abuse of available storage.

I’ve written a script to get each user within your Google Workspace, get each file within each user’s Drive, and return the file ID, ModifiedTime, Name, Owner, and Size. These properties give us sufficient information to locate files matching patterns of our choosing, and identify them for deletion or further investigation. The script finally installs SQLite using Chocolatey and generates a SQLite database on-the-fly to contain all of this file information and allow you to query it. Being able to use SQL to query this data is important for performance, and to give a lot of flexibility and ease in querying the data.

If your environment is of considerable size, this can be quite a lot of data. I used this script in a Workspace with over 100,000 users to collect over 8.7 million unique files’ metadata. I’ve been able to clean up terabytes, ultimately saving on costs. This is an elegant solution to get a birds-eye view of your enterprise’s storage usage within Google Workspace. With the built-in tools, this is simply not a possibility. Getting this kind of granular insight is crucial as an administrator to perform housekeeping, but it isn’t a simple task.

Here’s a link to the script on my GitHub: https://github.com/p8nflnt/Cloud-Toolbox/blob/main/Google/Get-GoogleDriveFileMetadataDb.ps1

<#
.SYNOPSIS
    - Get each user from Google Workspace via API
    - Get each user's drive file metadata from Google via API
    - Export all users' drive file metadata to .csv
    - Convert .csv to SQLite database for querying

.NOTES
    Name: Get-GoogleDriveFileMetadataDb.ps1
    Author: Payton Flint
    Version: 1.0
    DateCreated: 2024-Dec

.LINK
    https://github.com/p8nflnt/Cloud-Toolbox/blob/main/Google/Get-GoogleDriveFileMetadataDb.ps1
    
Cloud – Return Enterprise-wide Google Drive File Metadata as SQL Database
#> Function Test-ElevatedShell { # Check if the current user has administrative privileges $isAdmin = ([Security.Principal.WindowsPrincipal] [Security.Principal.WindowsIdentity]::GetCurrent()).IsInRole([Security.Principal.WindowsBuiltInRole]::Administrator) if (-not $isAdmin) { Write-Warning "You are not running this script with administrator privileges. Please restart the script as an administrator." return $false } } Function Add-NuGet { $package = Get-PackageSource -Name 'Nuget' -ErrorAction SilentlyContinue if ($package.IsTrusted -eq $False) { Write-Host "NuGet is installed, but is not trusted." Write-Host "Setting NuGet as trusted source." Set-PackageSource -Name 'Nuget' -Trusted -Force } elseif ($package -eq $null) { Write-Host "NuGet is not currently a registered source." Write-Host "Registering NuGet as trusted source." Register-PackageSource -Name Nuget -Location "https://www.nuget.org/api/v2" -ProviderName Nuget -Trusted -Force } else { Write-Host "NuGet is currently registered as a trusted source." } } Function Install-BouncyCastle { # Retrieve installed package information $bouncyCastle = Get-Package BouncyCastle -ErrorAction SilentlyContinue # If BouncyCastle package is not present... If (!($bouncyCastle)) { Write-Host "BouncyCastle not found, downloading..." # Install BouncyCastle package for cryptographic processing Install-Package BouncyCastle -ErrorAction SilentlyContinue } Else { # Locate BouncyCastle .DLL file $bouncyCastle = $bouncyCastle.Source | Split-Path $bouncyCastle = $(Get-ChildItem -Path $bouncyCastle -Recurse -Filter *.dll | Select-Object -First 1).FullName } # If BouncyCastle .DLL was found... If ($bouncyCastle) { Write-Host "BouncyCastle present, loading assembly to current session..." Add-Type -Path $bouncyCastle -ErrorAction Stop Write-Host "BouncyCastle assembly loaded into the current session successfully." } Else { Write-Host "No BouncyCastle .DLL file found." } # Return .DLL file path for reference post-install #return $bouncyCastle } Function Get-GoogleAccessToken { param ( [string]$scope, # OAuth permission scope(s) - multiple scopes should be space-separated [string]$keyFilePath, # Path to service account key file path [string]$user, # Subject - Email of the user to impersonate [int]$ttl # Token time-to-live in seconds (3600 default) ) # If expiration not specified, set default If (!($ttl)) { [int]$ttl = 3600 } # Get client_email from JSON key file $jsonContent = Get-Content -Raw -Path $keyFilePath | ConvertFrom-Json $svcAcct = $jsonContent.client_email # JWT Header $header = @{ alg = "RS256" typ = "JWT" } | ConvertTo-Json | Out-String # JWT Payload $now = [int](Get-Date -Date (Get-Date).ToUniversalTime() -UFormat %s) $exp = $now + $ttl # Token expiration $payload = @{ iss = $svcAcct scope = $scope # OAuth permission scope(s) aud = "https://oauth2.googleapis.com/token" # Audience sub = $user # Email of the user to impersonate iat = [math]::floor((Get-Date).ToUniversalTime().Subtract([datetime]'1970-01-01').TotalSeconds) exp = [math]::floor((Get-Date).ToUniversalTime().AddHours(1).Subtract([datetime]'1970-01-01').TotalSeconds) } | ConvertTo-Json -Compress # Function for Base64 URL-safe encoding function Encode-UrlBase64 { param([byte[]]$inputBytes) $base64 = [Convert]::ToBase64String($inputBytes).TrimEnd('=') $base64 = $base64.Replace('+', '-').Replace('/', '_') return $base64 } # Convert Header and Payload to Base64 $headerBase64 = Encode-UrlBase64 -inputBytes ([System.Text.Encoding]::UTF8.GetBytes($header)) $payloadBase64 = Encode-UrlBase64 -inputBytes ([System.Text.Encoding]::UTF8.GetBytes($payload)) # Extract private key from JSON file $pvtKeyString = $jsonContent.private_key -replace "-----BEGIN PRIVATE KEY-----", "" -replace "-----END PRIVATE KEY-----", "" -replace "\s+", "" $pvtKeyBytes = [Convert]::FromBase64String($pvtKeyString) # Convert the private key into an RSA key using BouncyCastle's PrivateKeyFactory $pvtKeyInfo = [Org.BouncyCastle.Asn1.Pkcs.PrivateKeyInfo]::GetInstance($pvtKeyBytes) $pvtKey = [Org.BouncyCastle.Security.PrivateKeyFactory]::CreateKey($pvtKeyInfo) # Create the signer object for RSA/SHA256 $signer = New-Object Org.BouncyCastle.Crypto.Signers.RsaDigestSigner ([Org.BouncyCastle.Crypto.Digests.Sha256Digest]::new()) $signer.Init($true, $pvtKey) # Create the unsigned JWT $unsignedJwt = "$headerBase64.$payloadBase64" # Sign the JWT $signer.BlockUpdate([System.Text.Encoding]::UTF8.GetBytes($unsignedJwt), 0, $unsignedJwt.Length) $signature = $signer.GenerateSignature() # Convert signature to URL-safe base64 $signatureBase64 = Encode-UrlBase64 -inputBytes $signature $jwt = "$unsignedJwt.$signatureBase64" # Exchange the JWT for an access token $requestUri = "https://oauth2.googleapis.com/token" $body = @{ grant_type = "urn:ietf:params:oauth:grant-type:jwt-bearer" assertion = $jwt } # POST JWT for access token $response = Invoke-RestMethod -Uri $requestUri -Method POST -Body $body -ContentType "application/x-www-form-urlencoded" # Output the access token return $response.access_token } # Retrieve all users from Admin SDK Function Get-AllUsers { param ( [string]$accessToken ) $users = @() $nextPageToken = $null do { $url = "https://admin.googleapis.com/admin/directory/v1/users?customer=my_customer&maxResults=500" if ($nextPageToken) { $url += "&pageToken=$nextPageToken" } $response = Invoke-RestMethod -Uri $url -Headers @{ Authorization = "Bearer $accessToken" Accept = "application/json" } -Method Get $users += $response.users $nextPageToken = $response.nextPageToken } while ($nextPageToken) return $users } # Query Google for user-owned drive files function Get-UserOwnedDriveFiles { param ( [string]$AccessToken, # Access token passed as a parameter [string]$ModifiedAfter, # DateTime in format '1/1/1970 00:00:00 AM' [string]$ModifiedBefore # DateTime in format '1/1/1970 00:00:00 AM' ) # Ensure only one of ModifiedAfter or ModifiedBefore is provided if ($ModifiedAfter -and $ModifiedBefore) { throw "You cannot specify both -ModifiedAfter and -ModifiedBefore. Please provide only one of these parameters." } # Nested helper function for ISO 8601 conversion function ConvertToISO8601 { param ([string]$DateTimeString) return ([datetime]::Parse($DateTimeString)).ToUniversalTime().ToString("yyyy-MM-ddTHH:mm:ssZ") } # Initialize query components $query = @() # Add ownership filter $query += "'me' in owners" # Convert dates and build query components only if parameters are specified if ($ModifiedAfter) { $ModifiedAfterISO = ConvertToISO8601 -DateTimeString $ModifiedAfter $query += "modifiedTime > '$ModifiedAfterISO'" } if ($ModifiedBefore) { $ModifiedBeforeISO = ConvertToISO8601 -DateTimeString $ModifiedBefore $query += "modifiedTime < '$ModifiedBeforeISO'" } # Determine if a query is needed $queryString = if ($query.Count -gt 0) { "q=" + [System.Uri]::EscapeDataString($query -join " and ") } else { "" } # Construct the base URI $baseUri = "https://www.googleapis.com/drive/v3/files" $fieldsParam = "fields=nextPageToken,files(id,modifiedTime,name,owners,size)" $files = @() $pageToken = $null do { # Construct URI for this iteration $uri = if ($queryString -ne "") { "$baseUri`?$queryString`&$fieldsParam" } else { "$baseUri`?$fieldsParam" } if ($pageToken) { $uri += "&pageToken=$pageToken" } # Make the API call try { $response = Invoke-RestMethod -Uri $uri ` -Headers @{ "Authorization" = "Bearer $AccessToken" } ` -Method Get $files += $response.files $pageToken = $response.nextPageToken } catch { Write-Host "Error: $($_.Exception.Message)" -ForegroundColor Red # Check if Response exists and try to capture the content early if ($_.Exception.Response -is [System.Net.Http.HttpResponseMessage]) { $responseObject = $_.Exception.Response try { $responseBody = $responseObject.Content.ReadAsStringAsync().Result Write-Host "Response Body: $responseBody" -ForegroundColor Red } catch { Write-Host "Unable to read response content." -ForegroundColor Yellow } } else { Write-Host "No response content available." -ForegroundColor Yellow } break } } while ($pageToken) # Get owner email address from $_.owners and write as single user/property ForEach ($file in $files) { $file | Add-Member -MemberType NoteProperty -Name owner -Value $file.Owners[0].EmailAddress -Force } # Return results return $files | Select-Object -Property ID, ModifiedTime, Name, Owner, Size } function Install-SQLite { # Check if the current user has administrative privileges Function Test-ElevatedShell { $isAdmin = ([Security.Principal.WindowsPrincipal] [Security.Principal.WindowsIdentity]::GetCurrent()).IsInRole([Security.Principal.WindowsBuiltInRole]::Administrator) if (-not $isAdmin) { Write-Warning "You are not running this script with administrator privileges. Please restart the script as an administrator." return $false } } # Check if SQLite is already installed $sqliteInstalled = (Get-Command sqlite3 -ErrorAction SilentlyContinue) -ne $null if (-not $sqliteInstalled) { # Ensure script is run as admin if (Test-ElevatedShell) { # Ensure Chocolatey is installed $chocoInstalled = (Get-Command choco -ErrorAction SilentlyContinue) -ne $null if (-not $chocoInstalled) { Write-Host "Installing Chocolatey..." Invoke-Expression (New-Object Net.WebClient).DownloadString('https://chocolatey.org/install.ps1') } # Install SQLite using Chocolatey Write-Host "Installing SQLite..." choco install sqlite -y Write-Host "Setup complete. SQLite version: $(sqlite3 --version)" -ForegroundColor Cyan } } } function Convert-CsvToSQLite { param ( [Parameter(Mandatory = $true)] [string]$csvPath, # Mandatory path to csv file [Parameter(Mandatory = $true)] [string]$dbPath, # Mandatory path for db export [string]$tableName = "Data" # Table name ) # Validate filename/extension $fileName = [System.IO.Path]::GetFileName($dbPath) $extension = [System.IO.Path]::GetExtension($dbPath) if (-not $fileName -or $extension -ne ".db") { Write-Error "The specified file path does not include a file name with an extension." } # Get columns/schema from .csv Write-Host "Analyzing CSV file to generate schema..." $csvData = Import-Csv -Path $csvPath $columns = $csvData[0].PSObject.Properties.Name $columnDefinitions = $columns | ForEach-Object { "[$_] TEXT" } $columnDefinitions = $columnDefinitions -join ", " Write-Host "Detected columns: $($columns -join ', ')" # Create sqlite db & table Write-Host "Creating SQLite database, `"$dbPath`" and table, `"Data`"" $tableCreationCommand = "CREATE TABLE IF NOT EXISTS $tableName ($columnDefinitions);" sqlite3 $dbPath $tableCreationCommand # Import data into the database using .import Write-Host "Populating SQLite database with data from CSV..." $importScript = @" .mode csv .import '$csvPath' $tableName "@ # Execute the script using the SQLite command-line tool $tempFile = [System.IO.Path]::GetTempFileName() Set-Content -Path $tempFile -Value $importScript sqlite3 $dbPath ".read $tempFile" # Remove the temporary file Remove-Item -Path $tempFile -Force Write-Host "CSV data successfully imported into SQLite database: $dbPath" } $keyFilePath = <KEY FILE PATH> # .json key file path $initUser = <USERNAME> # User for generating the token to retrieve all users $csvReportPath = <CSV FILE PATH> # .csv out file path $dbPath = <SQLITE DB PATH> # .db out file path (sqlite) # Only proceed if executed with elevated privileges if (Test-ElevatedShell) { # Add NuGet repository if it is not already configured Add-NuGet # Install BouncyCastle .DLL for cryptographic processing Install-Package BouncyCastle # Install BouncyCastle .DLL and get path for reference Install-BouncyCastle # Provide scope for initial token creation $tokenScope = "https://www.googleapis.com/auth/admin.directory.user.readonly" # Get new access token from Google for user $accessToken = Get-GoogleAccessToken -scope $tokenScope -keyFilePath $keyFilePath -user $initUser # Retrieve all users $users = Get-AllUsers -accessToken $accessToken # Initialize array, counter, adjust token scope for loop $report = @() $i = 0 $tokenScope = "https://www.googleapis.com/auth/drive.metadata.readonly" # Loop through each user to get their Drive usage foreach ($user in $users) { # Increment counter & print info to console $i++ Write-Host "User Number: $i of $($users.count)" -ForegroundColor Cyan Write-Host "Username: $user.primaryEmail" -ForegroundColor Cyan # Get new access token from Google for user $accessToken = $null $accessToken = Get-GoogleAccessToken -scope $tokenScope -keyFilePath $keyFilePath -user $user.primaryEmail # Initialize & get user's drive files $files = $null $files = Get-UserOwnedDriveFiles -AccessToken $AccessToken # If files are found, append to .csv if ($files.Count -gt 0) { $files | Export-Csv -Path $csvReportPath -Append -NoTypeInformation } # Print user's file count to console Write-Host "File Count: $($files.Count)" -ForegroundColor Green } # Install sqlite Install-SQLite # Convert .csv file to sqlite db Convert-CsvToSQLite -csvPath $csvReportPath -dbPath $dbPath }

Leave a Reply Cancel reply

Your email address will not be published. Required fields are marked *

About The Author

Author's Portrait

In my journey as a technologist and 11 years of experience as an IT professional, I have found my niche as Director of Infrastructure Services; developing my skillsets in management, scripting, cloud infrastructure, identity management, and networking.

I have experience as a Systems Administrator and Engineer for large enterprises including the DoD, government agencies, and a nuclear-generation site.

I've been blessed to collaborate with engineers at esteemed Fortune 50 corporations, and one of Africa's largest, to ensure successful implementation of my work.

GitHub Button

Credentials

M365 Endpoint Administrator Associate
M365 Fundamentals
Microsoft AZ-900
CompTIA CSIS
CompTIA CIOS
CompTIA Security+
CompTIA Network+
CompTIA A+
  • April 2025
  • December 2024
  • November 2024
  • October 2024
  • September 2024
  • August 2024
  • May 2024
  • April 2024
  • March 2024
  • February 2024
  • January 2024
  • December 2023
  • November 2023
  • October 2023
  • September 2023
  • August 2023
  • July 2023
  • June 2023
  • May 2023
  • April 2023
  • March 2023
  • February 2023
  • January 2023
  • December 2022
  • November 2022
  • October 2022
  • September 2022
  • August 2022
© 2022 Payton Flint | The views and opinions expressed on this website belong solely to the author/owner and do not represent the perspectives of any individuals, institutions, or organizations, whether affiliated personally or professionally, unless explicitly stated otherwise. The content and products on this website are provided as-is with no warranties or guaranties, are for informational/demonstrative purposes only, do not constitute professional advice, and are not to be used maliciously. The author/owner is not responsible for any consequences arising from actions taken based on information provided on this website, nor from the use/misuse of products from this site. All trademarks are the property of their respective owners.