Did you ever had the urge or need to get an overview of documents uploaded to your SharePoint installation.
One of our customers asked us to give them an overview of Microsoft Office files (xlsx, xls, doc, docx, ppt, pptx and pdf), and they were pretty surprised by the result, maybe you will be too.
As it was a pretty big SharePoint installation (by Danish standards), I was a little worried that it would take forever for a powershell script to traverse all sites and look in each document library for files. But then I realized, powershell is not the right tool for this task, it’s a lot better/faster to use the docs view of the content database.
The final script we used for extracting the statistics looked like this – feel free to modify it to include the file types that interests you. One note the script only looks in document libraries, so if you have document stored elsewhere this script alone might not be enough.
[sql]
use <NameOfContentDB>
SELECT DirName, LeafName, Size, RIGHT(LeafName, CHARINDEX(‘.’, REVERSE(‘.’ + LeafName)) – 1) as FileType
FROM Docs with (nolock) INNER JOIN Webs with (nolock) On Docs.WebId = Webs.Id
INNER JOIN Sites with (nolock) ON Webs.SiteId = Sites.Id
WHERE Docs.Type <> 1
AND (
(LeafName LIKE ‘%.doc’)
OR (LeafName LIKE ‘%.docx’)
OR (LeafName LIKE ‘%.xlsx’)
OR (LeafName LIKE ‘%.xls’)
OR (LeafName LIKE ‘%.pptx’)
OR (LeafName LIKE ‘%.ppt’)
OR (LeafName LIKE ‘%.pdf’)
)
AND (LeafName NOT LIKE ‘%template%’)
[/sql]
In the script you have to replace the ‘NameOfContentDB’ with the name of your content database. After this the script is ready. I did a little post-processing of the numbers in excel, by adding them to a pivot table, so they could be sorted by type. And I also converted the size in bytes to MB.