Thursday 5 March 2015

Bulk Indexing documents to Multiple Libraries using Powershell

Suppose we have a Document Library with large no. documents and we want to update the metadata which is kept in an excel file from local drive using PowerShell. Like this we have Multiple document Libraries and Metadata templates. Below is the process to iterate through each document library with its excel templates.
Steps

As all of you know there is no direct way to update metadata from Excel. So we have to convert excel to CSV format. And avoid space in column names in the CSV file.
1. Save as excel file to CSV format
2. Create a folder structure in windows drive to store the list templates in CSV format.


3. Create a list in SharePoint to store the folder path for each CSV template corresponding to each Library. Here the List name is 'IndexList'
4. Map the column names such as LibraryName, LibraryUrlName and TemplatePath in the script
5. Map the column names from CSV to the corresponding column names from SharePoint Library in the script.


if((Get-PSSnapin "Microsoft.SharePoint.PowerShell") -eq $null

{

Add-PSSnapin Microsoft.SharePoint.PowerShell

} 

$url = "http://sitename:portno/"

$OpenWeb = Get-SPWeb $url

$DocumentIndexing = $OpenWeb.Lists["IndexList"]

ForEach($item in $DocumentIndexing.Items)

{

$docLibraryName = $item["LibraryName"]

$docLibraryUrlName = $item["LibraryUrlName"]

$localFolderPath = $item["LinkTitle"]

$docLibrary = $OpenWeb.Lists[$docLibraryName]

Write-Host $localFolderPath

$file = Get-Content "$localFolderPath"

#Open file

function Update-SPList()

{

$data = Import-Csv $localFolderPath

$OpenWeb = Get-SPWeb $url

$folder = $OpenWeb.getfolder($docLibraryUrlName)

Write-Host $folder

foreach($row in $data)

{

#CAML query compares LinkFilename in the Library and Name Column in the CSV

$spQuery = New-Object Microsoft.SharePoint.SPQuery

$camlQuery = '<Where><Eq><FieldRef Name="LinkFilename" /><Value Type="Text">'+ $row.Name +'</Value></Eq></Where>'

$spQuery.Query = $camlQuery

$listItems = $docLibrary.GetItems($spQuery)

foreach($item in $listItems)

{

# List Out all column names from all CSV Templates of without Duplication and pass the value to Library columns.

   Write-Host "Updating" $item["LinkFilename"] "with" $row.Name

   $item["Content Type"] = $row.Document_Category

   $item["DATE_x0028_MM_x002f_YY_x0029_"] = $row.DATE

   $item["Location"] = $row.Location

   $item["Licence_x0020_No"] = $row.Licence_No

   $item["Center"]=$row.Center

   $item["Ticket"]=$row.Ticket_No

   $item["Amount"]=$row.Amount

   $item["Course"]=$row.Course

   $item.Update()

}

#Clear all rows except the CSV column Names

(Get-Content $localFolderPath |  Select -First 1) | Out-File $localFolderPath 

}
}

Update-SPList

if($OpenWeb)

{

$OpenWeb.Dispose()

}

}


Once it is updated the rows will be automatically deleted from the CSV template .