Monday, March 17, 2014

Enumerating and Scanning a Large Number of Files Recursively with PowerShell for a Pattern

Using PowerShell 3.0, I found a large performance discrepancy between the PowerShell cmdlets (depending on how you call them) and the .NET classes in the System.IO namespace.

Scenario:
Scanning a large number of files recursively (approximately 32,000 files across 238 folders) to match each filename against a pattern. In this example, 5 files match the pattern.

Code sample using the PowerShell Get-ChildItem cmdlet:

 
$userID = "TestUser"
$allContents = gci "\\Server\share" -Recurse | where { $_.Name -like "*$($userID).*" }

This command took 3-5 minutes to run, which is unacceptable for something that must be rerun continuously.

On examining this code, the problem is that the command creates a FileInfo object for every one of the 32K+ files and only then filters them with Where-Object on the pipeline.

Rewriting the Get-ChildItem call like this improved performance dramatically:
$allContents = gci "\\server\share" -Filter "*$($userID).*" -Recurse



This command returns results in 10 seconds. With -Filter, the pattern is handed to the file system provider, so the command skims the directory structure and only returns files that match. The more files that match the filter, the slower it gets, because it still returns an array of FileInfo objects. So creating the FileInfo objects appears to be the slow point, not the recursive scan of the folders.
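To check the difference between the two forms on your own data, each call can be timed with a Stopwatch. The sketch below is only an illustration: it builds a small throwaway tree under the temp folder (the folder and file names are made up), so the absolute numbers will be nothing like the network-share timings above.

```powershell
# Throwaway test tree; folder and file names are made up for this sketch.
$root = Join-Path ([IO.Path]::GetTempPath()) ("gci-filter-demo-" + [Guid]::NewGuid().ToString("N"))
foreach ($i in 1..5) {
    $dir = Join-Path $root "folder$i"
    New-Item -ItemType Directory -Path $dir -Force | Out-Null
    foreach ($j in 1..20) {
        Set-Content -Path (Join-Path $dir "Other$j.txt") -Value "x"
    }
}
Set-Content -Path (Join-Path (Join-Path $root "folder3") "TestUser.dat") -Value "x"

$userID = "TestUser"

# Form 1: enumerate everything, then filter on the pipeline.
$sw = [System.Diagnostics.Stopwatch]::StartNew()
$viaWhere = @(Get-ChildItem $root -Recurse | Where-Object { $_.Name -like "*$($userID).*" })
$sw.Stop()
$whereMs = $sw.Elapsed.TotalMilliseconds

# Form 2: let -Filter narrow the results before objects reach the pipeline.
$sw = [System.Diagnostics.Stopwatch]::StartNew()
$viaFilter = @(Get-ChildItem $root -Filter "*$($userID).*" -Recurse)
$sw.Stop()
$filterMs = $sw.Elapsed.TotalMilliseconds

"Where-Object: {0} match(es) in {1:N0} ms" -f $viaWhere.Count, $whereMs
"-Filter:      {0} match(es) in {1:N0} ms" -f $viaFilter.Count, $filterMs

# Clean up the throwaway tree.
Remove-Item $root -Recurse -Force
```

Point $root at the real share (and skip the setup and cleanup lines) to measure the actual workload.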

Code sample using .NET Class:

 
[void] [System.Reflection.Assembly]::LoadWithPartialName("System.IO")

$userID = "TestUser"
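# Collect the enumerated paths into an array (splitting Out-String output on whitespace would break paths that contain spaces)
$allContents = @([IO.Directory]::EnumerateFiles("\\server\share", "*$($userID).*", [System.IO.SearchOption]::AllDirectories))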


This command returns results in 7 seconds. The EnumerateFiles method returns a System.Collections.Generic.IEnumerable[string], a lazily evaluated sequence that contains only the full path of each matching file.
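Since EnumerateFiles only hands back path strings, FileInfo objects can be created afterwards for just the matches when the extra properties are needed. A minimal sketch of that idea, assuming a throwaway folder under the temp path (the folder layout, file names, and the $found and $fileInfos variables are made up for illustration):

```powershell
# Throwaway test tree; names are made up for this sketch.
$root = Join-Path ([IO.Path]::GetTempPath()) ("enum-demo-" + [Guid]::NewGuid().ToString("N"))
$sub  = Join-Path $root "sub"
New-Item -ItemType Directory -Path $sub -Force | Out-Null
Set-Content -Path (Join-Path $root "TestUser.ini") -Value "a"
Set-Content -Path (Join-Path $sub  "TestUser.log") -Value "b"
Set-Content -Path (Join-Path $sub  "Other.log")    -Value "c"

$userID = "TestUser"
$paths  = [IO.Directory]::EnumerateFiles($root, "*$($userID).*", [IO.SearchOption]::AllDirectories)

# Paths are produced one at a time as the loop asks for them.
$found = New-Object System.Collections.Generic.List[string]
foreach ($path in $paths) {
    $found.Add($path)
}

# Build FileInfo objects only for the handful of matches.
$fileInfos = @($found | ForEach-Object { New-Object System.IO.FileInfo -ArgumentList $_ })
$fileInfos | Sort-Object Name | ForEach-Object { "{0} ({1} bytes)" -f $_.Name, $_.Length }

# Clean up the throwaway tree.
Remove-Item $root -Recurse -Force
```

A loop like this can also break out early once the first match is found, which avoids walking the rest of the tree.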

So if you need performance and only need the file paths returned, the .NET classes appear to be slightly faster. It also shows that how you call PowerShell cmdlets can affect performance dramatically.

PowerShell Function to set account enabled for non-AD LDAP (Novell in my case)

Function SetEnableLDAPUserAccount {

<#

.DESCRIPTION

Sets an LDAP User object enabled or disabled

.EXAMPLE

SetEnableLDAPUserAccount -LDAPServer "LDAPSERVER" -LDAPPort 636 -SSL $true -targetUserDN "cn=SMITHJO1,ou=test,o=TESTTREE" -AccountDisabled $False -AuthUserName "cn=admin,o=TESTTREE" -SecPassWord $NovellCred.Password


#>

param([string]$LDAPServer,[int]$LDAPPort,[boolean]$SSL,[string]$targetUserDN,[boolean]$AccountDisabled,[string]$AuthUserName,[securestring]$SecPassWord )

#Load the assemblies

[System.Reflection.Assembly]::LoadWithPartialName("System.DirectoryServices.Protocols") | Out-Null

[System.Reflection.Assembly]::LoadWithPartialName("System.Net") | Out-Null

$Error.Clear()

Try {

#Connects to LDAP Server using specified port

$c = New-Object System.DirectoryServices.Protocols.LdapConnection "$($LDAPServer):$($LDAPPort)" -ea Stop

#Set session options

$c.SessionOptions.SecureSocketLayer = $SSL;

$c.SessionOptions.VerifyServerCertificate = { return $true;} #needed for self-signed certificates

# Pick Authentication type:

# Anonymous, Basic, Digest, DPA (Distributed Password Authentication),

# External, Kerberos, Msn, Negotiate, Ntlm, Sicily

$c.AuthType = [System.DirectoryServices.Protocols.AuthType]::Basic



#Creates a credential object to pass to bind to LDAP Connection Object

$NovellCredentials = new-object "System.Net.NetworkCredential" -ArgumentList $AuthUserName,$SecPassWord

# Bind with the network credentials. Depending on the type of server,

# the username will take different forms. Authentication type is controlled

# above with the AuthType

$c.Bind($NovellCredentials);

}

Catch

{

switch -Wildcard ($Error)

{

"*The supplied credential is invalid*" { "The Supplied LDAP Authentication Credentials for User: $($AuthUserName) were invalid." }

"*The LDAP server is unavailable*" {"Error Connecting to LDAP Server! Check that LDAP Server value of: $($LDAPServer) is correct, and available and responding on port: $($LDAPPort)"}

default {"An Unknown Error occurred attempting to connect to LDAP Server $($LDAPServer) to $(if ($AccountDisabled -eq $True){"Disable"} Else {"Enable"}) User: $($targetUserDN)'s eDirectory account."}

}

Exit 1

}



#Creating an LDAP request Object

$r = (new-object "System.DirectoryServices.Protocols.ModifyRequest")

$r.DistinguishedName = $targetUserDN;

$m = New-Object "System.DirectoryServices.Protocols.DirectoryAttributeModification"

$m.Name = "loginDisabled"; #eDirectory attribute that controls whether the account can log in

$m.Operation = [System.DirectoryServices.Protocols.DirectoryAttributeOperation]::Replace

#add value(s) of the attribute

$m.Add($AccountDisabled.ToString().toUpper()) | Out-Null

$r.Modifications.Add($m) | Out-Null

$Error.Clear()

Try

{ #Actually Try to process the request through the server

$re = $c.SendRequest($r);

}

Catch

{

switch -Wildcard ($Error)

{

"*The user has insufficient access rights*" {"The LDAP User $($AuthUserName) doesn't appear to have rights to $(if ($AccountDisabled -eq $True){"Disable"} Else {"Enable"}) the user account: $($targetUserDN)."}

default {"An Unknown Error occurred while attempting to $(if ($AccountDisabled -eq $True){"Disable"} Else {"Enable"}) the eDirectory LDAP User: $($targetUserDN)'s account."}

}

Exit 1

}

if ($re.ResultCode -ne [System.directoryServices.Protocols.ResultCode]::Success) {

$LDAPErr = "$(if ($AccountDisabled -eq $True){"Disabling"} Else {"Enabling"}) LDAP User: $($targetUserDN) account failed!

ResultCode: $($re.ResultCode)

Message: $($re.ErrorMessage)"

Return $LDAPErr

}

Else {

Return "$($re.ResultCode)! LDAP User: $($targetUserDN)'s account was $(if ($AccountDisabled -eq $True){"Disabled"} Else {"Enabled"})."

}

 

}
#End Function
#----------------------------------------------------------------------------------
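The Catch blocks in the function above run switch -Wildcard against $Error, which holds every error recorded in the session since it was last cleared. One alternative is to match only the message of the exception that was actually caught, via $_.Exception.Message. The sketch below shows that idea with a hypothetical helper, Get-LdapErrorText, which is not part of the original function; the message fragments are the ones already used above, and a thrown string stands in for a real LDAP failure.

```powershell
# Hypothetical helper (not part of the original function): maps an exception
# message to friendlier text using the same wildcard fragments as above.
function Get-LdapErrorText {
    param(
        [string]$Message,
        [string]$AuthUserName,
        [string]$LDAPServer,
        [int]$LDAPPort
    )
    switch -Wildcard ($Message) {
        "*The supplied credential is invalid*" { return "The supplied LDAP credentials for user $AuthUserName were invalid." }
        "*The LDAP server is unavailable*"     { return "Cannot reach LDAP server $LDAPServer on port $LDAPPort." }
        default                                { return "Unexpected LDAP error: $Message" }
    }
}

# Simulated failure; in the real function the Try block would call $c.Bind(...).
try {
    throw "The LDAP server is unavailable."
}
catch {
    # $_ is the error that was just caught, not the whole $Error history.
    $text = Get-LdapErrorText -Message $_.Exception.Message -AuthUserName "cn=admin,o=TESTTREE" -LDAPServer "LDAPSERVER" -LDAPPort 636
}
$text
```

Because the patterns are wildcards, they should still match when PowerShell wraps the underlying .NET exception text in its own "Exception calling ..." message.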

$NovellCred = Get-Credential -Message "Enter your Novell Username and Password. Example username: smithj" # Getting the Novell Credential Information

$EDIRLDAPServer = "LDAPSERVER"
$EDIRBaseDN = "O=TEST"
$userID = $NovellCred.UserName.ToUpper()
$TrgtUserID = "smitha"
$TrgtUserPass = "P@ssw0rd"

$PRODLDAPObj = GetLDAPObject -LDAPServer $EDIRLDAPServer -LDAPPort 389 -SSL $false -baseDN $EDIRBaseDN -Filter "(uid=$($userID))"

$userDN = ""
$userDN = (($PRODLDAPObj | select "DistinguishedName" | Out-String).Split() | Select-String -Pattern '^(\w+=\w+)(,\w+=\w+)*$').ToString().ToUpper()

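# Note: $trgtUserDN is not assigned anywhere in this snippet; it must already hold the target user's full DN
SetEnableLDAPUserAccount -LDAPServer $EDIRLDAPServer -LDAPPort 389 -SSL $false -targetUserDN $trgtUserDN -AccountDisabled $False -AuthUserName $userDN -SecPassWord $NovellCred.Password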