Filtering data, whether it comes from log or config files or is returned by an API, is an important step for removing noise and making further analysis possible.
In Bash most would gravitate towards grep, egrep or awk.
Classic examples would be
- looking at your history
$ history | grep ssh
ssh marco@0.0.0.0
ssh deploy@0.0.0.0
ssh admin@0.0.0.0
- looking for errors in a log
$ cat /var/log/app.log | grep error
2020-09-30 09:01:17 error: twitter api responded with 403
2020-09-30 09:16:05 error: secret plan to take over the world failed
grep
$ grep api app.conf
api_key_twitter: '***'
api_key_digitalocean: '***'
This will list all settings containing api in the configuration of my fictitious app.
With PowerShell we will use Select-String and define our search term in the -Pattern parameter.
PS> Select-String -Path "app.conf" -Pattern "api"
app.conf:2:api_key_twitter: '***'
app.conf:4:api_key_digitalocean: '***'
Note
PowerShell includes the file name and line number of each match by default. If you want to disable this, add -Raw to your command (available in PowerShell 7 and later).
grep -v
$ grep -v api app.conf
# Super awesome app config!
super_secret_setting: 'the cake is a lie!'
This will list all settings not containing api in the configuration of my fictitious app.
PowerShell can replicate this with the -NotMatch parameter.
PS> Select-String -Path "app.conf" -Pattern "api" -NotMatch
app.conf:1:# Super awesome app config!
app.conf:3:super_secret_setting: 'the cake is a lie!'
RegEx Patterns
PowerShell natively supports regular expressions, so writing more complicated filters is no problem.
PS> Select-String -Path "app.conf" -Pattern "^\w*:"
app.conf:2:api_key_twitter: '***'
app.conf:3:super_secret_setting: 'the cake is a lie!'
app.conf:4:api_key_digitalocean: '***'
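Regular expressions also allow capturing parts of a line. The following is only a minimal sketch: the group names key and value are my own choice for illustration, and the lines of app.conf are piped in as plain strings instead of being read from a file.

```powershell
# The sample config from this article, held as an array of strings.
$lines = @(
    "# Super awesome app config!"
    "api_key_twitter: '***'"
    "super_secret_setting: 'the cake is a lie!'"
    "api_key_digitalocean: '***'"
)

# Named groups (?<key>...) and (?<value>...) are illustrative names, not part of the article.
# Each MatchInfo carries the underlying regex match in its Matches property.
$settings = $lines |
    Select-String -Pattern '^(?<key>\w+):\s*(?<value>.+)$' |
    ForEach-Object { $_.Matches[0].Groups['key'].Value }

$settings
```

The comment line is skipped because it does not start with a word character, so only the three setting names are returned.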
Multiple Files
$ grep super_secret_setting *.py
debug.py:print(conf.super_secret_setting)
server.py:# TODO: Maybe we should not print super_secret_setting
server.py:#print(conf.super_secret_setting)
This will list all Python code using my super_secret_setting.
PowerShell can replicate this with a similar syntax.
PS> Select-String -Path "*.py" -Pattern "super_secret_setting"
debug.py:7:print(conf.super_secret_setting)
server.py:14:# TODO: Maybe we should not print super_secret_setting
server.py:15:#print(conf.super_secret_setting)
Searching through files recursively is really simple in Bash too.
$ grep -r --include="*.py" super_secret_setting .
./debug.py:print(conf.super_secret_setting)
./server.py:# TODO: Maybe we should not print super_secret_setting
./server.py:#print(conf.super_secret_setting)
./utils/secrets.py:if conf.super_secret_setting:
Unfortunately this is not as simple with PowerShell as you need to chain multiple commands together.
PS> Get-ChildItem -Recurse -Include "*.py" | Select-String -Pattern "super_secret_setting"
debug.py:7:print(conf.super_secret_setting)
server.py:14:# TODO: Maybe we should not print super_secret_setting
server.py:15:#print(conf.super_secret_setting)
utils/secrets.py:4:if conf.super_secret_setting:
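To try the recursive search without touching a real project, here is a small sketch that builds a throwaway directory tree first. The directory name sls-recurse-demo and the notes.txt file are assumptions made up for this example, and it uses -Filter instead of -Include, which is an alternative way to narrow down the files.

```powershell
# Build a small, hypothetical project tree in the temp directory.
$root = Join-Path ([System.IO.Path]::GetTempPath()) "sls-recurse-demo"
New-Item -ItemType Directory -Path (Join-Path $root "utils") -Force | Out-Null
Set-Content -Path (Join-Path $root "debug.py") -Value "print(conf.super_secret_setting)"
Set-Content -Path (Join-Path $root "utils/secrets.py") -Value "if conf.super_secret_setting:"
Set-Content -Path (Join-Path $root "notes.txt") -Value "super_secret_setting is documented here"

# Get-ChildItem walks the tree, Select-String searches every file it receives.
$hits = Get-ChildItem -Path $root -Recurse -Filter "*.py" |
    Select-String -Pattern "super_secret_setting"

# Only the Python files should show up, notes.txt is filtered out.
$files = $hits | Select-Object -ExpandProperty Filename | Sort-Object
$files
```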
Tips & Tricks
This is a new section I wanted to introduce to shed light on some minor features that are not worth a full example, but are still interesting and useful from time to time.
- Sometimes you do not require all matches for something but rather a quick scan of files containing at least one match. This can be achieved with the -List switch.
- Other times you may require getting all matches, even multiple ones per line. -AllMatches will provide you with just that.
- Search case sensitive by using -CaseSensitive.
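The three switches above can be tried out with a short sketch like this one. It recreates the article's app.conf in a temporary directory; the directory name sls-demo and the variable names are assumptions chosen for the example.

```powershell
# Recreate the sample config in a temporary, hypothetical location.
$dir = Join-Path ([System.IO.Path]::GetTempPath()) "sls-demo"
New-Item -ItemType Directory -Path $dir -Force | Out-Null
$conf = Join-Path $dir "app.conf"
Set-Content -Path $conf -Value @(
    "# Super awesome app config!"
    "api_key_twitter: '***'"
    "super_secret_setting: 'the cake is a lie!'"
    "api_key_digitalocean: '***'"
)

# -List stops after the first match in each file.
$first = Select-String -Path $conf -Pattern "api" -List

# -AllMatches collects every match on a line, not just the first one.
$all = Select-String -Path $conf -Pattern "_" -AllMatches

# -CaseSensitive means "API" no longer matches "api".
$cased = Select-String -Path $conf -Pattern "API" -CaseSensitive
```

Here $first holds a single match (line 2), every entry in $all lists two underscores in its Matches property, and $cased stays empty.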
Shortcuts
Writing out Select-String every time is nothing to be desired. Thankfully we can take some shortcuts and even bring in our Bash muscle memory.
- Shorten Select-String by using its alias sls
- Just like grep
  - -Pattern can be provided as positional argument 0
  - -Path can be provided as positional argument 1
- Omit "" around your strings as long as they have no spaces inside
- Commands are always case insensitive, so writing them in lowercase works too; this is more of a personal preference
With these shortcuts our commands are way shorter.
Select-String -Path "app.conf" -Pattern "api"
sls api app.conf
Select-String -Path "app.conf" -Pattern "api" -NotMatch
sls api app.conf -NotMatch
Select-String -Path "app.conf" -Pattern "^\w*:"
sls "^\w*:" app.conf
Select-String -Path "*.py" -Pattern "super_secret_setting"
sls super_secret_setting *.py
Get-ChildItem -Recurse -Include "*.py" | Select-String -Pattern "super_secret_setting"
dir -Recurse -Include *.py | sls super_secret_setting
Taking It Further
This covers basic examples for replacing grep in PowerShell.
Thanks to PowerShell's object-oriented pipeline, the information we see on screen is not everything we get to work with.
Let's take one of the previous examples and plug it into a fictional pipeline that creates a report of all usages of super_secret_setting.
Select-String -Path "*.py" -Pattern "super_secret_setting" | Select-Object Path,LineNumber | Export-Csv -Path api_conf.csv -NoTypeInformation
Obviously this is just an example demonstrating the many possibilities of having a rich object describing the match. You could write the matches to a variable for later use, loop over them, or do anything else a real programming language allows you to do.
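Storing the matches in a variable and looping over them could look like the sketch below. The variable names and the output format are made up for illustration, and the config lines are piped in as strings to keep the example self-contained.

```powershell
# The sample config from this article, held as an array of strings.
$lines = @(
    "# Super awesome app config!"
    "api_key_twitter: '***'"
    "super_secret_setting: 'the cake is a lie!'"
    "api_key_digitalocean: '***'"
)

# Keep the MatchInfo objects around instead of only printing them.
$found = $lines | Select-String -Pattern "api"

# Build one report line per match from the object's properties.
$report = foreach ($match in $found) {
    "{0}: {1}" -f $match.LineNumber, $match.Line
}

$report
```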
For quick reference, these are all the values accessible inside every MatchInfo object returned.
IgnoreCase : True
LineNumber : 2
Line : api_key_twitter: '***'
Filename : app.conf
Path : /Users/mkamner/projects/mkamner/blog/app.conf
Pattern : ^\w*:
Context :
Matches : {0}
The Context value can be interesting to use, as it provides us with the possibility to get X lines before and after our matching line in the resulting object.
# 3 lines before and after
-Context 3
# 1 line before, 3 after
-Context 1,3
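A small sketch of reading the surrounding lines back out of the match object follows. It assumes the article's app.conf recreated in a temporary directory (the name sls-context-demo is made up); PreContext and PostContext are the properties on the Context value that hold the lines before and after the match.

```powershell
# Recreate the sample config in a temporary, hypothetical location.
$dir = Join-Path ([System.IO.Path]::GetTempPath()) "sls-context-demo"
New-Item -ItemType Directory -Path $dir -Force | Out-Null
$conf = Join-Path $dir "app.conf"
Set-Content -Path $conf -Value @(
    "# Super awesome app config!"
    "api_key_twitter: '***'"
    "super_secret_setting: 'the cake is a lie!'"
    "api_key_digitalocean: '***'"
)

# Ask for 1 line before and 1 line after the match.
$hit = Select-String -Path $conf -Pattern "super_secret" -Context 1,1

# Both properties are string arrays with one entry per requested line.
$before = $hit.Context.PreContext
$after = $hit.Context.PostContext
```

With this input, $before contains the twitter key line and $after the digitalocean key line.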
A great resource to dive in even deeper is the official documentation from Microsoft.
Feel free to ask me about your PowerShell problem over on Twitter!
This is a multi part series, read more.