8.07.2011

How to find all the distinct PHP session variables that your application uses.

The short answer is a one-line *nix shell command:

grep -hor "\$_SESSION\['[A-Za-z0-9_]*'\]" * | sort -u

This command: 
  • looks through all the files in your application recursively for the PHP $_SESSION reference
  • matches any single-quoted key made up of capital letters, lowercase letters, numbers, or underscores
  • prints only the matched text, without the filenames
  • sorts the list alphabetically
  • removes the duplicate items in the list
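
To see the steps above in action without touching a real code base, here is a minimal sketch that runs the same command against a throwaway directory. The file names (index.php, lib/auth.php) and the session keys are made up for illustration, and the variable name session_vars is just a convenience for capturing the result.

```shell
# Build a small throwaway project tree (hypothetical files and keys).
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/lib"
cat > "$demo_dir/index.php" <<'EOF'
<?php
$_SESSION['user_id'] = 42;
$_SESSION['cart'] = array();
echo $_SESSION['user_id'];
EOF
cat > "$demo_dir/lib/auth.php" <<'EOF'
<?php
if (isset($_SESSION['user_id'])) { $_SESSION['last_seen2'] = time(); }
EOF

# -h hides filenames, -o prints only the matched text, -r recurses into lib/.
# sort -u then sorts the matches and drops the duplicates.
session_vars=$(cd "$demo_dir" && grep -hor "\$_SESSION\['[A-Za-z0-9_]*'\]" * | sort -u)
printf '%s\n' "$session_vars"
# prints:
#   $_SESSION['cart']
#   $_SESSION['last_seen2']
#   $_SESSION['user_id']

# Clean up the throwaway tree.
rm -rf "$demo_dir"
```

Note that 'user_id' appears three times across the two files but only once in the output.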

The long answer is that I found myself with an interesting dilemma recently: how to find all the PHP session variables set in my application.  Some pieces of the application were new code, some were older legacy code.  I needed the full list of session variables because I had to delete most, but not all, of them for a certain use case ( i.e., the user is still logged in and has some properties, but the other session data could be safely destroyed ).

So I began with some command line greps on Linux.  First I tried:

grep -r \$_SESSION *

This was a decent list, but about 1000 rows long, which is too unwieldy to deal with.  Let's get rid of the filenames; I don't really care where the session values are set for my case.

grep -hr \$_SESSION *

This is a little better, but I don't need the whole line, just the session variable itself.  Let's see if we can start to grab the session var using a regex pattern.

grep -hor "\$_SESSION\[" *

That's getting us closer.  Now, I know ( or assume ) that my variable names will only include upper/lower case letters, numbers, and underscores.  Let's add those to the pattern.  ( you DO speak regex, don't you? )

grep -hor "\$_SESSION\['[A-Za-z0-9_]*'\]" * 

Now we are cooking.  This is a nice list of the session variables ( albeit only one array level deep which is all I needed ).  Now how to remove the duplicates?  Maybe we should "sort" them first?

grep -hor "\$_SESSION\['[A-Za-z0-9_]*'\]" * | sort 

That is really close.  Is there a way to remove duplicates with the "sort" command?  Yes, there is.  Hot dog.

grep -hor "\$_SESSION\['[A-Za-z0-9_]*'\]" * | sort -u

There it is.  That's the final command I used, and it located around 50 session variables across the old legacy code and the new modular code.  The only real drawback to this command is that it will not find nested array values on the session itself, but you could add that as a separate regex if you need it.
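
For the nested case, one possible extension (a sketch, not something I ran against the original code base) is to let the ['key'] part of the pattern repeat.  In basic regex syntax, \( ... \)\{1,\} means "one or more of this group".  The file name cart.php and the keys below are invented for the example, as is the variable name nested_vars.

```shell
# Throwaway file with nested and flat session keys (hypothetical content).
demo_dir=$(mktemp -d)
cat > "$demo_dir/cart.php" <<'EOF'
<?php
$_SESSION['cart']['items']['count'] = 3;
$_SESSION['user']['name'] = 'rich';
$_SESSION['flash'] = 'saved';
EOF

# Same pattern as before, but the ['key'] segment is wrapped in \( \) and
# repeated with \{1,\}, so every level of the array ends up in a single match.
nested_vars=$(cd "$demo_dir" && grep -hor "\$_SESSION\(\['[A-Za-z0-9_]*'\]\)\{1,\}" * | sort -u)
printf '%s\n' "$nested_vars"
# prints:
#   $_SESSION['cart']['items']['count']
#   $_SESSION['flash']
#   $_SESSION['user']['name']

# Clean up the throwaway tree.
rm -rf "$demo_dir"
```

This still assumes single-quoted literal keys; references such as $_SESSION[$key] or $_SESSION["name"] would need their own pattern.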

2 comments:

Christopher Scott said...

ooh didn't know about "sort -u".. haha, all this time i've been piping to "uniq" like an idiot... cool stuff!

Here's one similar for ack users:
ack "\\$\_GET\[[^\s]*\]" -ho | sort -u

notice the extra backslash, i guess 'cause it's perl .. and the negated character class instead of the explicit set like you showed. A lot harder to read :)

Rich Zygler said...

I always forget about uniq.
