IP Address Obfuscation and Modification with sed

I’m working on a logfile analyzer for a honeypot. One of the things
I want to do is copy the report files to a public website so
that others can see them too.

But there are some potential privacy issues involved. Since I’m analyzing
where the attacks are coming from, and reporting on them,
I don’t think I want to share the exact IP address that the ssh probes
came from. So how do I do that? Well, I could use Perl, but the analyzer
is a “Big-Ass Shell Script” so I want to minimize how often I run Perl.
The IP addresses are hidden in other lines of text so I can’t use Awk (that
would be too simple). So I have to use Sed.

For the record, I’m running this on Linux, Fedora Core 20 to be exact.
Thinking this would be easy was a mistake. It took me almost an hour
of mucking about before I had a working sed expression. Once I figured
out I NEEDED to use the “-r” option (which is “use extended regular
expressions in the script”) then things finally started falling into
place.
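
To show what the “-r” option buys you, here is a small sketch comparing the two syntaxes. Without “-r”, sed uses basic regular expressions, where the grouping parentheses and the repeat count each need a backslash. The substitution (swapping the first two octets) is just something I picked for the comparison. Both commands should print 168.92.133.1. GNU sed also accepts “-E” as another spelling of “-r”.

```shell
# Basic regular expressions (no -r): groups and repeat counts need backslashes.
echo 92.168.133.1 | sed 's/\([0-9]\{1,3\}\.\)\([0-9]\{1,3\}\.\)/\2\1/'

# Extended regular expressions (-r): the same substitution with fewer backslashes.
echo 92.168.133.1 | sed -r 's/([0-9]{1,3}\.)([0-9]{1,3}\.)/\2\1/'
```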

The following sed expression returns the IP address just as it came in.


echo 92.168.133.1 |\
sed -r 's/([0-9]{1,3}\.)([0-9]{1,3}\.)([0-9]{1,3}\.)([0-9]{1,3})/\1\2\3\4/'

And the output is

92.168.133.1

And THIS sed expression replaces the second octet with the word “FOO”.
Please note the “.” after the word “FOO”. That’s part of what gets
substituted in.


echo 92.168.133.1 |\
sed -r 's/([0-9]{1,3}\.)([0-9]{1,3}\.)([0-9]{1,3}\.)([0-9]{1,3})/\1FOO.\3\4/'

And the output is

92.FOO.133.1

And in this example, I am reversing the IP address. The fourth remembered
pattern has no “.” of its own, so I add one after the “\4”.


echo 92.168.133.1 |\
sed -r 's/([0-9]{1,3}\.)([0-9]{1,3}\.)([0-9]{1,3}\.)([0-9]{1,3})/\4.\3\2\1/'

And the output is (Please note that we have a trailing “.” at the end of the
IP address. I leave removing the trailing “.” as an exercise for the reader.)

1.133.168.92.
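
For anyone who would rather skip the exercise, one possible approach (a sketch, certainly not the only way) is to move the periods outside the parentheses so that each remembered pattern holds only digits, and then put the periods back by hand in the replacement. This should print 1.133.168.92 with no trailing “.”.

```shell
# Each group remembers only the digits; the dots are matched outside the groups
# and written back explicitly between the reversed octets.
echo 92.168.133.1 |\
sed -r 's/([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})/\4.\3.\2.\1/'
```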

Soooooo, what’s the sed expression really doing? Let’s break it apart
into different lines so it’s easier to understand.

Start sed using extended regular expressions


sed -r

Single quote to start the expression, and “s” says to do a substitution.

's

The start of the search expression.

/

This is the first “remembered” pattern. The open parenthesis and the
close parenthesis mark the start and end of the remembered pattern.
The “[0-9]” means all the characters between 0 and 9. The “{1,3}”
means the PRIOR pattern 1, 2, or 3 times only. This means “x” doesn’t
match, but “1”, “11”, and “111” match. The “\.” means literally a single
period. It’s backslashed to mean a period. Without the backslash, a
single period means “match any single character”.

([0-9]{1,3}\.)
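
A quick way to convince yourself about the backslashed period is to feed sed something that is not an IP address. The input string here is made up for the demonstration. The first command should print MATCH, and the second should print 92x168 unchanged.

```shell
# An unescaped "." matches any single character, so the "x" is accepted.
echo 92x168 | sed -r 's/[0-9]{1,3}.[0-9]{1,3}/MATCH/'

# An escaped "\." matches only a literal period, so nothing is substituted.
echo 92x168 | sed -r 's/[0-9]{1,3}\.[0-9]{1,3}/MATCH/'
```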

This is the second “remembered” pattern.

([0-9]{1,3}\.)

This is the third “remembered” pattern.

([0-9]{1,3}\.)

This is the fourth “remembered” pattern. Please note there is NO trailing
“.” character. The “/” after the close parenthesis ends the search
expression and starts the replacement.

([0-9]{1,3})/

Print the first “remembered” pattern.

\1

Print the second “remembered” pattern.

\2

Print the third “remembered” pattern.

\3

Print the fourth “remembered” pattern.

\4

And finally, a final slash and a single quote to show the end of
the sed expression.

/'

So what can we do with this? In the expression, instead of printing
all four remembered patterns, we can print other things by replacing
the “\#” with something else. So instead of “\1\2\3\4”, we could have
“\1\2\3127” which would print out 92.168.133.127. Patterns are numbered
with a SINGLE digit (1 through 9), so \3127 doesn’t mean the 3,127th pattern. It means
print the third pattern (\3), followed by the text “127”.
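
Putting that to work on the original privacy problem, here is a sketch that keeps the first two octets and masks the last two. The log line is an invented example in the style of an sshd message, not real honeypot output, and the “xxx” placeholder is an arbitrary choice. The trailing “g” tells sed to replace every address on the line rather than only the first one. It should print the line with 92.168.xxx.xxx in place of the address.

```shell
# Keep \1 and \2 (the first two octets, dots included) and replace the rest.
echo "Failed password for root from 92.168.133.1 port 4242 ssh2" |\
sed -r 's/([0-9]{1,3}\.)([0-9]{1,3}\.)[0-9]{1,3}\.[0-9]{1,3}/\1\2xxx.xxx/g'
```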

What are the problems with this expression? Well, it doesn’t explicitly
deal with true IP addresses. A true IP address goes from 0.0.0.0 to
255.255.255.255. This pattern I made goes from 0.0.0.0 to 999.999.999.999.
For what I need to do, this is close enough.
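
If close enough ever stops being enough, a stricter pattern is possible. The sketch below is a variation rather than what the analyzer uses: “octet” is a shell variable introduced just for this example, and “\b” (word boundary) is a GNU sed extension, so this may not work with other versions of sed. With the made-up input shown, it should hide 192.168.1.254 and leave 999.1.1.1 alone.

```shell
# "octet" is a helper variable for this sketch; it matches 0 through 255 only.
octet='(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])'

# The \b on each side stops the pattern from matching the tail of a longer
# number, such as the "99.1.1.1" inside "999.1.1.1".
echo "probe from 192.168.1.254 and from 999.1.1.1" |\
sed -r "s/\b$octet\.$octet\.$octet\.$octet\b/HIDDEN/g"
```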

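Now, I need to obfuscate URLs. The same deal applies. Here the “[^\/]+”
means one or more characters that are not a slash, which is the host name,
and the second remembered pattern is everything after the slash that
follows the host name.


echo "http://www.foo/x" |sed -r 's/(http:\/\/[^\/]+)\/(.+)/http:\/\/HIDDEN\/\2/'

returns

http://HIDDEN/x

And I did it again with FTP.


echo "ftp://www.foo/x" |sed -r 's/(ftp:\/\/[^\/]+)\/(.+)/ftp:\/\/HIDDEN\/\2/'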

returns

ftp://HIDDEN/x
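
Since the HTTP and FTP expressions differ only in the scheme, they can probably be folded into one. The sketch below is a variation, not the expression used in the analyzer. It uses “#” as the delimiter so the slashes need no backslashes, remembers the scheme with “(https?|ftp)”, and matches the host name as a run of characters that are neither a slash nor a space. The sample command line and host names are made up. It should print: wget http://HIDDEN/a/b.sh; curl ftp://HIDDEN/x

```shell
# \1 puts back whichever scheme matched; only the host name is replaced,
# so the path after it passes through untouched.
echo "wget http://www.foo/a/b.sh; curl ftp://bar.example/x" |\
sed -r 's#(https?|ftp)://[^/ ]+#\1://HIDDEN#g'
```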

Thanks to http://www.grymoire.com/Unix/Sed.html which was a great
help in remembering how to use sed.
