Getting to Know Your Network -- Part III
Luis Enrique Muñoz
As you may remember, in the previous parts of this series, I presented
a tool called aconfig that allows sys admins to execute combinations
of configuration commands and Perl code -- ascripts -- on network
devices. I showed how to use this to harvest the version and configuration
information from network devices, and then place the relevant parts
of this information into a relational database.
This results in a very nice inventory, which can be updated automatically
using cron. But this is not a complete view of your network, by
far. Networks interconnect many things like desktops, laptops, printers,
servers, etc. Let's call these things "endpoints". In many networks,
most endpoints have dynamic addressing, and some even move among
network segments, such as when a laptop is unplugged from one office
and plugged into another, or when a device associates itself with
a rogue access point.
Capturing Address Data
Our strategy in this article will be to "sample" the ARP tables
on all of the network devices to see what we can find out. This
will show us which endpoints have been talking to various devices
and tell us their physical address (i.e., MAC or Ethernet address),
as well as their current IP address. Astute or paranoid readers
will know that MAC addresses can be changed. One way to do this
is to change the network card. Another way is through software,
instructing the hardware to use a different address. In my experience,
this is only an issue in incidents where a malicious individual
is trying to wreak havoc in your network and, by the time that individual
has gained a position where he or she can spoof a MAC address, you
are in much deeper trouble.
We will undertake all of this with the help of a nice ascript,
aptly called ascripts/arp-capture, reproduced below:
$ cat ascripts/arp-capture
%INCLUDE ascripts/include/setup%
%INCLUDE ascripts/include/save-arp%
The juicy parts are in ascripts/include/save-arp, which is heavily
based on the includes written in the previous article, with the difference
that now we add a time stamp to the name of the file. This will come
in handy later if we want to do command-line manipulations with the
resulting files:
$ cat ascripts/include/save-arp
show arp
%{
package save::arp;
use POSIX qw(strftime);
use IO::File;
my $fh = new IO::File './output/' . $main::ADDR . '-' .
strftime("%Y%m%d%H%M%S", localtime) . '.arp', "w";
die "Failed to create output file: $!\n"
unless $fh;
print $fh $main::LAST;
close $fh;
undef;
}%
We should set this to run every few hours under cron. Keep in mind
that for very large networks, you may want to split the list of devices
and run an instance of aconfig for each chunk. This allows for quick
parallelization of the task, with minimal impact on the network. To
run a single instance from the command line, we issue the now familiar
command:
$ ./aconfig -c ascripts/arp-capture aconfig.hosts
As this command runs, files with names like output/10.64.106.129-20051020114846.arp
will be created, with contents that look like:
Protocol Address Age (min) Hardware Addr Type Interface
Internet 10.2.237.101 163 0011.2537.e368 ARPA FastEthernet0
Internet 10.2.237.100 58 00e0.7d42.66ce ARPA FastEthernet0
Internet 10.2.237.103 151 0000.8642.f43e ARPA FastEthernet0
...
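For the very large networks mentioned earlier, splitting the device list could be sketched as follows. This is only an illustration: it assumes that the hosts file holds one device per line, and the names split_hosts and write_chunks are hypothetical helpers, not part of aconfig.

```perl
#!/usr/bin/perl
# Sketch only: split a host list into N chunks so that one aconfig
# instance can be started per chunk. The one-host-per-line layout is
# an assumption about the hosts file, not something taken from aconfig.
use strict;
use warnings;

# Deal hosts round-robin into $n lists; returns a list of array refs.
sub split_hosts {
    my ($n, @hosts) = @_;
    die "chunk count must be positive\n" unless $n > 0;
    my @chunks = map { [] } 1 .. $n;
    my $i = 0;
    push @{ $chunks[ $i++ % $n ] }, $_ for @hosts;
    return grep { @$_ } @chunks;    # drop chunks that stayed empty
}

# Write each chunk to "<prefix>.<index>" and return the file names.
sub write_chunks {
    my ($prefix, @chunks) = @_;
    my @files;
    for my $idx (0 .. $#chunks) {
        my $file = "$prefix.$idx";
        open my $fh, '>', $file or die "Cannot write $file: $!\n";
        print {$fh} "$_\n" for @{ $chunks[$idx] };
        close $fh or die "Cannot close $file: $!\n";
        push @files, $file;
    }
    return @files;
}

# When run as a script: split-hosts N hosts-file
if (!caller && @ARGV == 2) {
    my ($n, $hosts_file) = @ARGV;
    open my $in, '<', $hosts_file or die "Cannot read $hosts_file: $!\n";
    chomp(my @hosts = grep { /\S/ } <$in>);    # skip blank lines
    close $in;
    print "$_\n"
        for write_chunks("$hosts_file.part", split_hosts($n, @hosts));
}
```

Each resulting file could then be handed to its own aconfig instance in place of aconfig.hosts. Round-robin dealing keeps the chunks within one host of each other in size, which matters more than preserving the original order.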
The next step is to add this information to the database. Each endpoint,
recognized by the MAC address, will be added to the "endpoint" table.
We will record which IP address was assigned at the time, through
the "assignment" table, which will also tell us in which subnet we
are seeing the endpoint. We will also record the sighting on the device
on a given interface through the "sighting" table.
To do this, we will use the "scripts/arp2db" script (Listing 1),
which parses the .arp files and adds data to our database. This
includes a nice touch: a mapping between MAC or Ethernet addresses
and device vendors, which helps you locate specific brands of network
cards. This is useful, again, to detect rogue access points or WiFi
network cards that are signaling either a rogue access point in
the subnet or interface where this card was seen, or a machine acting
as a bridge between its wireless card and the wired card. I will
not cover all the details of arp2db, as it is quite similar to config2db.
However, let's look at the main part of arp2db to get a better
understanding of how to use this database schema and how to load
information into it. The main action happens in a loop between
lines 83 and 156, where all the lines in the ".arp" file are read.
Line 86 splits the columns of each line. Lines 89 and 90 make sure
we skip lines that do not convey information we want, such as the
report headings, lines with a different structure, lines that contain
an incomplete ARP, etc.:
83: while (my $line = <>)
84: {
85: chomp $line;
86: my ($proto, $ip, $age, $mac, $class, $if) =
      split(/\s+/, $line, 6);
87: my $vendor = undef;
88:
89: next unless $age eq '-' or $age =~ /\d+/;
90: next if $mac eq 'Incomplete';
91:
...
156: }
Next, lines 92 and 93 normalize the format used to store the MAC address.
We remove separator symbols that may be used and convert the string
to lowercase. Note that this could also be done in the Class::DBI-derived
classes placed in MyConfigCDBI.pm, using the "deflate" facilities
provided by Class::DBI:
92: $mac =~ tr/-:. //ds;
93: $mac = lc $mac;
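To see the splitting, filtering, and normalization steps working together outside of arp2db, here is a minimal standalone sketch. The sub name parse_arp_line and the hash it returns are my own choices for illustration; arp2db itself works inline inside its loop.

```perl
#!/usr/bin/perl
# Sketch only: parse one line of "show arp" output the way lines 86-93
# of arp2db do, but as a standalone sub (hypothetical name).
use strict;
use warnings;

# Returns a hash ref for a usable entry, or undef for lines to skip.
sub parse_arp_line {
    my ($line) = @_;
    chomp $line;
    $line =~ s/^\s+//;
    my ($proto, $ip, $age, $mac, $class, $if) = split /\s+/, $line, 6;

    return unless defined $mac;                     # too few columns
    return unless $age eq '-' or $age =~ /^\d+$/;   # headings and the like
    return if $mac eq 'Incomplete';                 # unresolved ARP entry
    return unless defined $if;

    $mac =~ tr/-:. //d;                             # drop separators
    return { ip => $ip, age => $age, mac => lc $mac, interface => lc $if };
}

my $entry = parse_arp_line(
    'Internet 10.2.237.101 163 0011.2537.e368 ARPA FastEthernet0');
print "$entry->{mac}\n";    # prints 00112537e368
```

Anchoring the age test with ^ and $ is slightly stricter than the check in the listing, so a stray digit elsewhere in the column cannot let a heading through.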
Line 95 converts the IP address associated with the entry we just
read from our input into a NetAddr::IP object for easy conversion.
Line 97 determines the correct device name to use in this case, based
on the filename of the file we're reading. This technique is simple
although fragile. Another, possibly more robust, approach would be
to have the ascript write the $main::ADDR variable to the .arp file
and then parse it in this script.
In any case, if it becomes impossible to determine which device
the entry is taken from, the entry is skipped at lines 99 through
103:
95: $ip = NetAddr::IP->new($ip);
97: my ($devif, $time) = get_device_from_file($ARGV);
99: unless ($devif)
100: {
101: warn "$ARGV cannot be mapped to an interface. Skipping\n";
102: next;
103: }
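The filename-based part of this lookup could look roughly like the sketch below. It only recovers the address and time stamp that save-arp encoded in the file name. The real get_device_from_file also maps the address to a device interface in the database, which is not reproduced here, and the name parse_arp_filename is hypothetical.

```perl
#!/usr/bin/perl
# Sketch only: recover the device address and capture time from a file
# name such as output/10.64.106.129-20051020114846.arp.
use strict;
use warnings;
use File::Basename qw(basename);
use Time::Local qw(timelocal);

# Returns (address, epoch seconds), or an empty list if the name does
# not follow the <address>-<YYYYMMDDhhmmss>.arp pattern.
sub parse_arp_filename {
    my ($path) = @_;
    my $name = basename($path);
    my ($addr, $y, $mo, $d, $h, $mi, $s) =
        $name =~ /^(.+)-(\d{4})(\d\d)(\d\d)(\d\d)(\d\d)(\d\d)\.arp$/
        or return;

    # save-arp used localtime for the stamp, so convert back with
    # timelocal; eval catches impossible dates such as month 13.
    my $epoch = eval { timelocal($s, $mi, $h, $d, $mo - 1, $y) };
    return unless defined $epoch;
    return ($addr, $epoch);
}

my ($addr, $epoch) =
    parse_arp_filename('./output/10.64.106.129-20051020114846.arp');
print "$addr captured at ", scalar localtime($epoch), "\n";
```

This also illustrates why the approach is fragile: renaming a file, or moving it between machines in different time zones, silently changes what the script believes about the capture.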
Lines 105-109 are responsible for finding a vendor code in the %ethers
translation table and printing a rudimentary progress indicator:
105: my $submac = substr($mac, 0, 6);
106: $vendor = $ethers{$submac} if exists $ethers{$submac};
107:
108: print join(' ', $ARGV, $devif->device, $devif->interface,
109: $if, $mac, $time), "\n";
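How the %ethers table gets filled is not shown in this excerpt. One possible approach, sketched below, is to read a vendor list laid out like the "(hex)" lines of the IEEE OUI registry file. That input format is an assumption on my part, the sub names are hypothetical, and the vendor names in the sample data are placeholders.

```perl
#!/usr/bin/perl
# Sketch only: build a prefix-to-vendor table from lines such as
#   00-AA-BB   (hex)    Example Vendor One
# The input layout is assumed; arp2db may load its table differently.
use strict;
use warnings;

# Read an open filehandle and return a hash ref keyed by the first six
# hex digits of the MAC address, in lowercase.
sub load_ethers {
    my ($fh) = @_;
    my %ethers;
    while (my $line = <$fh>) {
        my ($a, $b, $c, $vendor) = $line =~
            /^\s*([0-9A-Fa-f]{2})-([0-9A-Fa-f]{2})-([0-9A-Fa-f]{2})\s+\(hex\)\s+(.*\S)/
            or next;
        $ethers{ lc "$a$b$c" } = $vendor;
    }
    return \%ethers;
}

# Look up the vendor for a MAC address in any common notation.
sub vendor_for {
    my ($ethers, $mac) = @_;
    (my $norm = lc $mac) =~ tr/-:. //d;
    return $ethers->{ substr $norm, 0, 6 };
}

# Placeholder data, not real registry entries.
my $sample = "00-AA-BB   (hex)\t\tExample Vendor One\n";
open my $fh, '<', \$sample or die "Cannot open sample: $!\n";
my $ethers = load_ethers($fh);
print vendor_for($ethers, '00aa.bb12.3456'), "\n";
```

Keeping the keys in the same normalized form as the MAC addresses stored in the database means the lookup needs no further conversion.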
Lines 111 through 121 ensure that we create the endpoint or update
the endpoint access time, so that it becomes possible to purge endpoints
that have not been seen in a long time. We can also detect fresh endpoints,
possibly visitors plugging equipment into the corporate network, or
new devices:
111: my $ep = MyConfig::CDBI::Endpoint->find_or_create
112: (
113: endpoint => $mac,
114: ($vendor ? (vendor => $vendor) : ())
115: );
116:
117: unless ($ep->time)
118: {
119: $ep->time($time);
120: $ep->update;
121: }
Lines 123 through 129 record the sighting of the endpoint in the interface
if this record does not already exist:
123: my $sighting = MyConfig::CDBI::Sighting->find_or_create
124: (
125: endpoint => $ep->endpoint,
126: time => $time,
127: device => $devif->device,
128: interface => lc $if,
129: );
Lines 131 through 147 find which subnets contain the assigned IP address
of this endpoint. Normally, only one subnet should match. Warnings
are produced if no subnet in the database matches the endpoint, meaning
that there are devices we're not seeing. If more than one subnet matches,
it likely means that some devices have an incorrect netmask configured
in one of the interfaces:
131: my @subnet = MyConfig::CDBI::Subnet->search_where
132: ({
133: first => { '<=', scalar $ip->numeric },
134: last => { '>=', scalar $ip->numeric },
135: },
136: { logic => 'AND' }
137: );
138:
139: if (@subnet == 0)
140: {
141: warn "$mac ($ip) matches no known subnet(!)\n";
142: }
143: elsif (@subnet > 1)
144: {
145: warn "$mac ($ip) matches more than one subnet(!)\n";
146: warn ' ' . $_->cidr . "\n" for @subnet;
147: }
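The range test behind this query can be tried without a database. The sketch below uses only core Perl, computes the first and last address of each subnet from its CIDR notation, and reports which subnets contain a given address. The sub names are hypothetical, and whether the subnet table stores the network and broadcast addresses as "first" and "last" is an assumption here.

```perl
#!/usr/bin/perl
# Sketch only: the "first <= ip AND last >= ip" test from lines 131-137,
# done in memory with core Perl instead of NetAddr::IP and the database.
use strict;
use warnings;

# Dotted quad to 32-bit integer.
sub ip_to_int {
    my ($ip) = @_;
    my @octets = $ip =~ /^(\d+)\.(\d+)\.(\d+)\.(\d+)$/
        or die "Bad IPv4 address: $ip\n";
    $_ > 255 and die "Bad IPv4 address: $ip\n" for @octets;
    return unpack 'N', pack 'C4', @octets;
}

# "a.b.c.d/len" to (first, last) as integers, both ends inclusive.
sub cidr_range {
    my ($cidr) = @_;
    my ($net, $bits) = split m{/}, $cidr;
    die "Bad CIDR: $cidr\n"
        unless defined $bits && $bits =~ /^\d+$/ && $bits <= 32;
    my $mask  = $bits == 0 ? 0 : (0xFFFFFFFF << (32 - $bits)) & 0xFFFFFFFF;
    my $first = ip_to_int($net) & $mask;
    my $last  = $first | (~$mask & 0xFFFFFFFF);
    return ($first, $last);
}

# Return every CIDR block in the list that contains the address.
sub matching_subnets {
    my ($ip, @cidrs) = @_;
    my $n = ip_to_int($ip);
    return grep {
        my ($first, $last) = cidr_range($_);
        $first <= $n && $n <= $last;
    } @cidrs;
}

my @hits = matching_subnets('10.2.237.101',
                            '10.2.237.0/24', '10.2.0.0/16', '192.168.1.0/24');
print "$_\n" for @hits;
```

Two matches, as in this example, correspond to the "more than one subnet" warning: a /16 and a /24 that overlap usually mean one interface carries the wrong netmask.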
Finally, the assignment is recorded in the database:
149: my $assignment = MyConfig::CDBI::Assignment->find_or_create
150: (
151: ip => scalar $ip->numeric,
152: time => $time,
153: endpoint => $ep->endpoint,
154: (@subnet > 0 ? (cidr => $subnet[0]->cidr) : ()),
155: );
To run it, you could do something like this:
$ find ./output -type f -name '*-20051020??????.arp' \
| xargs ./scripts/arp2db
./output/10.64.106.129-20051020110654.arp CT-TC-ZUL fastethernet0 ...
./output/10.64.106.129-20051020110654.arp CT-TC-ZUL fastethernet0 ...
./output/10.64.106.129-20051020110654.arp CT-TC-ZUL fastethernet0 ...
./output/10.64.106.129-20051020110654.arp CT-TC-ZUL fastethernet0 ...
...
You can also use more complex incantations of the find command
to process only the most recent ".arp" files, for example. You will also
need to clean up the directory, removing files that are more than
a few days old. This keeps your space consumption relatively constant.
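If you would rather do the cleanup from Perl than from find, a sketch along these lines could serve. It only lists candidates, so nothing is deleted until you pass the result to unlink yourself. The sub name old_arp_files and the three-day threshold are arbitrary choices for illustration.

```perl
#!/usr/bin/perl
# Sketch only: list ".arp" files in a directory whose modification time
# is more than a given number of days in the past.
use strict;
use warnings;

sub old_arp_files {
    my ($dir, $days) = @_;
    my $cutoff = time - $days * 24 * 60 * 60;
    opendir my $dh, $dir or die "Cannot open $dir: $!\n";
    my @old = grep { -f $_ && (stat _)[9] < $cutoff }   # plain file, old mtime
              map  { "$dir/$_" }
              grep { /\.arp$/ } readdir $dh;
    closedir $dh;
    return sort @old;
}

# When run as a script: report what a cleanup would remove.
if (!caller && @ARGV) {
    my ($dir, $days) = (@ARGV, 3);    # days defaults to 3
    print "$_\n" for old_arp_files($dir, $days);
}
```

Once the list looks right, a line such as "unlink old_arp_files('./output', 3);" would perform the actual removal. Make sure arp2db has already processed the files before they disappear.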
All this new information can give you a few new tricks. Let's
start with a pet peeve of mine and pretend that you would like to
find rogue or hidden access points in your network. Now you could
simply find vendors that are likely to represent a wireless network
card, such as:
sqlite> select count(*) from endpoint where vendor
...> like '%Linksys%' or vendor like '%D-link%'
...> or vendor like '%Netgear%';
15
Remember that you may not see the access point itself. Likely, it
will be set up as a bridge, which makes it invisible in layers 2 and
3 with normal techniques. But, in this case, you'll still see hosts
attached to the access point.
We could also ask where in our network we've seen those endpoints.
This might give us an idea of where to look, because every device
that has seen the endpoint recently should remember it in its ARP
table. But, don't celebrate just yet. Because multiple sightings
are kept, using the "time" column to record when they took place,
we'll filter by using a Perl one-liner to calculate the correct
Unix time for us:
$ perl -MDate::Parse -e 'print str2time("Oct 20, 2005"), "\n"'
1129780800
$ sqlite3 config.db
sqlite> select distinct s.device, s.interface from endpoint e,
...> sighting s where (e.vendor like '%Linksys%' or e.vendor
...> like '%D-link%' or e.vendor like '%Netgear%') and
...> s.endpoint = e.endpoint and s.time > 1129780800;
AP-01-PB-ESTE-BLDG1|bvi1
SW-06-P3-BLDG1|vlan1
SW-07-P3-BLDG1|vlan1
SW-08-P3-BLDG1|vlan1
SW-09-P3-BLDG1|vlan1
SW-10-P3-BLDG1|vlan1
SW-11-P3-BLDG1|vlan1
...
In our case, AP-01-PB-ESTE-BLDG1 is a legitimate access point: we
were able to log in and retrieve its configuration file, its name follows
our standard scheme, and it is within our inventory:
sqlite> select hardware, software from device where
...> device='AP-01-PB-ESTE-BLDG1';
cisco AIR-AP1231G-A-K9 ...|IOS (tm) C1200 Software (C1200-K9W7-M)
However, it seems that something fishy is going on at Building 1,
where our switches are picking up suspicious ARP entries on the third
floor (that's what P3-BLDG1 stands for). Just remember that we're
simply looking at the vendor code of the network card. A vendor can
make wireless and wired network cards that will share the same vendor
code, so further research is required. However, this tool gives you
a healthy start because it quickly tells you where to start looking.
Adding nmap
To improve our knowledge of the network, we will use nmap, an
excellent tool to perform network scanning, to probe the endpoints
we've found. I'll assume that you either have that tool installed
or know how to do it yourself. In this case, let's go ahead and
install Nmap::Scanner, a Perl module that allows us to drive nmap
from a Perl script:
$ perl -MCPAN -e shell
cpan> install Nmap::Scanner
...
After installing this module, we'll need a script to automate the
scanning for us. Let's call it scripts/nmap2db and, as usual, I'll
describe its most important parts. You can download the complete script
(Listing 2) from the Sys Admin Web site (http://www.sysadminmag.com):
12: use constant HOSTS => 100;
14: MyConfig::CDBI->connection('dbi:SQLite:dbname=config.db');
16: my @eps = MyConfig::CDBI::Endpoint->search_where(
17: { os => \ "IS NULL" },
18: );
At line 12, we tell the script how many addresses we want to scan
with nmap. This number should be small enough that it does not take
an unacceptable amount of time. With 100 hosts, it takes my laptop
about 8 minutes to do a scan. Line 14 sets up the connection to the
database, as customary. Lines 16 to 18 fetch the list of endpoints
that have a NULL os column:
20: my %list = ();
21: for my $ep (@eps)
22: {
23: my @addr = MyConfig::CDBI::Assignment->search_where(
24: { endpoint => { '==' => $ep->endpoint } },
25: { order_by => 'time DESC' }
26: );
27: next unless @addr;
28: $list{NetAddr::IP->new($addr[0]->ip)->addr} = $ep;
29: last if keys %list == HOSTS;
30: }
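The selection logic of lines 20 to 30 can be tried without a database. The sketch below uses plain hashes in place of the Class::DBI objects, so the field names and the sub name latest_targets are illustrative assumptions, and the addresses are documentation-range placeholders.

```perl
#!/usr/bin/perl
# Sketch only: pair each endpoint with its most recent IP assignment and
# stop once enough scan targets have been collected.
use strict;
use warnings;
use constant HOSTS => 100;

# $endpoints:   array ref of MAC strings
# $assignments: array ref of { endpoint, ip, time } hash refs
# Returns a hash ref mapping IP address to endpoint.
sub latest_targets {
    my ($endpoints, $assignments, $limit) = @_;

    # Keep only the newest assignment seen for every endpoint.
    my %latest;
    for my $row (@$assignments) {
        my $seen = $latest{ $row->{endpoint} };
        $latest{ $row->{endpoint} } = $row
            if !$seen || $row->{time} > $seen->{time};
    }

    my %list;
    for my $ep (@$endpoints) {
        my $row = $latest{$ep} or next;    # never assigned an address
        $list{ $row->{ip} } = $ep;
        last if keys %list >= $limit;
    }
    return \%list;
}

my $targets = latest_targets(
    [ '00aabb000001', '00aabb000002' ],
    [ { endpoint => '00aabb000001', ip => '192.0.2.10', time => 100 },
      { endpoint => '00aabb000001', ip => '192.0.2.20', time => 200 } ],
    HOSTS,
);
print "$_ => $targets->{$_}\n" for sort keys %$targets;
```

Keying the list by IP address is what later lets the scan results, which nmap reports per address, be matched back to an endpoint.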
Lines 20 to 30 build a list that links each endpoint found previously
with its most recent IP assignment. We use the search_where method
provided by Class::DBI::AbstractSearch in MyConfigCDBI.pm:
32: my $scanner = new Nmap::Scanner;
33: $scanner->tcp_syn_scan(1);
34: $scanner->add_scan_port('80,25,135,137,139,445,443,110,113,53');
35: $scanner->guess_os(1);
36: $scanner->max_rtt_timeout(200);
37: $scanner->add_target($_) for keys %list;
38: my $r = $scanner->scan;
Lines 32 to 37 configure our scanner. You may want to adjust these
parameters to suit your own network, especially the list
of ports to scan. Line 38 starts the scan for all the hosts found
in previous steps:
39: my $hl = $r->get_host_list();
41: while (my $h = $hl->get_next)
42: {
43: my $os = join(', ',
44: map { join '/', grep { defined $_ } $_->type, $_->vendor,
45: $_->osgen, $_->osfamily, $_->accuracy . '%' }
46: $h->os->osclasses);
47: for my $a ($h->addresses)
48: {
49: next unless exists $list{$a->addr};
50: $list{$a->addr}->os($os);
51: $list{$a->addr}->update;
52: last;
53: }
54: }
Line 39 gets the list of scanned hosts, which is iterated through
at lines 41 to 54. These lines simply compose a very verbose value
for the os column and update the endpoint entry in the database.
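To get a feel for the string stored in the os column, the composition in lines 43 to 46 can be reproduced with plain hashes standing in for the OS class objects that nmap returns. The field values below are made up for the example, and describe_osclasses is a hypothetical name.

```perl
#!/usr/bin/perl
# Sketch only: compose the verbose OS description from a list of OS
# classes, each represented here by a plain hash ref.
use strict;
use warnings;

sub describe_osclasses {
    my (@classes) = @_;
    return join ', ', map {
        my $c = $_;
        # Undefined fields are dropped; the accuracy is always kept.
        join '/', grep { defined $_ }
            @$c{qw(type vendor osgen osfamily)}, "$c->{accuracy}%";
    } @classes;
}

# Illustrative values only, not real scan results.
print describe_osclasses(
    { type => 'general purpose', vendor => 'Microsoft',
      osgen => 'XP', osfamily => 'Windows', accuracy => 90 },
    { type => 'router', vendor => 'Cisco',
      osgen => undef, osfamily => 'IOS', accuracy => 85 },
), "\n";
# prints general purpose/Microsoft/XP/Windows/90%, router/Cisco/IOS/85%
```

Because every guess is concatenated, a host that nmap cannot pin down produces a long value. That is useful for LIKE queries, but less so for exact matching.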
With this, you could leave this script running within a loop like
the following. Note that you will need root privileges so that nmap
can craft the packets it needs for guessing the remote operating
system of the device:
$ su -
Password:
# while true; do ./scripts/nmap2db; sleep 10; done
Note that the endpoints are always returned in the same order. Eventually,
all the returned endpoints will be the ones that do not respond to
pings or that cannot be scanned. This means that periodically, you
will need to erase the old, unreachable endpoints. That is why the
time column is there.
A very simple workaround for this is to throw in a Schwartzian
Transform at line 21, so that it reads:
21: for my $ep (map { $_->[1] } sort { $a->[0] <=> $b->[0] }
map { [rand, $_] } @eps)
With this change, the returned rows will be in random order, minimizing
the issue.
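The decorate, sort, and undecorate steps are easier to see in isolation. The sketch below wraps them in a sub with the hypothetical name random_order. For comparison, it also uses shuffle from the core List::Util module, which gives the same effect in a single call and is the alternative I would reach for today.

```perl
#!/usr/bin/perl
# Sketch only: two ways of visiting a list in random order.
use strict;
use warnings;
use List::Util qw(shuffle);

# Decorate each item with a random key, sort on the key, undecorate.
sub random_order {
    return map  { $_->[1] }
           sort { $a->[0] <=> $b->[0] }
           map  { [ rand(), $_ ] } @_;
}

my @eps = qw(ep1 ep2 ep3 ep4 ep5);
print join(' ', random_order(@eps)), "\n";   # order varies between runs
print join(' ', shuffle(@eps)), "\n";        # same idea, one core call
```

Either way, every run sees the same endpoints in a different order, so unreachable hosts no longer monopolize the first HOSTS slots.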
Conclusion
Now that we have these tools at our disposal, we can write much
more interesting queries that tell us with greater precision
what the potential impact of a vulnerability is. In the next part
of this series, I will show how to put all this information to
use as we generate maps and diagrams of the network.
Luis has been working in various areas of computer science
since the late 1980s. Some people blame him for conspiring to bring
the Internet into his home country, where currently he spends most
of his time teaching others about Perl and taking care of network
security at the largest ISP there as its CISO. He also believes
that being a sys admin is supposed to be fun.