Article

Getting to Know Your Network -- Part IV

Luis Muñoz

Previously in this series, I presented aconfig, a tool that allows the execution of configuration commands mixed with Perl in our network devices. I showed how to use this tool to extract information about the network topology and configuration and store it into a database for simplified querying and reporting. This, in itself, is a valuable addition to incident response and vulnerability management processes, which eases the task of determining the significance of daily threats to our network.

In this final article of the series, I will show how to put those tools to work in a slightly different way, by creating visual representations of our network. These representations are very useful because they make it much easier to understand how the different network elements are connected.

The kind of network diagram we want can be thought of as an undirected graph. As it happens, there is an excellent tool, called GraphViz, that allows the description of those graphs. That same tool can then lay out the nodes, representing devices, and the edges, representing the different connections among devices. To learn more about GraphViz, you can take a look at its official page:

http://www.research.att.com/sw/tools/graphviz/

If you have not already installed GraphViz, go ahead and do so.

Because it is very popular, GraphViz has a number of add-on tools that process data and produce the graph descriptions it requires. Following this trend, we need a tool to produce that description from the database we created earlier in our article. I will present this tool, which is called "db2dot" (see Listing 1). The idea is to pass the db2dot script some parameters that we'll use to query the network database and produce the "dot file", which is the description of the graph that GraphViz can produce for us.

For simplicity, we'll be using a Perl interface to GraphViz, which will automatically generate the dot file through a few simple commands. To install this interface, let's use the CPAN to install the required module, as follows:

# perl -MCPAN -e shell
cpan> install GraphViz
...

Our script will essentially perform the various classes of queries that we need to find devices, interfaces, subnets, endpoints, network assignments, and device sightings from the database. The retrieved data (in the form of objects, thanks to the abstraction provided by Class::DBI) will enclose all the relevant information. These objects will be kept in hashes as the querying proceeds, to keep a single copy of each object in memory. After this process is completed, the hashes will be converted to GraphViz directives that will, in turn, produce the expected dot file.

Let's take a look at the relevant parts of the script. As usual, you can download the complete listings (db2dot and nodes.pl) from the Sys Admin Web site at: http://www.sysadminmag.com.

Lines 47 and 48 of the db2dot script set up both our GraphViz object and the connection to the database. Note that the data for initialization of GraphViz comes into an array that is defined in another file. This allows for the separation of the formatting information from the actual code, making it smaller and more compact:

47: my $g = GraphViz->new(%graph_options);
48: MyConfig::CDBI->connection('dbi:SQLite:dbname=config.db');

On lines 87 to 102, we verify whether there's a condition to match network devices (given with the -d option). In this case, we retrieve all network devices using the ->retrieve_all method provided by Class::DBI, so that we can apply Perl's regular expressions to the device names.

In networks with many devices or when your database is very slow, you might want to give up the power provided by the Perl regular expressions, replacing the call to ->retrieve_all with the use of ->search_like, which uses SQL's LIKE clause.

Lines 95 through 100 take care of fetching the interfaces on the device and keep copies if they match the specifications given with -i and -I. When no specifications are given, all interfaces must be included.

Note the use of various _store_* functions through the code. Those are utility functions that simply store the passed value into the appropriate hash. We will also use a function called _ifname, which returns the name of the corresponding interface. This could also be achieved through the serialization method of classes such as MyConfig::CDBI::Interface or providing specialized functions at the corresponding classes. I chose this approach, because the name used is only seen within the script and represents no common convention:

 87: if ($opt_d)
 88: {
 89:     for my $dev (MyConfig::CDBI::Device->retrieve_all)
 90:     {
 91:         next unless $dev->device =~ m/$opt_d/i;
 92:         if ($opt_D) { next if $dev->device =~ m/$opt_D/i }
 93:         _store_dev($dev);
 94:
 95:         for my $ifs ($dev->interfaces)
 96:         {
 97:             if ($opt_i) { next unless $ifs->interface =~ m/$opt_i/i }
 98:             if ($opt_I) { next if $ifs->interface =~ m/$opt_I/i }
 99:             _store_int($ifs, $opt_3);
100:         }
101:     }
102: }

Lines 104 through 134 do a similar thing for the endpoints. Endpoints can be filtered by the use of -e, -E, -s, -S, -v, or -V to match the endpoint address, operating system guess, or address vendor code, respectively. The lowercase version of the option is a positive match, which is true when the match succeeds. The uppercase version of the option is negative, being true if the match fails:

104: if ($opt_e or $opt_s or $opt_v)
105: {
106:   for my $ep (MyConfig::CDBI::Endpoint->retrieve_all)
107:   {
108:     if ($opt_e) { next unless $ep->endpoint =~ m/$opt_e/i }
109:     if ($opt_s) { next unless $ep->os and $ep->os =~ m/$opt_s/i }
110:     if ($opt_v) { next unless $ep->vendor and $ep->vendor =~ m/$opt_v/i }
111:     if ($opt_E) { next if $ep->endpoint =~ m/$opt_E/i }
112:     if ($opt_S) { next if $ep->os and $ep->os =~ m/$opt_S/i }
113:     if ($opt_V) { next if $ep->vendor and $ep->vendor =~ m/$opt_V/i }
114:
115:     _store_ep($ep);

Lines 117 to 126 go through the sightings of each endpoint in the known interfaces within our database. For each of those, we make sure that the device and the interface are stored in the local hashes for representation in the resulting graph.

Note that we fetch the sightings ordered by time in descending order. Then we build our %sightings hash so that the first occurrence of a sighting for this device in this interface -- the most recent sighting -- records the associated time. Then, we count the number of sightings in that interface. With nomad endpoints, this will produce a graph with an edge for each interface where the endpoint has been seen, whose label will include the number of times seen there and the timestamp of the latest sighting:

117:   for my $sight ($ep->all_sightings({order_by => 'time DESC'}))
118:   {
119:     $sightings{$ep}->{_ifname $sight} ||= [ $sight, 0 ];
120:     $sightings{$ep}->{_ifname $sight}->[1]++;
121:     _store_dev($sight->device);
122:     _store_int(MyConfig::CDBI::Interface->retrieve(
123:                    interface => $sight->interface,
124:                    device => $sight->device,
125:                ), 0);
126:   }

Lines 128 through 134 iterate through the address assignments for each endpoint in a similar way to the sightings:

128:   for my $assign ($ep->all_assignments({order_by => 'time DESC'}))
129:   {
130:      $assignments{$ep}->{$assign->ip} ||= [ $assign, 0 ];
131:      $assignments{$ep}->{$assign->ip}->[1]++;
132:   }

Lines 136 to 162 verify and fetch the required subnets from the database, unless db2dot has been instructed to skip subnets with -n or -N. This process helps detect equipment that is not being seen. For instance, having an ARP entry for an endpoint in a unknown subnet might indicate that the router supporting this subnet is not being explored by our scripts. This might also indicate more serious issues, but in any case this will give you a warning.

Lines 140 through 150 iterate through all the interfaces found earlier in our script, stored in the %int hash. For each, the subnet is fetched from the database, or a warn() is sent to alert you of the problem:

140:  for my $if (keys %int)
141:  {
142:     for my $a (@{$int{$if}->{addr}})
143:     {
144:       my $sn = $a->cidr;
145:       if ($sn) { $subnets{$sn->cidr} = $sn }
146:       else {
147:         warn "Missing or unknown subnet from $if...\n";
148:       }
149:     }
150:  }

Lines 153 to 162 iterate through the assignments for all endpoints in the %assignments hash. For each assignment, the corresponding subnet is fetched from the database. Again, lines 150 and 160 emit a warning if such subnet does not exist.

This code might be faster if, instead of fetching from the database, we used a hash as a cache for the subnets. This could easily be performed with Tie::NetAddr::IP from CPAN:

153:  for my $ep (keys %assignments)
154:  {
155:     for my $ip (keys %{$assignments{$ep}})
156:     {
157:        my $sn = $assignments{$ep}->{$ip}->[0]->cidr;
158:        if ($sn) { $subnets{$sn->cidr} = $sn }
159:        else { warn "Missing or unknown subnet for ",
160:               $assignments{$ep}->{$ip}->[0], "...\n" }
161:     }
162:  }

Finally, lines 164 and 165 emit the edges between the node representing either an interface or an endpoint, and the corresponding subnet. Note that the parameters with which we create the node are again stored in a hash called %subnet_node. This allows for the separation between the code in db2dot and the presentation:

164:     $g->add_node($_, label => $_, %subnet_node)
165:         for keys %subnets;

Lines 168 through 222 iterate through the different hashes we've built and emit the corresponding nodes or edges using GraphViz methods, as seen before. Lines 191 to 207 are an example of this.

Next, the script iterates through all the found endpoints in the %eps hash. For each one, its corresponding node is added with ->add_node at lines 193 and 194. Again, the presentation data is in the %endpoint_node hash.

Then the script iterates through all the sightings of this endpoint and adds the corresponding edges between the endpoint and the device's interface, using the ->add_edge method of the GraphViz module at lines 200 to 204. The presentation configuration is stored in the %sighting_edge hash:

191: for my $ep (keys %eps)
192: {
193:     $g->add_node($ep, label => $eps{$ep}->endpoint . "\n"
194:                  . $eps{$ep}->vendor, %endpoint_node);
195:     if (exists $sightings{$ep})
196:     {
197:         my $sight = $sightings{$ep};
198:         for my $if (keys %$sight)
199:         {
200:             $g->add_edge(
201:                 $ep => $if,
202:                 label => $sight->{$if}->[1] . "x "
203:                 . scalar(localtime($sight->{$if}->[0]->time)),
204:                 %sighting_edge
205:             );
206:         }
207:     }

Finally, line 224 emits the resulting dot file without attempting to layout the graph. This task will be accomplished at a later stage, depending on what we want to do with the resulting graphs:

224: print $g->as_debug;

To use this tool, we must supply the specifications for what we want to see in the diagram. Let's say that we want to see endpoints in our network that run some variant of Windows with a D-Link network card. The following would do the trick:

$ ./scripts/db2dot -N -v '\Wd-link\W' -s '\Wwindows\W' > map.dot

The \W part matches a non-word in a Perl regular expression, which prevents unwanted matches. We can match the operating system guessed by nmap using the -s option, and the network card vendor with the -v option. After some very light tweaking in GraphViz for Mac OS X, the resulting diagram looks like Figure 1.

Remember the arrays that contain the parameters for the different edges and nodes? Those are used to specify a nice picture to use for each node type. You can tweak those values and adapt them to your tastes using any of the filters included in GraphViz to translate the diagram to any kind of supported file. For instance, you could transfer the diagram to Postscript with the command:

$ circo -Tps map.dot > map.ps

In Figure 1, you can see two machines matching our criteria. The Ethernet addresses are 0011.95de.ad.11 and 0011.95be.ef10. The first one was seen connected to Vlan1 on corporate router ALL-YOUR-BASE on November 11 and connected to two corporate wireless access points on October 21. On all four occasions, the host has had the same IP address, 10.64.170.22, also at Vlan1. The second machine has been seen twice on Vlan234 on the same router, both times with IP address 10.64.190.210.

Let's return to our vulnerability management scenario and pretend that we have learned about a new exploit that affects machines running various versions of SCO Unix through a given network service. In this case, our first action would likely be to deploy ACLs in our routers. But at which routers? And how many systems will we have to patch? We can find out with a command like:

$ ./scripts/db2dot -N -s '\Wsco\W' > sco.dot

This command produces a diagram that again, with a little tweaking, looks like Figure 2. In this case, we know that only three endpoints match the given description and that we can apply ACLs at two specific routers besides the edge of the network to further reduce the likelihood of exploitation while patches are tested and installed.

There are many other ways to view your database than the ones shown in this article. I recommend experimenting with these tools to see whether they fit in your processes. Examples of other useful additions to the scripts I've described include the recognition of endpoints and devices that are actually the same network node, or querying and producing diagrams based on IP addressing. I hope you've found the information I've provided in this series to be useful. If you have suggestions or comments, please let me know.

Luis has been working in various areas of computer science since the late 1980s. Some people blame him for conspiring to bring the Internet into his home country, where currently he spends most of his time teaching others about Perl and taking care of network security at the largest ISP there as its CISO. He also believes that being a sys admin is supposed to be fun.