Sydney S. Weinstein, CDP, CCP is a consultant, columnist, author, and president of Datacomp Systems, Inc., a consulting and contract programming firm specializing in databases, data presentation and windowing, transaction processing, networking, testing and test suites, and device management for UNIX and MS-DOS. He can be contacted care of Datacomp Systems, Inc., 3837 Byron Road, Huntingdon Valley, PA 19006-2320 or via electronic mail on the Internet/Usenet mailbox syd@DSI.COM (dsinc!syd for those who cannot do Internet addressing).
This is my third anniversary of writing this column, and it's time to update the prior January columns. In January 1990 (CUJ Vol. 8, No. 1), I wrote about the internet, the Internet, USENET, and Network News. In January 1991 (CUJ Vol. 9, No. 2), I wrote about obtaining USENET Network News. In January 1992 (CUJ Vol. 10, No. 2), I wrote about obtaining the sources referred to in this column. This year, I am going to present a quick review and update to all three past columns. I do not intend to provide an in-depth review, so newcomers to USENET might want to seek out the back issues containing those articles.
"On The Networks" covers articles posted to several of the source groups on USENET Network News: comp.sources.games, comp.sources.misc, comp.sources.reviewed, comp.sources.unix, comp.sources.x, and alt.sources. Each of these groups is like a section of a large electronic magazine called USENET Network News. I call it a magazine, and not a bulletin board, partly because, unlike a bulletin board, where each reader accesses a central machine to read the messages, USENET Network News is delivered on a subscription basis to each computer, and the articles are read locally.
I try to limit my coverage to the highlights of the articles posted to each group. I also generally restrict my coverage to freely-distributable C and C++ sources. Freely-distributable means that you can freely (and for free) make copies of the software, use it as you see fit, and give it away as you desire. It does not mean the software is in the public domain. Most of the software is copyrighted. This means you cannot pretend you wrote it, or include code from it in a product you are selling. However, the authors have allowed you to use and distribute it for free. If you make changes, most authors do not let you call the changed version by the name of the original. This is to avoid confusion as to what is and is not part of the product and to reduce the authors' support headaches.
Internet Update
First, an update. USENET, oftentimes referred to as "the net," is a loose collection of cooperating computers. In the long-distant past, all USENET computers ran UNIX, but now they could be running anything from MS-DOS to VAX/VMS. It also used to be that to be considered a computer on USENET, you communicated with another computer via UUCP (UNIX-to-UNIX Copy). That also has changed. Now there are many protocols in use. So, to be considered a member of the USENET network, one must be able to exchange electronic mail with other computers on the USENET network.
If your computer is part of a network, and that network has a gateway to any other network, you are considered to be on an internet, short for inter-network. Note this internet is spelled with a lowercase i. The Internet (capital I) is the computer network whose addressing (name space) is loosely managed by the DDN Network Information Center in Chantilly, VA. It is a collection of networks that grew out of the Defense Department's ARPANET (Advanced Research Projects Agency Network).
The Internet now includes not only the MILNET and NSFnet member networks (direct descendants of the ARPANET), but also several commercial TCP/IP networks which are members of the Commercial Internet Exchange (CIX). The Internet is mostly a set of networks with leased lines permanently connecting them to regional and backbone networks. Some outlying networks use gateways with dial-up links to reach their regional network, but most links are 56K, 1.544M, or 45M bits/second leased lines (that's megabits per second).
Whereas only mail and news are usually available over USENET via UUCP, the Internet runs the TCP/IP protocol and supports news (NNTP, Network News Transfer Protocol), mail (SMTP, Simple Mail Transfer Protocol), Wide Area Information Servers (WAIS), remote archive lookup (archie), remote logins to any computer on the network provided you have an account there (telnet), and remote file transfer (FTP, File Transfer Protocol), in addition to many other services. All of these services coexist and work in real time.
Network News Update
The sources mentioned in this column are released via several groups distributed as part of USENET Network News. USENET Network News is both a method of distributing information between a very large group of computers, and a somewhat organized collection of that information into news groups that are distributed via the news software.
First, the software. There are now two current Network News transport software suites: the older C News and a newcomer, INN.
The current version of the traditional Network News transport software is named C News, not because it is written in C but because it follows A News (previously called News) and B News as the third rewrite of the transport software. It supports transfer of the news articles (the individual messages) between every member of the USENET network. The C News software recently had a performance-enhancement release, and a "cleanup" release is expected shortly. C News is best suited for smaller sites that mostly have UUCP feeds, and feed a limited set of neighbors.
New for the major backbone sites, especially those with Internet connections, is Internet Network News (INN) from Rich Salz. INN is largely responsible for cutting down the time it takes for an article to be propagated throughout the backbone networks. Where C News uses batching to save articles for distribution in bulk, INN uses an immediate transfer to its NNTP (TCP/IP-based network news protocol) neighbors. Thus an article now reaches most of the backbone and regional network sites in only 1-5 minutes. (Just three years ago it took close to a day.) INN is relatively new, and takes advantage of the decline in the price of RAM. It boosts performance by keeping everything it can in memory.
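The effect of batching versus immediate transfer on propagation time can be pictured with a toy model. The Python sketch below is not taken from C News or INN; the function names, the hop count, the batch interval, and the per-hop transfer times are all made-up assumptions, chosen only to show how waiting for a batch run at every hop adds up.

```python
def batched_delay(hops, batch_interval_min, transfer_min):
    """Worst-case minutes for an article to cross `hops` sites when each
    site holds articles until its next batch run (toy model: the article
    just misses a run at every hop and waits a full interval)."""
    return hops * (batch_interval_min + transfer_min)


def immediate_delay(hops, transfer_min):
    """Minutes for the same trip when each site forwards on arrival."""
    return hops * transfer_min


if __name__ == "__main__":
    HOPS = 6  # hypothetical number of sites between poster and reader
    slow = batched_delay(HOPS, batch_interval_min=240, transfer_min=5)
    fast = immediate_delay(HOPS, transfer_min=0.5)
    print("batched:   %.1f hours" % (slow / 60))
    print("immediate: %.1f minutes" % fast)
```

In this model the waiting time between batch runs, not the transfer itself, accounts for almost all of the delay, which is the cost that immediate forwarding avoids.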
Second, there is the volume. At this time last year, the average was twenty-five megabytes per day. Now it is up to approximately forty megabytes per day. The number of newsgroups is also growing. Currently there are over 2,400 of them.
Lastly, there is a new distribution method, CD-ROM. Sterling Software, 1404 Fort Crook Road S., Bellevue, NE 68005, publishes a set of ISO-9660 format CD-ROMs with almost all the groups of USENET Network News. The CD-ROMs come out when a new one fills up, which is about every 10-14 days. They even include a modified news reader that will access the information as stored on the CD-ROM. Sterling can be reached at (800) 643-NEWS, or (402) 291-2108 from outside of the US. A subscription runs about $350/year + shipping. Shipping varies from $60/year in the US up to $200/year for some overseas areas.
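The volume figures above lend themselves to a quick back-of-the-envelope check. The Python sketch below uses only the numbers quoted in this column (25 and 40 megabytes per day, and a new CD-ROM every 10 to 14 days). The function names are illustrative, and treating the whole feed as going onto the disc slightly overstates things, since the discs carry almost all of the groups rather than all of them.

```python
# Figures quoted in the column; everything else here is illustrative.
MB_PER_DAY_LAST_YEAR = 25
MB_PER_DAY_NOW = 40


def growth_percent(old, new):
    """Percentage increase going from old to new."""
    return (new - old) * 100 / old


def volume_mb(mb_per_day, days):
    """Megabytes accumulated over `days` at a steady daily rate."""
    return mb_per_day * days


if __name__ == "__main__":
    print("Growth over the year: %.0f%%"
          % growth_percent(MB_PER_DAY_LAST_YEAR, MB_PER_DAY_NOW))
    for days in (10, 14):
        print("%d days of news: %d MB" % (days, volume_mb(MB_PER_DAY_NOW, days)))
    print("One year of news: %d MB" % volume_mb(MB_PER_DAY_NOW, 365))
```

At the current rate, 10 to 14 days of traffic comes to roughly 400 to 560 megabytes, which fits the release schedule of a disc that ships whenever it fills up.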
How to Use This Column
This column normally reports on articles in the comp.sources and alt.sources subhierarchies. All of the groups I currently report on except alt.sources are moderated. For the moderated groups, the authors of the software submit their packages to the moderator for posting. Each group has its own rules for acceptance. alt.sources is unmoderated and a free-for-all. Sources in that group are posted directly by the author.
For the moderated groups, when I list a package, I provide five pieces of information: the volume, the issue(s), the archive name, the contributor's name, and the contributor's electronic mail address. The Volume and Issue are specifically named in the listing. The archive name is in italics, and the contributor's name is followed by an electronic mail address, in < >'s.
To locate a package via WAIS or archie, use the archive name. This is the short one-word name in italics given with each listing. To find the file at an archive site, use the group name (taken from the section of the column you are reading, since I place all listings for each group together), the volume, and the archive name. Most archive sites store the postings as group/volume/archive-name. The issue numbers tell you how many parts the package was split into when posted. This way you can be sure to get all of the parts.
In addition, I report on patches to prior postings. These patches also include the volume, issue(s), archive name, the contributor's name, and electronic mail address. Patches are stored differently by different archive sites. Some store them along with the original volume/archive name of the master posting. Some store them by the volume/archive name of the patch itself. The archive name listed is the same for both the patch and the original posting.
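As an illustration of how the pieces of a listing map onto an archive location, here is a minimal Python sketch. The `Listing` class and its field names are hypothetical, the sample package is invented, and the `volumeNN` directory spelling follows the group/volume/archive-name layout described here; individual archive sites may lay things out differently. It also assumes the issue numbers of one package are consecutive.

```python
from dataclasses import dataclass


@dataclass
class Listing:
    """One package as listed in the column (hypothetical structure)."""
    group: str         # newsgroup, e.g. "comp.sources.misc"
    volume: int        # volume number from the listing
    first_issue: int   # first issue number of the posting
    last_issue: int    # last issue number of the posting
    archive_name: str  # the short one-word name printed in italics

    def parts(self):
        """Number of parts to expect, assuming consecutive issues."""
        return self.last_issue - self.first_issue + 1

    def path(self):
        """Directory where many archive sites would keep the posting."""
        return "%s/volume%d/%s" % (self.group, self.volume, self.archive_name)


if __name__ == "__main__":
    # Invented example values, not a real posting.
    pkg = Listing("comp.sources.misc", 33, 10, 12, "example-pkg")
    print(pkg.path())   # directory to look in
    print(pkg.parts())  # how many parts to collect
```

Comparing the result of `parts()` with the number of files actually found in the directory is a quick way to confirm that nothing is missing.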
alt.sources, being unmoderated, does not have volume and issue numbers. So I report on the date in the Date: header of the posting and the number of parts in which it appeared. If the posting was assigned an archive-name by the contributor, I also report on that archive name. Archive sites for alt.sources are harder to find, but they usually store things by the archive name.
How to Find an Archive Site
With that much information flowing into each site every day, most sites cannot keep the information on their local disks for very long, usually only a couple of days. By the time you read my articles, the sources have been deleted from the machines in the network to make room for newer articles. For this reason, many sites have agreed to archive the source groups and keep these archives for several years.
The problem then is finding out which sites archive which groups, and how to access these archives. I again refer to the articles by Jonathan I. Kamens of the Massachusetts Institute of Technology posted to comp.sources.wanted and news.answers. These articles appear weekly and explain how to find sources.
As a quick review, here are the steps:
1. Figure out in what group, Volume, and Issue(s) the posting appeared. Also try to determine its archive name. If you know these items, it's usually easy to find an archive site that keeps that group. Most archive sites keep their information in a hierarchy ordered first on the group, then on the volume, and last on the archive name. These together usually make up a directory path, as in comp.sources.unix/volume22/elm2.3. In that directory you will find all of the articles that made up the 2.3 release of the Elm Mail User Agent that was posted in Volume 22 of the comp.sources.unix newsgroup. If you do not know the archive name, but do know the volume, each volume also has an Index file that can be retrieved and read to determine the archive name. One common publicly accessible archive site for each of the moderated groups in this article is UUNET.
2. If you do not know which sites archive the group, or whether any site keeps that particular item without archiving the entire group, consult Archie (CUJ August 1991, Vol. 9, No. 8). Archie is a mail-response program that tries to keep track of sites reachable via FTP (file transfer protocol, a TCP/IP protocol used by internet sites) that have sources available for distribution. Even if you cannot access the archive site directly via FTP, it is worth knowing that the archive site exists because there are other ways of retrieving sources only available via FTP.
3. If you know the name of the program, but do not know to what group it was posted, try using Archie and search based on the name. Since most sites store the archives by group and volume, the information returned will tell you what newsgroup and volume it was posted in. Then you can retrieve the item from any archive site for that newsgroup.
4. If you do not even know the name, but know you are looking for source code that does xxx, retrieve the indexes for each of the newsgroups and see if any of the entries (usually listed as the archive name and a short description of the function) look reasonable. If so, try those. Or, make a query to archie based on some keywords from the function of the software and perhaps it can find items that match.
5. Next you have to determine what access methods the archive machine allows to retrieve software. Most archive sites are internet-based, and support the FTP service. If you have access to FTP on the internet, this is the easiest and fastest way of retrieving the sources. If you don't, perhaps a local college is a member of the internet and can assist you in using FTP to retrieve the sources you need.
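For readers who do have FTP access, the retrieval described in step 5 can be scripted. The sketch below uses Python's standard `ftplib` module. The helper names are hypothetical, no real archive host is named, and the code assumes the server's name listing returns plain file names for the posting's directory, which not every server does.

```python
import os
from ftplib import FTP


def fetch_posting(ftp, remote_dir, local_dir):
    """Download every file in remote_dir into local_dir.

    `ftp` is an already logged-in ftplib.FTP object (or anything with
    the same cwd/nlst/retrbinary methods). Returns the names fetched.
    """
    os.makedirs(local_dir, exist_ok=True)
    ftp.cwd(remote_dir)
    names = sorted(ftp.nlst())  # assumed to be plain file names
    for name in names:
        with open(os.path.join(local_dir, name), "wb") as out:
            ftp.retrbinary("RETR " + name, out.write)
    return names


def fetch_anonymous(host, remote_dir, local_dir):
    """Log in anonymously to `host` and fetch one posting's directory."""
    with FTP(host) as ftp:
        ftp.login()  # no arguments means an anonymous login
        return fetch_posting(ftp, remote_dir, local_dir)
```

A call might look like `fetch_anonymous("archive.example.org", "comp.sources.unix/volume22/elm2.3", "elm2.3")`, where the host name is only a placeholder for whichever archive site you have located.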
Other sites support anonymous UUCP access. This is access via the UUCP protocol where you don't need to register in advance to call in. UUNET Communications Services supplies this using the (900) GOT-SRCS number at a per-minute charge for nonsubscribers. Many other archive sites provide it for just the cost of your long-distance telephone call. If you cannot use FTP, this is the next best method to use. Many anonymous UUCP archive sites list what they carry in the Nixpub listing of public access UNIX sites maintained by Phil Eschallier. The Nixpub listing is posted in comp.misc and alt.bbs periodically. If you don't get News and need a copy, it can be retrieved via electronic mail using the periodic-posting mail-based archive "server." This is run by MIT on the system pit-manager.mit.edu. To use the server, send an electronic mail message to the address mail-server@pit-manager.mit.edu with the subject help and it will reply with instructions on how to use the server. If you are not on USENET, sending electronic mail to sites on the Internet is also possible via the commercial mail services. CompuServe, MCI-Mail, ATT-Mail, and Sprint-Mail all support sending messages to Internet addresses. Contact the support personnel at the commercial mail service you use for details on how to send messages to Internet addresses.
The last way to access the sources is via electronic mail. Several sites also make their archives available via automated mail-response servers. Note, this can be a very expensive way of accessing the information, and due to the load it places on the networks, most archive servers heavily restrict the amount of information they will send each day. The most commonly used mail-response FTP server is ftpmail@decwrl.dec.com. Digital Equipment Corporation runs a mail-based archive server that will retrieve sources via FTP and then mail them to you. To find out how to use this service, send it a message with the word help in the body.
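A help request of the kind described above is just a short mail message. As a minimal sketch, the Python code below builds one with the standard `email` package. The function name and the sender address are hypothetical, and the sketch only constructs the message; handing it to your site's mail transport is left out because that step differs from system to system.

```python
from email.message import EmailMessage


def build_help_request(server_address, sender):
    """Build a message asking a mail-response server for instructions.

    The word "help" goes in the body, as described for the ftpmail
    server; other servers may expect it elsewhere.
    """
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = server_address
    msg.set_content("help\n")
    return msg


if __name__ == "__main__":
    # The sender address is a placeholder; substitute your own.
    request = build_help_request("ftpmail@decwrl.dec.com", "you@example.com")
    print(request)
```

Because these servers limit how much they will send each day, it is worth reading the reply to the help request before asking for any files.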
CD-ROM Archives Available
Also available are CD-ROMs with the archives of sources posted via USENET and from other sources as well. Two of the larger publishers are Walnut Creek CD-ROM and Prime Time Freeware.
Walnut Creek CD-ROM, 1547 Palos Verdes Mall, Suite 260, Walnut Creek, CA, (800) 786-9907 or (510) 947-5996, publishes several CD-ROMs each year. Some contain the Simtel20 MS-DOS Archive, others the X and GNU archives, and still others MS-Windows sources and other collections of sources and binaries. Disks run from $25 to $60 each (varies by title) plus shipping. In addition, they offer those hard-to-find CD caddies at reasonable prices.
Prime Time Freeware, 370 Altair Way, Suite 150, Sunnyvale, CA 94086, (408) 738-4832, FAX: (408) 738-2050, <ptf@cfcl.com>, publishes twice a year a collection of freely distributable source code, including the complete USENET archives. Their disks run about $60 each set plus shipping. The latest issue, August 1992, has over 3MB of source code spread over two disks. They also offer a standing subscription plan at a discount.
Hopefully, this special edition of my column has given you a hint as to how to read my column and track down the sources. Note, I have been asked many times if I can make floppies or tapes containing the software mentioned in my column. I cannot spare the time to do this. I also have to work (and teach) for a living. If I started doing this, I could easily spend all my time trying to fulfill the requests and never get any of my work done.
However, what I have offered to do in the past, and am still willing to do, is provide a list of USENET sites in your area code. Send me a self-addressed, stamped envelope (my address is in the bio attached to this column). Those living in major metropolitan areas, please include two stamps on your letter. Note: I can only offer this service for US area codes. If you have net access, but need a news neighbor, I will also reply to Electronic Mail asking for nearby news sites.