Columbia University information, such as course descriptions, faculty listings, addresses, news, and other communications, usually resides on departmental websites. While some of it may also be printed or stored in other formats, the bulk has been disseminated solely through the Internet during the past ten years.

During the transition from print to web-only course catalogues in the early 2000s, administrators and librarians were concerned about losing this information, because no system was in place to capture and preserve Columbia University’s web content.

The Internet Archive offered a solution with its Wayback Machine, launched in 2001, which allows the public to search and access saved websites. In 2006, the Internet Archive (IA) began offering a subscription web archiving service, Archive-It. Columbia University enrolled in this service in 2010, although IA had crawled many Columbia University websites before the University’s subscription began.

If you know a website’s URL, simply enter it into the Wayback Machine. For example, searching for the Program in Physical Therapy website returns a timeline showing the number of captures. You can click a date to view the saved webpage:

Wayback Machine Calendar screen shot
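The same lookup can also be scripted. The following is a minimal sketch, assuming the Internet Archive's public availability endpoint at archive.org/wayback/available, which returns the capture closest to a requested date. The function names and the example.edu address are hypothetical placeholders, not the actual Program in Physical Therapy URL.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Public availability endpoint of the Wayback Machine.
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def availability_query(url, timestamp=None):
    """Build the request URL that asks whether a page has been archived."""
    params = {"url": url}
    if timestamp:
        # Optional date hint such as "20150101"; the closest capture is returned.
        params["timestamp"] = timestamp
    return AVAILABILITY_ENDPOINT + "?" + urlencode(params)


def closest_snapshot(payload):
    """Return (capture_url, timestamp) from a parsed response, or None."""
    closest = payload.get("archived_snapshots", {}).get("closest")
    if not closest or not closest.get("available"):
        return None
    return closest["url"], closest["timestamp"]


def lookup(url, timestamp=None):
    """Send the live request (requires network access)."""
    with urlopen(availability_query(url, timestamp), timeout=30) as response:
        return closest_snapshot(json.load(response))
```

Calling lookup("example.edu/physical-therapy", "20150101") would return the address and timestamp of the nearest capture, or None if the page was never saved.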


The captures only extend back to 2019 because the URL changed mid-year. One way around the problem of constantly changing URLs is to search the Columbia Libraries Archive-It site, which indexes sites by “creator” and “collector,” with the Augustus C. Long Library listed under the collector University Archives.

This second listed site was captured during 2013-2018:

PT Search results in Wayback machine

You can go directly to the old websites from this list.

You can also enter the old URL into the Wayback Machine to find out whether the site was captured before 2013. In this case it was, with captures going back to 2002:

Old PT URL search
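To check programmatically how far back captures of an old URL go, one option is the Wayback Machine's CDX index. The sketch below assumes the CDX endpoint at web.archive.org/cdx/search/cdx with its url, output, fl, and limit parameters, and assumes results come back oldest first, so that a limit of 1 yields the earliest capture. The helper names and the example.edu address are hypothetical.

```python
from datetime import datetime
from urllib.parse import urlencode

# CDX index endpoint of the Wayback Machine.
CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def earliest_capture_query(url):
    """Build a CDX request for the first recorded capture of a URL."""
    params = {
        "url": url,
        "output": "json",             # ask for JSON instead of plain text
        "fl": "timestamp,original",   # only return these two fields
        "limit": "1",                 # first row only (assumed to be the oldest)
    }
    return CDX_ENDPOINT + "?" + urlencode(params)


def parse_cdx_rows(rows):
    """Turn CDX JSON (a header row followed by data rows) into dicts."""
    if len(rows) < 2:
        return []
    header, *data = rows
    return [dict(zip(header, row)) for row in data]


def capture_date(timestamp):
    """Convert a 14-digit Wayback timestamp into a datetime."""
    return datetime.strptime(timestamp, "%Y%m%d%H%M%S")
```

Fetching the URL produced by earliest_capture_query and passing the decoded JSON to parse_cdx_rows gives the timestamp of the first capture, which capture_date converts into a readable date.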

In this way, the Archive-It service is an important additional tool for finding past websites. To find the earliest captured version of a page, keep in mind that the Internet Archive has been crawling and saving websites since 1996 and has been crawling websites with the domain since 2010.

With this in mind, another strategy is to search and see how far you can navigate within the archived site:  

Old CU Homepage screengrab

For example, the Health Sciences link leads to:

Health Sciences Homepage circa 1997

Final Tips

Wayback Machine Plugins: The Internet Archive also offers browser plugins to quickly navigate and share “archived” websites, an important tool when encountering defunct websites or “link rot.”

Because of these dead links, it has become more common to cite the archived copy of a website from the Internet Archive. Granted, not all institutions and individuals allow their sites to be crawled by the Internet Archive, and some use other tools for capturing and saving websites. Still, depending on the citation tool and standard, citing archived sites can help ensure sources remain reliable. Citation integration will continue to evolve with the popularity and growth of the Internet Archive.
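As an illustration of citing an archived copy, the sketch below builds a Wayback Machine link from an original URL and a 14-digit capture timestamp, following the web.archive.org/web/<timestamp>/<URL> address pattern. The citation layout, the function names, and the example.edu address are assumptions for illustration only and do not follow any particular citation standard.

```python
from datetime import datetime


def archived_link(original_url, timestamp):
    """Return the Wayback Machine address for one capture of a page."""
    return f"https://web.archive.org/web/{timestamp}/{original_url}"


def cite(title, original_url, timestamp):
    """Format a simple citation line (hypothetical layout) for a capture."""
    # Parsing also validates the timestamp; a malformed one raises ValueError.
    captured = datetime.strptime(timestamp, "%Y%m%d%H%M%S")
    link = archived_link(original_url, timestamp)
    return f'"{title}." Archived {captured.date().isoformat()}. {link}'
```

A call such as cite("Old Program Page", "http://example.edu/old/", "20020805101010") produces one line containing the title, the capture date, and a link that should keep working even if the original address goes dead.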


