The Internet lives on links. By clicking links, users move between the pages of a site; by publishing links, they draw attention to interesting resources. Search engines use links to grow their index and to compute the parameters used in ranking. Links are the backbone of the web, which is why correct linking and a sound link structure matter for every resource, and why every webmaster should know how to extract all the links of a site, an individual page, or a group of pages.
You will need
- the free program Xenu's Link Sleuth, available for download from the developer's website;
- an Internet connection.
Instructions
Step 1
Create a new project in Xenu's Link Sleuth. In the main menu of the application, select the "File" item and then "Check URL …", or press the keyboard shortcut Ctrl + N. In the "Xenu's starting point" dialog that appears, enter into the top field the URL of the page from which you want to start extracting links. If necessary, fill in the fields in the "Include / Exclude" group of controls to add external addresses or address groups to the set of candidates to be checked, and to explicitly exclude certain addresses or address groups from being processed by the application.
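Conceptually, the include and exclude lists act as filters on candidate URLs. The following Python sketch only illustrates that idea; the function name, the prefix-matching rule, and the example addresses are assumptions for illustration, not Xenu's exact matching behaviour.

```python
# Illustrative prefix-filter logic for include/exclude lists.
# Xenu's real matching rules may differ; this only shows the general idea.
INCLUDE = ["https://partner.example.com/"]      # extra addresses to check anyway
EXCLUDE = ["https://www.example.com/private/"]  # addresses to skip

def should_check(url, start_url="https://www.example.com/"):
    """Decide whether a discovered URL should be queued for checking."""
    if any(url.startswith(prefix) for prefix in EXCLUDE):
        return False                       # explicitly excluded
    if url.startswith(start_url):
        return True                        # belongs to the site being checked
    return any(url.startswith(prefix) for prefix in INCLUDE)
```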
Step 2
Set the program parameters. In the "Xenu's starting point" dialog, click the "More options …" button. The "Options" dialog will be displayed. Switch to the "Basic" tab. Set the number of parallel threads downloading data from the Internet by moving the "Parallel Threads" slider. In the "Maximum depth" field, enter the maximum depth to which the application will follow links. In the "Report" group of elements, activate or deactivate the options for generating the report. Switch to the "Advanced" tab and activate or deactivate the additional options. In the "Retries" field, enter the maximum number of times a URL is retried after a failure. Click the "OK" button.
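The effect of these settings is easier to grasp in code form. Below is a minimal, purely illustrative Python sketch of a crawler that honours a thread pool, a maximum depth, and a retry limit; the names and structure are assumptions for illustration and do not reflect Xenu's internal implementation.

```python
# Illustrative sketch only -- not Xenu's actual implementation.
# Shows what "Parallel Threads", "Maximum depth" and "Retries" control conceptually.
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen

PARALLEL_THREADS = 10   # analogue of the "Parallel Threads" slider
MAXIMUM_DEPTH = 5       # analogue of the "Maximum depth" field
RETRIES = 2             # analogue of the "Retries" field

def fetch(url):
    """Download a URL, retrying on failure up to the configured limit."""
    for attempt in range(RETRIES + 1):
        try:
            with urlopen(url, timeout=10) as response:
                return response.read()
        except OSError:            # network errors, timeouts, HTTP errors
            if attempt == RETRIES:
                return None        # give up after the last retry

def crawl(start_url, extract_links):
    """Breadth-first crawl: pages deeper than MAXIMUM_DEPTH are never fetched."""
    seen = {start_url}
    frontier = [start_url]
    for depth in range(MAXIMUM_DEPTH):
        with ThreadPoolExecutor(max_workers=PARALLEL_THREADS) as pool:
            pages = list(pool.map(fetch, frontier))
        next_frontier = []
        for page in pages:
            if page is None:
                continue
            for link in extract_links(page):   # caller supplies the link extractor
                if link not in seen:
                    seen.add(link)
                    next_frontier.append(link)
        frontier = next_frontier
    return seen
```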
Step 3
Get a list of site pages and links. In the "Xenu's starting point" dialog, click the "OK" button. The application will start working, and the status bar will display the progress of data acquisition. Wait for the process to complete. When the dialog asking whether to create a report appears ("Link sleuth finished. Do you want a report?"), click the "No" button.
Step 4
Extract all links of a single page. In the list of pages that the application has built, find the page whose links you want to extract. Right-click on the corresponding line and select the "URL Properties" item in the context menu. In the dialog that appears, the "… links on this page" field lists all links present on the page, and the "… linking to this one" field lists the addresses of the pages that link to it.
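If you only need the outgoing links of one page and prefer the command line, a similar result can be approximated with a few lines of standard-library Python. This is a rough sketch, not an equivalent of the "URL Properties" dialog: it parses only the static HTML and ignores links generated by JavaScript, and the page address shown is a placeholder.

```python
# Rough command-line alternative for listing the links found in one page's HTML.
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen

class LinkCollector(HTMLParser):
    """Collect the href targets of all <a> tags in a page."""
    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    # Resolve relative links against the page address
                    self.links.append(urljoin(self.base_url, value))

page_url = "https://example.com/"  # replace with the page you are interested in
html = urlopen(page_url).read().decode("utf-8", errors="replace")
collector = LinkCollector(page_url)
collector.feed(html)
for link in collector.links:
    print(link)
```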
Step 5
Extract all site links. From the main menu select "File" and then "Export Page Map to TAB separated File…". In the dialog that appears, specify the name and path for saving the file. The resulting file lists the addresses of the referring and target pages in the OriginPage and LinkToPage fields, while the LinkToPageStatus field records whether the data was retrieved from the server successfully. Import the file into a database (such as MS Access) to extract links according to your own criteria, or filter it with a short script as sketched below.
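As an alternative to a database import, the tab-separated export can be filtered with a short script. A minimal sketch, assuming the first row of the export contains the OriginPage, LinkToPage and LinkToPageStatus column names as headers; the file name "sitemap.txt" and the filter criterion are examples to adapt to your own export.

```python
# Sketch: filter Xenu's tab-separated page-map export without a database.
# Assumes a header row with the OriginPage, LinkToPage and LinkToPageStatus columns;
# "sitemap.txt" is an example file name.
import csv

with open("sitemap.txt", encoding="utf-8", newline="") as f:
    reader = csv.DictReader(f, delimiter="\t")
    for row in reader:
        status = row.get("LinkToPageStatus") or ""
        # Example criterion: print link pairs whose status is not "ok";
        # adjust the test to whatever values appear in your export.
        if status and status.lower() != "ok":
            print(row["OriginPage"], "->", row["LinkToPage"], "(", status, ")")
```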