SMF2.1:Search engines: Difference between revisions From Online Manual


Latest revision as of 10:39, 27 March 2024

Please see SMF2.0:Search engines or SMF2.1:Search engines depending on the version of SMF you are using.

In this area you can choose the extent to which you wish to track search engines and spiders as they index your forum, as well as review search engine logs. You can find this page at Admin Center > Forum > Search Engines. It has four tabs or pages: Stats, Spider Log, Spiders, and Settings.

Stats

Here you can view statistics that pertain to search engines indexing your forum. Note that only search engines listed on the Spiders page have statistics tracked here. Statistics are stored per day, so each row shows the activity of one search engine over the course of a single day.

There are three columns in this table:

  • Date - This is the date the spider indexed your forum.
  • Spider Name - This is the name of the spider that indexed your forum. The name comes from the name given to the spider on the Spiders page or tab.
  • Page Hits - This is the number of unique hits, or separate session visits, that the spider performed on your forum.

At the bottom right of the page is a dropdown menu labelled Jump To Month where you can browse the statistics for whichever month you want by simply selecting that month.
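
As a rough illustration of per-day statistics, the sketch below groups individual spider hits into one row per date and spider, mirroring the Date, Spider Name, and Page Hits columns. The function name, the input shape, and the sample data are hypothetical, and each logged hit is simply counted once; this is not how SMF itself stores or computes these figures.

```python
from collections import Counter
from datetime import datetime


def daily_spider_stats(hits):
    """Count hits per (date, spider name) pair.

    `hits` is an iterable of (spider_name, timestamp) tuples. Both the
    input shape and this function are hypothetical, for illustration only.
    """
    counts = Counter()
    for spider_name, timestamp in hits:
        counts[(timestamp.date(), spider_name)] += 1
    # Sort by date, then spider name, like a Date / Spider Name / Page Hits table.
    return sorted((day, name, n) for (day, name), n in counts.items())


# Made-up sample data.
sample_hits = [
    ("Google", datetime(2024, 3, 26, 9, 15)),
    ("Google", datetime(2024, 3, 26, 17, 40)),
    ("Bing", datetime(2024, 3, 26, 11, 5)),
    ("Google", datetime(2024, 3, 27, 8, 0)),
]
stats = daily_spider_stats(sample_hits)
for day, name, page_hits in stats:
    print(day, name, page_hits)
```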

Spider Log

This log tells you which of your forum pages were visited and which spider visited them. Depending on the Search Engine Tracking Level selected on the Settings page, this log can vary from showing the actions of every spider to not showing any activity at all. The Tracking Level must be set to Moderate or Aggressive to track every spider.

There are three columns in this log:

  • Spider - This is the name of the spider that indexed your forum, according to the name given in the Spiders page or tab.
  • Time - This is the date and time when the spider viewed a page.
  • Viewing - This is the page the spider was viewing when it visited. This will show Disabled if the Search Engine Tracking Level is set to Moderate instead of Aggressive.

Delete Entries

Since higher tracking levels can result in a massive number of entries, you can delete log entries at the bottom of this page. Enter a numeric value, then select the Delete button to prune entries older than the specified number of days.

You can also use Log Pruning (Admin Center > Maintenance > Logs > Settings) to prune the Spider Log automatically. Scroll down to the Log Pruning section and check Enable Log Pruning. See Log Pruning in the manual to learn how to use it.
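
The idea of pruning entries older than a given number of days can be sketched as follows. This is only a minimal illustration: the entry fields, the function name, and the sample log are assumptions, not SMF's actual data structures or code.

```python
from datetime import datetime, timedelta


def prune_log(entries, days, now):
    """Return only the entries that are not older than `days` days.

    `entries` is a list of dicts with a "time" key holding a datetime.
    The field names here are hypothetical.
    """
    cutoff = now - timedelta(days=days)
    return [entry for entry in entries if entry["time"] >= cutoff]


# Made-up sample log; `now` is fixed so the example is reproducible.
now = datetime(2024, 3, 27, 12, 0)
log = [
    {"spider": "Google", "time": datetime(2024, 3, 1, 8, 0), "viewing": "Board index"},
    {"spider": "Bing", "time": datetime(2024, 3, 25, 14, 30), "viewing": "Topic"},
]
kept = prune_log(log, days=7, now=now)
print(len(kept), "entry kept")
```

With a value of 7 days, only the entry from 25 March falls inside the cutoff, so the older entry is dropped.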

Spiders

This table lists all of the spiders that your forum recognizes, along with a few details. SMF provides approximately 25 spiders. You can add more, or delete any that you do not wish to track.

These are the five columns in this table:

  • Spider Name - This is the name given to the spider when it was added or last edited. Each name is a link; clicking it leads to a page where you can modify the details for that spider.
  • Last Seen - This is the date and time the spider last indexed your forum.
  • User Agent - In general terms, the user agent is the identifying string that a spider sends with each request, and it is one way your forum can recognize that a visitor is a spider. Try Wikipedia or your favorite search engine to learn more about this.
  • IP Addresses - IP addresses identify specific computers or servers. This column shows the addresses that each spider is known to come from.
  • Checkbox - Putting a check in a box, and clicking the Delete Selected button at the bottom, allows you to remove spiders from the list.

Adding/Editing Spiders

You can add a new spider by clicking the Add New Spider button, which is right beside the Delete Selected button in the bottom right corner of this page. To edit an existing spider, click on its name in the table. In either case, you will be able to add or edit the following fields:

  • Spider Name - This is the name of the spider.
  • User Agent - User agents are one way to identify spiders. To find out the user agent of a spider, try searching for "user agent <search engine name>" in the search engine of your choice. Chances are that one of the first results will tell you what the user agent is.
  • IP Addresses - You can identify the IP address as described above for the user agent. In your favorite search engine, enter "IP address <search engine name>".

Note that only one of the two identifying fields for the spider is required: either the IP address or the user agent. You can, however, input values for both if desired.
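
To make the role of the two identifying fields concrete, here is a minimal sketch of how a visitor could be matched against a list of spiders by user agent or by IP address. The matching rules used here (case-insensitive substring match on the user agent, exact match on the IP address), the function name, and the sample spiders are all assumptions for illustration; SMF's actual detection logic is not described on this page and may differ.

```python
def identify_spider(user_agent, ip_address, spiders):
    """Return the name of the first matching spider, or None.

    `spiders` is a list of dicts with "name", "user_agent", and
    "ip_addresses" keys (hypothetical field names). Either identifying
    field may be left empty, since only one of the two is required.
    """
    for spider in spiders:
        agent = spider.get("user_agent", "")
        ips = spider.get("ip_addresses", [])
        # Assumed rule: the configured user agent appears in the visitor's user agent.
        if agent and agent.lower() in user_agent.lower():
            return spider["name"]
        # Assumed rule: the visitor's IP address is listed for the spider.
        if ip_address in ips:
            return spider["name"]
    return None


# Made-up spider list; the IP addresses come from ranges reserved for documentation.
spiders = [
    {"name": "Google", "user_agent": "Googlebot", "ip_addresses": []},
    {"name": "Example Crawler", "user_agent": "", "ip_addresses": ["192.0.2.10"]},
]

googlebot_agent = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
print(identify_spider(googlebot_agent, "203.0.113.5", spiders))
print(identify_spider("Mozilla/5.0", "192.0.2.10", spiders))
print(identify_spider("Mozilla/5.0", "198.51.100.7", spiders))
```

The first visitor is matched by user agent, the second by IP address, and the third matches neither and would be treated as an ordinary guest.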

Settings

You can change the tracking level and other settings for spider tracking from this page.

Search Engine Tracking Level

This determines the level at which spider activity is logged. Be aware that higher tracking levels increase server resource requirements.

  • Disabled - Spider activity is not logged at this setting.
  • Standard - Minimal spider activity is logged at this setting.
  • Moderate - More accurate statistics about spider activity are logged for every spider at this setting.
  • Aggressive - All possible statistics for every spider visit are logged at this level.

Apply restrictive permissions from group

This option allows you to prevent spiders from indexing certain pages, such as member profile pages.

  • Disabled - Spiders do not belong to any restrictive group.
  • List of groups - When you select a particular group, any guest detected as a spider is automatically assigned the deny permissions of that group, in addition to the normal permissions of a guest. You can use this to give a search engine less access than a normal guest. For example, you might create a new group called "Spiders" and select it here. You could then deny that group permission to view profiles, to stop spiders from indexing your members' profiles. Note that spider detection is not perfect and can be simulated by users, so this feature is not guaranteed to restrict content only for the search engines you have added.
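
The effect of a restrictive group can be pictured as a set operation: a spider starts with the normal guest permissions, and anything the selected group denies is taken away. The sketch below assumes this simple model; the function and the permission names are hypothetical and are not SMF's real permission identifiers.

```python
def effective_permissions(guest_permissions, group_denies, is_spider):
    """Return the permissions a visitor ends up with.

    Spiders get the guest permissions minus the deny permissions of the
    selected restrictive group; ordinary guests are unaffected.
    """
    if is_spider:
        return set(guest_permissions) - set(group_denies)
    return set(guest_permissions)


# Hypothetical permission names, for illustration only.
guest_permissions = {"view_boards", "view_profiles", "search_posts"}
spiders_group_denies = {"view_profiles"}

spider_perms = effective_permissions(guest_permissions, spiders_group_denies, is_spider=True)
guest_perms = effective_permissions(guest_permissions, spiders_group_denies, is_spider=False)
print(sorted(spider_perms))
print(sorted(guest_perms))
```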

Show spiders in the online list

This option determines whether spiders are displayed in the online list, and which members can see them.

  • Not at all - Spiders will simply appear as guests to all users.
  • Show spider quantity - The Board Index will display the number of spiders currently visiting the forum.
  • Show spider names - Each spider name will be revealed, so users can see which spiders are currently visiting the forum. This shows up on both the Board Index and Who's Online page.
  • Show spider names - admin only - As above, except that only administrators can see spider status. To all other users, spiders appear as guests.
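
The four options can be compared side by side with a small sketch that builds an online summary line for each mode. The mode keys, the function name, and the output wording are all made up for this example; they only illustrate the differences described above and do not reflect SMF's real settings values or display text.

```python
from collections import Counter


def online_summary(spider_names, guest_count, mode, viewer_is_admin=False):
    """Build a summary line for the online list.

    `spider_names` holds one name per spider visit currently online.
    `mode` is one of "none", "quantity", "names", or "names_admin"
    (hypothetical keys standing in for the four options).
    """
    show_names = mode == "names" or (mode == "names_admin" and viewer_is_admin)
    if show_names:
        per_spider = Counter(spider_names)
        parts = [f"{name} ({n})" for name, n in sorted(per_spider.items())]
        return ", ".join([f"{guest_count} guests"] + parts)
    if mode == "quantity":
        return f"{guest_count} guests, {len(spider_names)} spiders"
    # "none", or "names_admin" seen by a non-admin: spiders simply count as guests.
    return f"{guest_count + len(spider_names)} guests"


online = ["Google", "Google", "Bing"]
print(online_summary(online, 4, "none"))
print(online_summary(online, 4, "quantity"))
print(online_summary(online, 4, "names"))
print(online_summary(online, 4, "names_admin", viewer_is_admin=False))
```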
